scylladb

Author	SHA1	Message	Date
Kefu Chai	2583a025fc	s3/test: collect log on exit the temporary directory holding the log file collecting the scylla subprocess's output is specified by the test itself, and it is `test_tempdir`. but unfortunately, cql-pytest/run.py is not aware of this. so `cleanup_all()` is not able to print out the logging messages at exit, because cql-pytest/run.py always collects the "log" file under the directory created using `pid_to_dir()`, where pid is that of the spawned subprocess, while `object_store/run` uses the main process's pid for its reusable tempdir. so, with this change, we also register a cleanup func to print out the logging messages when the test exits. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13647	2023-04-24 13:53:25 +03:00
Pavel Emelyanov	28a01c9e60	Merge 'test: object_store: fix various pylint warnings' from Kefu Chai when reading this source code, there are a handful of issues reported by my flycheck plugin. none of them is critical, but better off fixing them. Closes #13612 * github.com:scylladb/scylladb: test: object_store: specify timeout test: object_store: s/exit/sys.exit/ test: object_store: do not declare a global variable for read test: object_store: remove unused imports	2023-04-24 13:45:01 +03:00
Benny Halevy	87d9c4d7f8	sstables: filesystem_storage::change_state: simplify log message When moving to the base directory, the printout currently looks broken: ``` INFO 2023-04-16 09:15:58,631 [shard 0] sstable - Moving sstable .../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/upload/md-1-big-Data.db to in ".../data/ks/cf-4c1bb670dc3711ed96733daf102e4aab/" ``` Since `path` already contains `to`, the message can be just simplified and `to` need not be printed explicitly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13525	2023-04-24 13:43:48 +03:00
Kefu Chai	4f21755c98	timeout_config: correct the misconfigured {truncate, other}_timeout this change fixes the regression introduced by `ebf5e138e8`, which * initialized `truncate_timeout_in_ms` with `counter_write_request_timeout_in_ms`, * returns `cas_timeout_in_ms` in the place of `other_timeout_in_ms`. in this change, these two misconfigurations are fixed. Fixes #13633 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13639	2023-04-24 12:26:14 +03:00
Kefu Chai	2c91728d8a	auth: do not include unused header in `5a9b4c02e3`, the iostream based formatter was dropped, there is no need to include `<iostream>` or `<iosfwd>` in these source files anymore. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13643	2023-04-24 12:24:29 +03:00
Kefu Chai	642854f36f	test: s/os.P_NOWAIT/os.WNOHANG/ `os.P_NOWAIT` is supposed to be used in spawn calls, while `os.WNOHANG` is used in the options parameter passed to wait calls. fortunately, `P_NOWAIT` is defined as "1" in CPython, and `os.WNOHANG` is defined as "1" in the Linux kernel. that's why the existing implementation works. but we should not rely on this coincidence. so, in this change, `os.P_NOWAIT` is replaced with `os.WNOHANG` for correctness and for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13646	2023-04-24 11:42:34 +03:00
Kefu Chai	a573a89128	keys: print "non-utf8-key" when clustering_key is not UTF-8 before this change we do not check if the clustering_key to be formatted is UTF-8 encoded before printing it. but we do perform the validation when printing partition_keys. the clustering_key is no different from the partition_key when it comes to encoding; they are just different parts of a primary key. so let's validate the encoding of clustering_key as well, when formatting it. this change is a follow-up of `85b21ba049`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13641	2023-04-24 10:40:23 +03:00
Botond Dénes	864d27f9af	Merge 'clear_gently: handle null unique_ptr and optional values' from Benny Halevy This series adds handling of null std::unique_ptr to utils::clear_gently and handling of std::optional and seastar::optimized_optional (both engaged and disengaged cases). Also, unit tests were added to test the above cases. Fixes #13636 Closes #13638 * github.com:scylladb/scylladb: utils: clear_gently: add variants for optional values utils: clear_gently: do not clear null unique_ptr	2023-04-24 10:27:32 +03:00
Kefu Chai	c06b20431e	cdc: generation: use default-generated operator== now that C++20 generates operator== for us, there is no need to handcraft it manually. also, in C++17, the standard library offers a default implementation of operator== for `std::variant<>`, so no need to implement it by ourselves. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13625	2023-04-24 10:13:28 +03:00
Botond Dénes	2d8d8043be	Merge 'Coroutinize system_keyspace::get_compaction_history' from Pavel Emelyanov Closes #13620 * github.com:scylladb/scylladb: system_keyspace: Fix indentation after previous patch system_keyspace: Coroutinize get_compaction_history()	2023-04-24 09:48:01 +03:00
Botond Dénes	9e757d9c6d	Merge 'De-globalize storage proxy' from Pavel Emelyanov All users of global proxy are gone (), proxy can be made fully main/cql_test_env local. () one test case still needs it, but can get it via cql_test_env Closes #13616 * github.com:scylladb/scylladb: code: Remove global proxy schema_change_test: Use proxy from cql_test_env test: Carry proxy reference on cql_test_env	2023-04-24 09:38:00 +03:00
Botond Dénes	1750bb34b7	Merge 'sstables, replica: add generation generator' from Kefu Chai this is the first step to the uuid-based generation identifier. the goal is to encapsulate the generation related logic in generator, so its consumers do not have to understand the difference between the int64_t based generation and UUID v1 based generation. this commit should not change the behavior of existing scylla. it just allows us to derive from `generation_generator` so we can have another generator which generates UUID based generation identifier. Closes #13073 * github.com:scylladb/scylladb: replica, test: create generation id using generator sstables: add generation_generator test: sstables: use generate_n for generating ids for testing	2023-04-24 09:31:08 +03:00
Botond Dénes	85abece927	Merge 'Restrict logging of current_backtrace to log_level' from Benny Halevy `seastar::current_backtrace()` can be quite heavy. When we pass it to a log message in relatively detailed log_level (debug/trace), we pay the price of `current_backtrace` every time, but we rarely print the message. Closes #13527 * github.com:scylladb/scylladb: locator/topology: call seastar::current_backtrace only when log_level is enabled schema_tables: call seastar::current_backtrace only when log_level is enabled	2023-04-24 08:50:32 +03:00
Botond Dénes	7f04d8231d	Merge 'gms: define and use generation and version types' from Benny Halevy This series cleans up the generation and value types used in gms / gossiper. Currently we use a blend of int, int32_t, and int64_t around messaging. This change defines gms::generation_type and gms::version_type as int32_t and adds a check in non-release modes that the respective int64 values passed over messaging do not overflow 32 bits. Closes #12966 * github.com:scylladb/scylladb: gossiper: version_generator: add {debug_,}validate_gossip_generation gms: gossip_digest: use generation_type and version_type gms: heart_beat_state: use generation_type and version_type gms: versioned_value: use version_type gms: version_generator: define version_type and generation_type strong types utils: move generation-number to gms utils: add tagged_integer gms: versioned_value: make members private scylla-gdb: add get_gms_versioned_value gms: versioned_value: delete unused compare_to function gms: gossip_digest: delete unused compare_to function	2023-04-24 08:44:48 +03:00
Maxim Korolyov	002bdd7ae7	doc: add jaeger integration docs Closes #13490	2023-04-24 08:26:53 +03:00
Chang Chen Chien	c25a718008	docs: fix typo in using-scylla/local-secondary-indexes.rst Closes #13607	2023-04-24 06:56:19 +03:00
Benny Halevy	002865018f	utils: clear_gently: add variants for optional values Implement clear_gently for std::optional<T> and seastar::optimized_optional<T> and respective unit tests. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 21:34:02 +03:00
Benny Halevy	12877ad026	utils: clear_gently: do not clear null unique_ptr Otherwise the null pointer is dereferenced. Add a unit test reproducing the issue and testing this fix. Fixes #13636 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 21:33:11 +03:00
Pavel Emelyanov	5e201b9120	database: Remove compaction_manager.hh inclusion into database.hh The only reason why it's there (right next to compaction_fwd.hh) is because the database::table_truncate_state subclass needs the definition of compaction_manager::compaction_reenabler subclass. However, the former sub is not used outside of database.cc and can be defined in .cc. Keeping it outside of the header allows dropping the compaction_manager.hh from database.hh thus greatly reducing its fanout over the code (from ~180 indirect inclusions down to ~20). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13622	2023-04-23 16:27:11 +03:00
Benny Halevy	5520d3a8e3	gossiper: version_generator: add {debug_,}validate_gossip_generation Make sure that the int64_t generation we get over rpc fits in the int32_t generation_type we keep locally. Restrict this assertion to non-release builds. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:48:01 +03:00
Benny Halevy	5dc7b7811c	gms: gossip_digest: use generation_type and version_type Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:48:01 +03:00
Benny Halevy	4cdad8bc8b	gms: heart_beat_state: use generation_type and version_type Define default constructor as heart_beat_state(gms::generation_type(0)) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:48:01 +03:00
Benny Halevy	b638571cb0	gms: versioned_value: use version_type Adjust scylla-gdb.get_gms_version_value to get the versioned_value version as version_type (utils::tagged_integer). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:48:01 +03:00
Benny Halevy	2d20ee7d61	gms: version_generator: define version_type and generation_type strong types Derived from utils::tagged_integer, using different tags, the types are incompatible with each other and require explicit typecasting to- and from- their value type. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:47:17 +03:00
Benny Halevy	d1817e9e1b	utils: move generation-number to gms Although get_generation_number implementation is completely generic, it is used exclusively to seed the gossip generation number. Following patches will define a strong gms::generation_id type and this function should return it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Benny Halevy	f5f566bdd8	utils: add tagged_integer A generic template for defining strongly typed integer types. Use it here to replace raft::internal::tagged_uint64. Will be used for defining gms generation and version as strong and distinguishable types in following patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Benny Halevy	c5d819ce60	gms: versioned_value: make members private and provide accessor functions to get them. 1. So they can't be modified by mistake, as the versioned value is immutable. A new value must have a higher version. 2. Before making the version a strong gms::version_type. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Benny Halevy	5aaec73612	scylla-gdb: add get_gms_versioned_value Prepare for next patch that makes gms::versioned_value members private, and provides methods by the same name as the current members. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Benny Halevy	44a8db016a	gms: versioned_value: delete unused compare_to function Not only it is unused, it is wrong since it doesn't compare the value, only its version. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Benny Halevy	59e771be5c	gms: gossip_digest: delete unused compare_to function Not only it is unused, it is wrong since it doesn't compare the digest endpoint member. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-23 08:37:32 +03:00
Kefu Chai	c2488fc516	test: object_store: specify timeout just in case scylla does not behave as expected, so we can identify the issue and error out sooner without hanging forever until the whole test times out. this issue was identified by pylint, see https://pylint.readthedocs.io/en/latest/user_guide/messages/warning/missing-timeout.html Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-22 00:38:37 +08:00
Tomasz Grabiec	bd0b299322	Merge 'Manage CDC generations when bootstrapping nodes using Raft Group 0 topology coordinator' from Kamil Braun Introduce a new table `CDC_GENERATIONS_V3` (`system.cdc_generations_v3`). The table schema is a copy-paste of the `CDC_GENERATIONS_V2` schema. The difference is that V2 lives in `system_distributed_keyspace` and writes to it are distributed using regular `storage_proxy` replication mechanisms based on the token ring. The V3 table lives in `system_keyspace` and any mutations written to it will go through group 0. Extend the `TOPOLOGY` schema with new columns: - `new_cdc_generation_data_uuid` will be stored as part of a bootstrapping node's `ring_slice`, it stores UUID of a newly introduced CDC generation which is used as partition key for the `CDC_GENERATIONS_V3` table to access this new generation's data. It's a regular column, meaning that every row (corresponding to a node) will have its own. - `current_cdc_generation_uuid` and `current_cdc_generation_timestamp` together form the ID of the newest CDC generation in the cluster. (the uuid is the data key for `CDC_GENERATIONS_V3`, the timestamp is when the CDC generation starts operating). Those are static columns since there's a single newest CDC generation. When topology coordinator handles a request for node to join, calculate a new CDC generation using the bootstrapping node's tokens, translate it to mutation format, and insert this mutation to the CDC_GENERATIONS_V3 table through group 0 at the same time we assign tokens to the node in Raft topology. The partition key for this data is stored in the bootstrapping node's `ring_slice`. After inserting new CDC generation data , we need to pick a timestamp for this generation and commit it, telling all nodes in the cluster to start using the generation for CDC log writes once their clocks cross that timestamp. We introduce a separate step to the bootstrap saga, before `write_both_read_old`, called `commit_cdc_generation`. 
In this step, the coordinator takes the `new_cdc_generation_data_uuid` stored in a bootstrapping node's `ring_slice` - which serves as the key to the table where the CDC generation data is stored - and combines it with a timestamp which it generates a bit into the future (as in old gossiper-based code, we use 2 * ring_delay, by default 1 minute). This gives us a CDC generation ID which we commit into the topology state as the `current_cdc_generation_id` while switching the saga to the next step, `write_both_read_old`. Once a new CDC generation is committed to the cluster by the topology coordinator, we also need to publish it to the user-facing description tables so CDC applications know which streams to read from. This uses regular distributed table writes underneath (tables living in the `system_distributed` keyspace) so it requires `token_metadata` to be nonempty. We need a hack for the case of bootstrapping the first node in the cluster - turning the tokens into normal tokens earlier in the procedure in `token_metadata`, but this is fine for the single-node case since no streaming is happening. When a node notices that a new CDC generation was introduced in `storage_service::topology_state_load`, it updates its internal data structures that are used when coordinating writes to CDC log tables. We include the current CDC generation data in topology snapshot transfers. Some fixes and refactors included. 
Closes #13385 * github.com:scylladb/scylladb: docs: cdc: describe generation changes using group 0 topology coordinator cdc: generation_service: add a FIXME cdc: generation_service: add legacy_ prefix for gossiper-based functions storage_service: include current CDC generation data in topology snapshots db: system_keyspace: introduce `query_mutations` with range/slice storage_service: hold group 0 apply mutex when reading topology snapshot service: raft_group0_client: introduce `hold_read_apply_mutex` storage_service: use CDC generations introduced by Raft topology raft topology: publish new CDC generation to the user description tables raft topology: commit a new CDC generation on node bootstrap raft topology: create new CDC generation data during node bootstrap service: topology_state_machine: make topology::find const db: system_keyspace: small refactor of `load_topology_state` cdc: generation: extract pure parts of `make_new_generation` outside db: system_keyspace: add storage for CDC generations managed by group 0 service: topology_state_machine: better error checking for state name (de)serialization service: raft: plumbing `cdc::generation_service&` cdc: generation: `get_cdc_generation_mutations`: take timestamp as parameter cdc: generation: make `topology_description_generator::get_sharding_info` a parameter sys_dist_ks: make `get_cdc_generation_mutations` public sys_dist_ks: move find_schema outside `get_cdc_generation_mutations` sys_dist_ks: move mutation size threshold calculation outside `get_cdc_generation_mutations` service/raft: group0_state_machine: signal topology state machine in `load_snapshot`	2023-04-21 18:11:27 +02:00
Kefu Chai	f85da1bd30	test: object_store: s/exit/sys.exit/ the former is expected to be used in an interactive session, not in an application. see also: https://docs.python.org/3/library/constants.html#constants-added-by-the-site-module and https://docs.python.org/3/library/sys.html#sys.exit Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 23:25:59 +08:00
Kefu Chai	c7b62fbf81	test: object_store: do not declare a global variable for read we only need to declare a variable with `global` when we need to write to it, but if we just want to read it, there is no need to declare it. because the way how python looks up for a variable when reading from it enables python to find the global variables (and apparently the functions!). but when we assign a variable in python, the interpreter would have to tell in which scope the variable lives. by default the local scope is used, and a new variable is added to `locals()`. but in this case, we just read from it. so no need to add the `global` statement. see also https://docs.python.org/3/reference/simple_stmts.html#global Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 23:25:59 +08:00
Kefu Chai	4989a59a0b	test: object_store: remove unused imports Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 23:25:59 +08:00
Pavel Emelyanov	2aabaada9e	system_keyspace: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 17:32:57 +03:00
Pavel Emelyanov	6290849f11	system_keyspace: Coroutinize get_compaction_history() In order not to copy the rvalue consumer arg -- instantly convert it into value. No other tricks. Indentation is deliberately left broken. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 17:32:02 +03:00
Kefu Chai	576adbdbc5	replica, test: create generation id using generator reuse generation_generator for generating generation identifiers to reduce repetition. also, allow the generator to update its latest known generation id. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 22:02:30 +08:00
Kefu Chai	6e82aa42d5	sstables: add generation_generator to prepare for the uuid-based generation identifier, where we will generate a uuid-based generation identifier if the corresponding option is enabled, otherwise an integer based id. to reduce repetition, generation_generator is extracted out so it can be reused. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 21:51:13 +08:00
Anna Stuchlik	a68b976c91	doc: document `tombstone_gc` as not experimental The tombstone_gc was documented as experimental in version 5.0. It is no longer experimental in version 5.2. This commit updates the information about the option. Closes #13469	2023-04-21 14:43:25 +02:00
Botond Dénes	fcd7f6ac5f	Update tools/java submodule * tools/java c9be8583...eb3c43f8 (1): > Use EstimatedHistogram in metricPercentilesAsArray	2023-04-21 14:31:38 +03:00
Kefu Chai	a2aa133822	treewide: use std::lexicographical_compare_three_way the standard library offers `std::lexicographical_compare_three_way()`, and we never use the last two additional parameters which are not provided by `std::lexicographical_compare_three_way()`. there is no need to have the homebrew version of the trichotomic compare function. in this change, * all occurrences of `lexicographical_tri_compare()` are replaced with `std::lexicographical_compare_three_way()`. * `lexicographical_tri_compare()` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13615	2023-04-21 14:28:18 +03:00
Kefu Chai	51fc0bc698	sstables: use default generated operator== C++20 compiler is able to generate defaulted operator== and operator!=. and the default generated operators behave exactly the same as the ones crafted by us. so let it do its job. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13614	2023-04-21 14:25:39 +03:00
Pavel Emelyanov	739455c3aa	code: Remove global proxy No code needs global proxy anymore. Keep on-stack values in main and cql_test_env and keep the pointer on debug:: namespace. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:18:59 +03:00
Pavel Emelyanov	f953fb2f52	schema_change_test: Use proxy from cql_test_env There's one place where a test case calls for storage proxy and currently does it via the global reference. Time to switch it to cql_test_env's one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:18:00 +03:00
Pavel Emelyanov	681a19f54c	test: Carry proxy reference on cql_test_env All sharded<> services are created by cql_test_env on the stack. The cql_test_env() is then used to keep references on some of them and to export them to test cases via its methods. Proxy is missing on that exportable list, but will be needed, so add one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-21 14:16:54 +03:00
Botond Dénes	10c1f1dc80	Merge 'db: system_keyspace: use microsecond resolution for group0_history range tombstone' from Kamil Braun in `make_group0_history_state_id_mutation`, when adding a new entry to the group 0 history table, if the parameter `gc_older_than` is engaged, we create a range tombstone in the mutation which deletes entries older than the new one by `gc_older_than`. In particular if `gc_older_than = 0`, we want to delete all older entries. There was a subtle bug there: we were using millisecond resolution when generating the tombstone, while the provided state IDs used microsecond resolution. On a super fast machine it could happen that we managed to perform two schema changes in a single millisecond; this happened sometimes in `group0_test.test_group0_history_clearing_old_entries` on our new CI/promotion machines, causing the test to fail because the tombstone didn't clear the entry corresponding to the previous schema change when performing the next schema change (since they happened in the same millisecond). Use microsecond resolution to fix that. The consecutive state IDs used in group 0 mutations are guaranteed to be strictly monotonic at microsecond resolution (see `generate_group0_state_id` in service/raft/raft_group0_client.cc). Fixes #13594 Closes #13604 * github.com:scylladb/scylladb: db: system_keyspace: use microsecond resolution for group0_history range tombstone utils: UUID_gen: accept decimicroseconds in min_time_UUID	2023-04-21 14:08:56 +03:00
Kamil Braun	55f43e532c	Merge 'get rid of gms/failure_detector' from Benny Halevy Move gms::arrival_window to api/failure_detector, which is its only user, and get rid of the rest, which is not used now that we use direct_failure_detector instead. TODO: integrate direct_failure_detector with failure_detector api. Closes #13576 * github.com:scylladb/scylladb: gms: get rid of unused failure_detector api: failure_detector: remove false dependency on failure_detector::arrival_window test: rest_api: add test_failure_detector	2023-04-21 11:47:44 +02:00
Kamil Braun	f7408130c9	Merge 'Fix topology management when raft-based topology is enabled' from Tomasz Grabiec Fixes a problem when raft-based topology is enabled, which loads topology from storage. It starts by clearing topology and then adding nodes one by one. Before this patch, this violates an internal invariant of the topology object which puts the local node as the first node. This would manifest by triggering an assert in topology::pop_node() which throws if popping the node at index 0 in order to keep the information about the local node around. This is normally prevented by a check in topology::remove_node() which avoids calling pop_node() if removing the local node. But since there is no node which is marked as local, this check allows the first node to be popped. To fix the problem I lift the invariant that the local node is always in _nodes. We still have information about the local node in config. Instead of keeping it in _nodes, we recognize it as part of indexing. We also allow removing the local node like a regular node. The path which reloads topology works correctly after this, the local node will be recognized when (if) it is added to the topology. Fixes #13495 Closes #13498 * github.com:scylladb/scylladb: locator: topology: Fix move assignment locator: topology: Add printer tests: topology: Test that topology clearing preserves information about local node locator: topology: Recognize local node as part of indexing it locator: topology: Fix get_location(ep) for local node locator: topology: Fix typo locator: topology: Preserve config when cloning	2023-04-21 11:45:08 +02:00
Alejo Sanchez	ce87aedd30	test: topology smp test with custom cluster Instead of decommission of initial cluster, use custom cluster. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13589	2023-04-21 10:43:54 +02:00
Kamil Braun	f9d8118c8d	db: system_keyspace: use microsecond resolution for group0_history range tombstone in `make_group0_history_state_id_mutation`, when adding a new entry to the group 0 history table, if the parameter `gc_older_than` is engaged, we create a range tombstone in the mutation which deletes entries older than the new one by `gc_older_than`. In particular if `gc_older_than = 0`, we want to delete all older entries. There was a subtle bug there: we were using millisecond resolution when generating the tombstone, while the provided state IDs used microsecond resolution. On a super fast machine it could happen that we managed to perform two schema changes in a single millisecond; this happened sometimes in `group0_test.test_group0_history_clearing_old_entries` on our new CI/promotion machines, causing the test to fail because the tombstone didn't clear the entry corresponding to the previous schema change when performing the next schema change (since they happened in the same millisecond). Use microsecond resolution to fix that. The consecutive state IDs used in group 0 mutations are guaranteed to be strictly monotonic at microsecond resolution (see `generate_group0_state_id` in service/raft/raft_group0_client.cc). Fixes #13594	2023-04-21 10:33:05 +02:00
Kamil Braun	218a056825	utils: UUID_gen: accept decimicroseconds in min_time_UUID The function now accepts higher-resolution duration types, such as microsecond resolution timestamps. Will be used by the next commit.	2023-04-21 10:33:02 +02:00
Kefu Chai	b0ef053552	test: sstables: use generate_n for generating ids for testing so we don't need to keep a `prev_gen` around, this also prepares for the coming change to use generation generator. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 15:45:16 +08:00
Kefu Chai	ca6ebbd1f0	cql3, db: sstable: specialize fmt::formatter<function_name> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `function_name` without the help of `operator<<`. the corresponding `operator<<()` are dropped dropped in this change, as all its callers are now using fmtlib for formatting now. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13608	2023-04-21 10:07:28 +03:00
Botond Dénes	d74f3598f4	Merge 'dht: specialize fmt::formatter<dht::token>' from Kefu Chai this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `dht::token` without the help of `operator<<`. the corresponding `operator<<()` is preserved in this change, as it has lots of users in this project, we will tackle them case-by-case in follow-up changes. also, the forward declaration of `operator<<(ostream&, constdht::token&)` in `dht/i_partitioner.hh` is removed. ias it not necessary. Refs https://github.com/scylladb/scylladb/issues/13245 Closes #13610 * github.com:scylladb/scylladb: dht: remove unnecessarily forward declaration dht: specialize fmt::formatter<dht::token>	2023-04-21 09:51:25 +03:00
Kefu Chai	c5fa1ac9f7	sstable: specialize fmt::formatter<component_type> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `component_type` without the help of `operator<<`. the corresponding `operator<<()` are dropped dropped in this change, as all its callers are now using fmtlib for formatting now. also, please note, to enable fmtlib to format `std::set<component_type>` in `test/boost/sstable_3_x_test.cc` , we need to include `<fmt/ranges.h>` in that source file. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13598	2023-04-21 09:49:24 +03:00
Kefu Chai	9215adee46	streaming: specialize fmt::formatter<stream_reason> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `stream_reason` without the help of `operator<<`. please note, because we still cannot use the generic formatter for std::unordered_map provided by fmtlib, so in order to drop `operator<<` for `stream_reason`, and to print `unordered_map<stream_reason>`, `fmt::join()` is used as a temporary solution. we will audit all `fmt::join()` calls, after removing the homebrew formatter of `std::unordered_map`. the corresponding `operator<<()` are dropped dropped in this change, as all its callers are now using fmtlib for formatting now. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13609	2023-04-21 09:44:23 +03:00
Kefu Chai	ecb5380638	treewide: s/boost::lexical_cast<std::string>/fmt::to_string()/ this change replaces all occurrences of `boost::lexical_cast<std::string>` in the source tree with `fmt::to_string()`. for couple reasons: * `boost::lexical_cast<std::string>` is longer than `fmt::to_string()`, so the latter is easier to parse and read. * `boost::lexical_cast<std::string>` creates a stringstream under the hood, so it can use the `operator<<` to stringify the given object. but stringstream is known to be less performant than fmtlib. * we are migrating to fmtlib based formatting, see #13245. so using `fmt::to_string()` helps us to remove yet another dependency on `operator<<`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13611	2023-04-21 09:43:53 +03:00
Benny Halevy	3f1ac846d8	gms: get rid of unused failure_detector The legacy failure_detector is now unused and can be removed. TODO: integrate direct_failure_detector with failure_detector api. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-21 09:08:27 +03:00
Benny Halevy	d546b92685	api: failure_detector: remove false dependency on failure_detector::arrival_window Up until `0ef33b71ba` get_endpoint_phi_values retrieved arrival samples from gms::get_arrival_samples(). That function was removed since it returned a constant empty map. This patch returns empty results without relying on failure_detector::arrival_window, so the latter can be retired altogether. As Tomasz Grabiec <tgrabiec@scylladb.com> said: > I don't think the logic of arrival_window belongs to api, > it belongs to the failure detector. If there is no longers > a failure detector, there should be no arrival_window. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-21 09:08:25 +03:00
Benny Halevy	35de60670c	test: rest_api: add test_failure_detector Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-21 09:06:15 +03:00
Nadav Har'El	9c3907bb3c	test/cql-pytest: reproducers for incorrect AVG of "decimal" type This patch contains tests reproducing issue #13601 and the corresponding Cassandra issue CASSANDRA-18470. These issues are about what the AVG aggregation does for arbitrary-precision "decimal" numbers - the tests we add here show examples where the current behavior doesn't make sense: The problem is that "decimal" has arbitrary precision - so, should an average of 1/3 be returned as 0.3 or 0.33333333333333333? This is not specified, so Scylla (and Cassandra) decided to pick the result precision based on the input precision. In particular, the average of 1 and 2 is returned as 2 (zero digits after the decimal point, like in the inputs) instead of the expected 1.5. Arguably this isn't useful behavior. The test adds a second test which fails on Cassandra, but does pass on Scylla: Cassandra returns as the average of 1, 2, 2, 3 the integer 1 whereas the correct average is 2 (and Scylla returns it correctly). The reason why this bug is even worse on Cassandra is that Scylla's AVG only loses precision when dividing the sum and count, but Cassandra tries to maintain only the average, and loses precision at every step. Refs #13601 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13603	2023-04-21 08:32:30 +03:00
Kefu Chai	7b21bfd36e	mutation: specialize fmt::formatter<apply_resume> this is part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `apply_resume` without the help of `operator<<`. the corresponding `operator<<()` is dropped in this change, as all its callers now use fmtlib for formatting. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13584	2023-04-21 08:27:57 +03:00
Benny Halevy	77b70dbdb7	sstables: compressed_file_data_source_impl: get: throw malformed_sstable_exception on premature eof Currently, the reader might dereference a null pointer if the input stream reaches eof prematurely, and read_exactly returns an empty temporary_buffer. Detect this condition before dereferencing the buffer and throw sstables::malformed_sstable_exception. Fixes #13599 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13600	2023-04-21 07:56:58 +03:00
Botond Dénes	d828cfcb23	Merge 'db, cql3: functions: switch argument passing to std::span' from Avi Kivity Database functions currently receive their arguments as an std::vector. This is inflexible (for example, one cannot use small_vector to reduce allocations). This series adapts the function signature to accept parameters using std::span. Some changes in the keys interface are needed to support this. Lastly, one call site is migrated to small_vector. This is in support of changing selectors to use expressions. Closes #13581 * github.com:scylladb/scylladb: cql3: abstract_function_selector: use small_vector for argument buffer db, cql3: functions: pass function parameters as a span instead of a vector keys: change from_optional_exploded to accept a span instead of a vector	2023-04-21 06:49:07 +03:00
Kefu Chai	fe9f41bd84	dht: remove unnecessary forward declaration it turns out the declaration of `operator<<(ostream&, const dht::token&)` is unnecessary. so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 11:41:54 +08:00
Kefu Chai	53dedca8cd	dht: specialize fmt::formatter<dht::token> this is part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `dht::token` without the help of `operator<<`. the corresponding `operator<<()` is preserved in this change, as it has lots of users in this project; we will tackle them case-by-case in follow-up changes. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-21 11:41:54 +08:00
Avi Kivity	0c64dd12b1	test: raft_server_test: fix string compare for clang 15 Clang 15 rejects string compares where the left-hand-side is a C string, so help it along by converting it ourselves. Closes #13582	2023-04-21 06:38:10 +03:00
Tomasz Grabiec	0ec700cd00	locator: topology: Fix move assignment Defaulted assignment doesn't update node::_topology.	2023-04-20 23:39:18 +02:00
Tomasz Grabiec	6ed841b8d7	locator: topology: Add printer	2023-04-20 23:39:18 +02:00
Tomasz Grabiec	3dfd49fe62	tests: topology: Test that topology clearing preserves information about local node	2023-04-20 23:39:18 +02:00
Tomasz Grabiec	7d3384089a	locator: topology: Recognize local node as part of indexing it Fixes a problem when raft-based topology is enabled, which loads topology from storage. It starts by clearing topology and then adding nodes one by one. Before this patch, this violates internal invariant of topology object which puts the local node as the first node. This would manifest by triggering an assert in topology::pop_node() which throws if popping the node at index 0 in order to keep the information about local node around. This is normally prevented by a check in topology::remove_node() which avoid calling pop_node() if removing the local node. But since there is no node which is marked as local, this check allows the first node to be popped. To fix the problem I lift the invariant that local node is always in _nodes. We still have information about local node in config. Instead of keeping it in _nodes, we recognize it as part of indexing. We also allow removing the local node like a regular node. The path which reloads topology works correctly after this, the local node will be recognized when (if) it is added to the topology. Fixes #13495	2023-04-20 23:39:18 +02:00
Tomasz Grabiec	eb9d6df8bf	locator: topology: Fix get_location(ep) for local node topology config may designate a different node than get_broadcast_address() as local node. In particular, some tests don't designate any node as the local node, which leads to logic errors where current get_location(ep) for ep which happens to have the address 127.0.0.1 returns location of the first node in _nodes rather than ep. Fix by looking up in _nodes first and fall back to local node if it's equal to configured local node (if any).	2023-04-20 23:39:18 +02:00
Tomasz Grabiec	0a675291dd	locator: topology: Fix typo	2023-04-20 23:39:18 +02:00
Tomasz Grabiec	0b1dfb2683	locator: topology: Preserve config when cloning Config is separate from state of the topology (nodes it contains). Preserving the config will make it easier in later patches to maintain invariants for cloned instances.	2023-04-20 23:39:18 +02:00
Botond Dénes	1426c623eb	Merge 'Tune up S3 unit tests environment usage (and a bit more)' from Pavel Emelyanov The tests in question use the MINIO_SERVER_ADDRESS environment variable to export the minio server address from pylib to test cases. They also use a hard-coded public bucket name. Both play badly with AWS S3, the former due to MINIO_... in its name and the latter because the public bucket name can be anything. So this PR puts the address and public bucket name into S3_..._FOR_TEST environment variables and fixes output stream closure on failure while at it. Detached from #13493 Closes #13546 * github.com:scylladb/scylladb: s3/test: Rename MINIO_SERVER_ADDRESS environment variable s3/test: Keep public bucket name in environment s3/test: Fix upload stream closure test/lib: Add getenv_safe() helper	2023-04-20 18:01:12 +03:00
Kamil Braun	88aff50e8b	docs: cdc: describe generation changes using group 0 topology coordinator Update the `Generation switching` section: most of the existing description landed in `Gossiper-based topology changes` subsection, and a new subsection was added to describe Raft group 0 based topology changes. Marked as WIP - we expect further development in this area soon. The existing gossiper-based description was also updated a bit.	2023-04-20 16:36:41 +02:00
Kamil Braun	1688001585	cdc: generation_service: add a FIXME	2023-04-20 16:36:41 +02:00
Kamil Braun	d13a0b1930	cdc: generation_service: add legacy_ prefix for gossiper-based functions Most of the code in the service exists to handle gossiper-based topology changes. Name the functions appropriately and add a note in the comments.	2023-04-20 16:36:41 +02:00
Kamil Braun	8afb15700b	storage_service: include current CDC generation data in topology snapshots Note that we don't need to include earlier CDC generations, just the current (i.e. latest) one. We might observe a problem when nodes are being bootstrapped in quick succession - I left a FIXME describing the problem and possible solutions.	2023-04-20 16:36:41 +02:00
Kamil Braun	3d96bc5dba	db: system_keyspace: introduce `query_mutations` with range/slice There is a `query_mutations` function which loads the entire contents of a given table into memory. There was no function for e.g. loading just a single partition in the form of mutations. Introduce one.	2023-04-20 16:36:41 +02:00
Kamil Braun	3b26135227	storage_service: hold group 0 apply mutex when reading topology snapshot This is a bugfix: we need to hold the mutex when loading topology data from tables, otherwise they might be concurrently modified by `group0_state_machine::apply` and the snapshot that we send won't make any sense. Also specify in comments that the lock must be held during `topology_transition`, `topology_state_load`, `merge_topology_snapshot`.	2023-04-20 16:36:41 +02:00
Kamil Braun	f081de7cc5	service: raft_group0_client: introduce `hold_read_apply_mutex` We'll use it in `storage_service` topology snapshot request handler.	2023-04-20 16:36:41 +02:00
Kamil Braun	4c99b4004b	storage_service: use CDC generations introduced by Raft topology When a node notices that a new CDC generation was introduced in `storage_service::topology_state_load`, it updates its internal data structures that are used when coordinating writes to CDC log tables.	2023-04-20 16:36:41 +02:00
Kamil Braun	5f2b297f99	raft topology: publish new CDC generation to the user description tables Once a new CDC generation is committed to the cluster by the topology coordinator, we also need to publish it to the user-facing description tables so CDC applications know which streams to read from. This uses regular distributed table writes underneath (tables living in the `system_distributed` keyspace) so it requires `token_metadata` to be nonempty. We need a hack for the case of bootstrapping the first node in the cluster - turning the tokens into normal tokens earlier in the procedure in `token_metadata`, but this is fine for the single-node case since no streaming is happening.	2023-04-20 16:36:41 +02:00
Kamil Braun	58baf998c1	raft topology: commit a new CDC generation on node bootstrap After inserting new CDC generation data (see previous commit), we need to pick a timestamp for this generation and commit it, telling all nodes in the cluster to start using the generation for CDC log writes once their clocks cross that timestamp. We introduce a separate step to the bootstrap saga, before `write_both_read_old`, called `commit_cdc_generation`. In this step, the coordinator takes the `new_cdc_generation_data_uuid` stored in a bootstrapping node's `ring_slice` - which serves as the key to the table where the CDC generation data is stored - and combines it with a timestamp which it generates a bit into the future (as in old gossiper-based code, we use 2 * ring_delay, by default 1 minute). This gives us a CDC generation ID which we commit into the topology state as the `current_cdc_generation_id` while switching the saga to the next step, `write_both_read_old`. `system_keyspace::load_topology_state` is extended to load `current_cdc_generation_id`. For now, nodes don't react to `current_cdc_generation_id`. In later commit we'll extend `storage_service::topology_state_load` to start using the current CDC generation for CDC log table writes. The solution with specifying a timestamp into the future is the same as it is for gossip-based topology changes and it has the same consistency problem - if some node is temporarily partitioned away from the quorum, it might not learn about the new CDC generation before its clock crosses the generation's timestamp, causing it to temporarily send writes to the wrong CDC streams (until it learns about the new timestamp). I left a FIXME which describes an alternative solution which wasn't viable for gossiper-based topology changes, but it is viable when we have a fault-tolerant topology coordinator.	2023-04-20 16:36:41 +02:00
Kamil Braun	5942237a79	raft topology: create new CDC generation data during node bootstrap Calculate a new CDC generation using the bootstrapping node's tokens, translate it to mutation format, and insert this mutation to the CDC_GENERATIONS_V3 table through group 0 at the same time we assign tokens to the node in Raft topology. The partition key for this data is stored in the bootstrapping node's `ring_slice`. The data is inserted, but it's not used for anything yet, we'll do it in later commits. Two FIXMEs are left for follow-ups: - in `get_sharding_info` we shouldn't have to use the token owner's IP, but get the host ID directly from token metadata (#12279), - splitting the CDC generation data write into multiple commands. The comment elaborates.	2023-04-20 16:35:37 +02:00
Pavel Emelyanov	30b6f34a0b	s3/client: Explicitly set _upload_id empty when completing The upload_sink::_upload_id remains empty until upload starts, remains non-empty while it proceeds, then becomes empty again after it completes. The upload_started() method checks that, and on .close() a started upload is aborted. The final switch to empty is done by std::move()ing the upload id into the completion request, but it's better to use std::exchange() to emphasize the fact that the _upload_id becomes empty at that point for a reason. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13570	2023-04-20 17:32:08 +03:00
Kamil Braun	4e7628fa16	service: topology_state_machine: make topology::find const	2023-04-20 16:16:36 +02:00
Kamil Braun	22094f1509	db: system_keyspace: small refactor of `load_topology_state` The variables necessary for constructing a `ring_slice` are now living in a local block of code. This makes it easier to see which data is part of the `ring_slice` and will make it easier to add more data to `ring_slice` in following commits. Also add some more sanity checking.	2023-04-20 15:40:23 +02:00
Avi Kivity	1cd6d59578	Merge 'Remove global proxy usage from view_info::select_statement()' from Pavel Emelyanov The method needs proxy to get data_dictionary::database from to pass down to select_statement::prepare(). And a legacy bit that can come with data_dictionary::database as well. Fortunately, all the call traces that end up at select_statement() start inside table:: methods that have view_update_generator, or at view_builder::consumer that has reference to view_builder. Both services can share the database reference. However, the call traces in question pass through several code layers, so the PR adds data_dictionary::database to those layers one by one. Closes #13591 * github.com:scylladb/scylladb: view_info: Drop calls to get_local_storage_proxy() view_info: Add data_dictionary argument to select_statement() view_info: Add data_dictionary argument to partition_slice() method view_filter_checking_visitor: Construct with data_dictionary view: Carry data_dictionary arg through standalone helpers view_updates: Carry data_dictionary argument throug methods view_update_builder: Construct with data dictionary table: Push view_update_generator arg to affected_views() view: Add database getters to v._update_generator and v._builder	2023-04-20 16:40:06 +03:00
Kamil Braun	3abe0f0ad6	cdc: generation: extract pure parts of `make_new_generation` outside `cdc::generation_service::make_new_cdc_generation` would create a new CDC generation and insert it into the `CDC_GENERATIONS_V2` table these days. For Raft-based topology changes we'll do the data insertion somewhere else - in topology coordinator code. So extract the parts for calculating the CDC generation to free-standing functions (these are almost pure calculations, modulo accessing RNG).	2023-04-20 15:38:59 +02:00
Kamil Braun	2233d8f54d	db: system_keyspace: add storage for CDC generations managed by group 0 The `CDC_GENERATIONS_V3` table schema is a copy-paste of the `CDC_GENERATIONS_V2` schema. The difference is that V2 lives in `system_distributed_keyspace` and writes to it are distributed using regular `storage_proxy` replication mechanisms based on the token ring. The V3 table lives in `system_keyspace` and any mutations written to it will go through group 0. Also extend the `TOPOLOGY` schema with new columns: - `new_cdc_generation_data_uuid` will be stored as part of a bootstrapping node's `ring_slice`, it stores UUID of a newly introduced CDC generation which is used as partition key for the `CDC_GENERATIONS_V3` table to access this new generation's data. It's a regular column, meaning that every row (corresponding to a node) will have its own. - `current_cdc_generation_uuid` and `current_cdc_generation_timestamp` together form the ID of the newest CDC generation in the cluster. (the uuid is the data key for `CDC_GENERATIONS_V3`, the timestamp is when the CDC generation starts operating). Those are static columns since there's a single newest CDC generation.	2023-04-20 15:38:58 +02:00
Kamil Braun	07382d634a	service: topology_state_machine: better error checking for state name (de)serialization For example: ``` std::ostream& operator<<(std::ostream& os, ring_slice::replication_state s) { os << replication_state_to_name_map[s]; return os; } ``` this would print an empty string if the state was missing from `replication_state_to_name_map` (because `operator[]` default-constructs a value if it's missing). Use `find` instead and make it an error if the state is missing. Also turn `throw std::runtime_error` into `on_internal_error` in deserialization functions because failure to deserialize a state name is an internal error, not a user error.	2023-04-20 15:38:37 +02:00
Kamil Braun	59b692e799	service: raft: plumbing `cdc::generation_service&` Pass a reference to the service into places. It shall be used later, by the group 0 state machine and topology coordinator.	2023-04-20 15:38:37 +02:00
Kamil Braun	1e9cf3badd	cdc: generation: `get_cdc_generation_mutations`: take timestamp as parameter The function would generate a mutation timestamp for itself; take it as a parameter instead. We'll use timestamps provided by Group 0 APIs when creating CDC generations during Group 0-based topology changes.	2023-04-20 15:38:37 +02:00
Kamil Braun	85f4f1830b	cdc: generation: make `topology_description_generator::get_sharding_info` a parameter The function used to obtain the sharding info for a given node (its number of shards and ignore_msb_bits) was using gossiper application states. We want to reuse `topology_description_generator` to build CDC generations when doing Raft Group 0-based topology changes, so make `get_sharding_info` a parameter.	2023-04-20 15:38:37 +02:00
Kamil Braun	3e863d0e58	sys_dist_ks: make `get_cdc_generation_mutations` public It was a `static` function inside system_distributed_keyspace. Later it will be used for another table living in system_keyspace, so move it outside, to the CDC generations module, and make it accessible from other places.	2023-04-20 15:38:37 +02:00
Kamil Braun	ed133db709	sys_dist_ks: move find_schema outside `get_cdc_generation_mutations` The function will be reused for a different table.	2023-04-20 15:38:37 +02:00
Kamil Braun	0e84662910	sys_dist_ks: move mutation size threshold calculation outside `get_cdc_generation_mutations` The function turns a `cdc::topology_description` into a vector of mutations. It decides when to push_back a new mutation (instead of extending an existing one) based on certain parameters. This calculation is specific to where we insert the mutation later. Move the calculation outside, to the function which does the insertion. `get_cdc_generation_mutations` will be used outside this function later.	2023-04-20 15:38:37 +02:00
Kamil Braun	52366f33e5	service/raft: group0_state_machine: signal topology state machine in `load_snapshot` The `_topology_state_machine.event` condition variable should be signalled whenever the topology state is updated, including on snapshot load.	2023-04-20 15:38:37 +02:00
Avi Kivity	43a0b40082	Merge 'Remove global proxy usage from API handlers' from Pavel Emelyanov There are few places in the API handlers that call global proxy for their needs. Most of those places are easy to patch, because proxy is either at http_ctx thing right inside the handler code. Also there's a handler code in view_builder that needs proxy too, but it really needs topology, not proxy, and can get it elsewhere (the handler is coroutinized while at it) Closes #13593 * github.com:scylladb/scylladb: view: Get topology via database tokens view: Indentation fix after previous patch view: Coroutinuze view_builder::view_build_statuses() api: Use ctx.sp in storage service handler api,main: Unset storage_proxy API on stop api: Use ctx.sp in set_storage_proxy() routes	2023-04-20 16:31:31 +03:00
Botond Dénes	66ee73641e	test/cql-pytest/nodetool.py: no_autocompaction_context: use the correct API This `with` context is supposed to disable, then re-enable autocompaction for the given keyspaces, but it used the wrong API for it, it used the column_family/autocompaction API, which operates on column families, not keyspaces. This oversight led to a silent failure because the code didn't check the result of the request. Both are fixed in this patch: * switch to use `storage_service/auto_compaction/{keyspace}` endpoint * check the result of the API calls and report errors as exceptions Fixes: #13553 Closes #13568	2023-04-20 16:21:16 +03:00
Kamil Braun	8d7b5f1710	Merge 'test/pylib: topology fix asyncio fixture and fix logger' from Alecco Remove unnecessary asyncio marker and re-introduce top level logger instance. Closes #13561 * github.com:scylladb/scylladb: test/pylib: add missing logger test/pylib: remove unnecessary asyncio marker	2023-04-20 14:23:05 +02:00
Alejo Sanchez	11561a73cb	test/pylib: ManagerClient helpers to wait for... server to see other servers after start/restart When starting/restarting a server, provide a way to wait for the server to see at least n other servers. Also leave the implementation methods available for manual use and update previous tests, one to wait for a specific server to be seen, and one to wait for a specific server to not be seen (down). Fixes #13147 Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13438	2023-04-20 14:22:31 +02:00
Avi Kivity	342cdb2a63	Update tools/jmx submodule (split Depends line) * tools/jmx 15fd4ca...fdd0474 (1): > dist/debian: split Depends into multiple lines	2023-04-20 15:11:33 +03:00
Pavel Emelyanov	bda2aea5be	view: Get topology via database tokens The view_builder::view_build_statuses() needs topology to walk its nodes. Now it gets one from global proxy via its token metadata, but database also has tokens and view_builder has reference to database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 13:18:14 +03:00
Pavel Emelyanov	403463d7eb	view: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 13:18:14 +03:00
Pavel Emelyanov	257814f443	view: Coroutinize view_builder::view_build_statuses() Easier to patch it this way further. Indentation is deliberately left broken until next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 13:17:59 +03:00
Pavel Emelyanov	ece731301c	api: Use ctx.sp in storage service handler Similarly to previous patch, but from another routes group. The storage service API calls mainly use storage service, but one place needs proxy to call recalculate_schema_version() with. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 13:14:52 +03:00
Pavel Emelyanov	21136058bd	api,main: Unset storage_proxy API on stop So that the routes referencing and using ctx.sp don't step on a proxy that's going to be removed (not now, but some time later) from under them on shutdown. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 13:14:04 +03:00
Pavel Emelyanov	8d490d20dc	api: Use ctx.sp in set_storage_proxy() routes It's already used in many other places, few methods still stick to global proxy usage. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 13:12:49 +03:00
Alejo Sanchez	2c1ba377bf	test/pylib: add missing logger The logger instance was removed in a previous commit but it is used in the wrapper helper. Add it back. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-20 10:36:02 +02:00
Alejo Sanchez	05338a6cd7	test/pylib: remove unnecessary asyncio marker Remove missing asyncio marker for fixture as this is only needed for tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-20 10:36:02 +02:00
Pavel Emelyanov	edcce7d8dd	view_info: Drop calls to get_local_storage_proxy() In both cases the proxy is called to get data_dictionary from. Now it's available as the call argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:46 +03:00
Pavel Emelyanov	3e4fb7cad6	view_info: Add data_dictionary argument to select_statement() This method needs data_dictionary to work. Fortunately, all callers of it already have the dictionary at hand and can just pass it as argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:46 +03:00
Pavel Emelyanov	4375835cdd	view_info: Add data_dictionary argument to partition_slice() method The caller is calculate_affected_clustering_ranges() with dictionary arg, the method needs dictionary to call view_info::select_statement() later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:46 +03:00
Pavel Emelyanov	0aff55cdb2	view_filter_checking_visitor: Construct with data_dictionary The visitor is wait-free helper for matches_view_filter() that has dictionary as its argument. Later the visitor will pass the dictionary to view_info::select_statement(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:46 +03:00
Pavel Emelyanov	837fde84b1	view: Carry data_dictionary arg through standalone helpers There's a bunch of functions in view.{hh|cc} that don't belong to any class and perform view-related calculations for view updates. Lots of them eventually call view_info::select_statement() which will later need the dictionary. By now all those methods' callers have data dictionary at hand and can share it via argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:46 +03:00
Pavel Emelyanov	1301a99ba3	view_updates: Carry data_dictionary argument through methods The goal is to have the dictionary at places that later wrap calls to view_info::select_statement(). This graph of calls starts at the only public view_updates::generate_update() method which, in turn, is called from view_update_builder that already has data dictionary at hand. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:46 +03:00
Pavel Emelyanov	9d3d533561	view_update_builder: Construct with data dictionary The caller is table with view-update-generator at hand (it calls mutate_MV on). Builder here is used as a temporary object that destroys once the caller coroutine co_return-s, so keeping the database obtained from the view-update-generator is safe. Later the v.u.b. object will propagate its data dictionary down the callstacks. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 11:17:38 +03:00
Pavel Emelyanov	4a16ab3bd4	table: Push view_update_generator arg to affected_views() Caller already has it to call mutate_MV() on. The method in question will need the generator in one of the next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 10:42:31 +03:00
Pavel Emelyanov	7ddcd0c918	view: Add database getters to v._update_generator and v._builder Both services carry database which will be used by auxiliary objects like view_updates, view_update_builder, consumer, etc in next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-20 10:41:16 +03:00
Warren Krewenki	73eaebe338	Remove visible :orphan: The text `:orphan:` was showing up in the scylla.yaml documentation with no context. Closes #13524	2023-04-20 08:24:48 +03:00
Avi Kivity	9fb5443f87	cql3: abstract_function_selector: use small_vector for argument buffer abstract_function_selector uses a preallocated vector to store the arguments to aggregate functions, to prevent an allocation for every row. Use small_vector to prevent an allocation per query, if the number of arguments happens to be small. This isn't expected to make a significant performance difference.	2023-04-19 20:42:25 +03:00
Avi Kivity	3e0aacc8b5	db, cql3: functions: pass function parameters as a span instead of a vector Spans are more flexible and can be constructed from any contiguous container (such as small_vector), or a subrange of such a container. This can save allocations, so change the signature to accept a span. Spans cannot be constructed from std::initializer_list, so one such call site is changed to construct a span directly from the single argument.	2023-04-19 20:38:55 +03:00
Avi Kivity	9072763a52	keys: change from_optional_exploded to accept a span instead of a vector A span is more generic than a vector, and can be constructed from any contiguous container (like small_vector), or a subset of a container. To support this, helpers in compound.hh need to use make_iterator_range, since a span doesn't fit the container concept (since spans don't own their contents). This is needed to make a similar change to function evaluation, as the token function passes its parameters to from_optional_exploded().	2023-04-19 20:18:50 +03:00
Avi Kivity	6ca1b14488	Update tools/jmx submodule (drop java 8 on debian) * tools/jmx 3316f7a...15fd4ca (1): > dist/debian: drop dependencies on jdk-8	2023-04-19 19:51:03 +03:00
Botond Dénes	0c430c01e9	Merge 'cql: allow SUM() aggregations which result in a NaN' from Nadav Har'El This short PR fixes a bug in SUM() aggregation where if the data contains +Inf and -Inf the returned sum should be NaN but we returned an error instead. This is a recent regression uncovered by a dtest (see issue #13551), but in the first patch we add additional tests in the cql-pytest framework which reproduce this bug and explore various other areas (wrongly) implicated by the failing dtest. Fixes #13551 Closes #13564 * github.com:scylladb/scylladb: cql3: allow SUM() aggregation to result in a NaN test/cql-pytest: add tests for data casts and inf in sums	2023-04-19 13:50:23 +03:00
Pavel Emelyanov	a77ca69360	s3/test: Rename MINIO_SERVER_ADDRESS environment variable The pylib minio code uses it to export the minio address for tests. This creates unneeded WTFs when running the test over AWS S3, so it's better to rename the variable so that it doesn't mention MINIO at all. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	12c4e7d605	s3/test: Keep public bucket name in environment Local test.py runs minio with the public 'testbucket' bucket and all test cases know that. This series adds an ability to run tests over real S3 so the bucket name should be configurable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	91674da982	s3/test: Fix upload stream closure If multipart upload fails for some reason, the output stream remains unclosed and the respective assertion masks the original failure. Fix that by closing the stream in all cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:51:12 +03:00
Pavel Emelyanov	b239e0d368	test/lib: Add getenv_safe() helper The helper is like ::getenv() but checks if the variable exists and throws descriptive exception. So instead of fatal error: in "...": std::logic_error: basic_string: construction from null is not valid one could get something like fatal error: in "...": std::logic_error: Environment variable ... not set Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-19 12:49:26 +03:00
Botond Dénes	ad065aaa62	Update tools/jmx submodule * tools/jmx e9bfaabd...3316f7a9 (2): > select-java: avoid exec multiple paths > select-java: extract function out	2023-04-19 11:18:19 +03:00
Nadav Har'El	81e0f5b581	cql3: allow SUM() aggregation to result in a NaN When floating-point data contains +Inf and -Inf, the sum is NaN. Our SUM() aggregation calculated this sum correctly, but then instead of returning it, complained that the sum overflowed by narrowing. This was a false positive: The sum() finalizer wanted to test that no precision was lost when casting the accumulator to the result type, so checked that the result before and after the cast are the same. But specifically for NaN, it is never equal to anything - not even to itself. This check is wrong for floating point, but moreover - isn't even necessary when the two types (accumulator type and result type) are identical so in this patch we skip it in this case. Note that in the current code, a different accumulator and result type is only used in the case of integer types; When accumulating floating point sums, the same type is used, so the broken check will be avoided. The test for this issue starts to pass with this patch, so the xfail tag is removed. Fixes #13551 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-19 09:31:41 +03:00
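The false positive described in the commit above can be reproduced outside Scylla. The following is a minimal Python sketch, not the actual C++ finalizer; `narrowing_check_passes` is a hypothetical name standing in for the "same value before and after the cast" check.

```python
import math


def narrowing_check_passes(accumulator: float) -> bool:
    """Hypothetical stand-in for the finalizer's check: cast the accumulator
    to the result type and compare the value before and after the cast."""
    result = float(accumulator)  # same type here, so the cast changes nothing
    return result == accumulator


# +Inf plus -Inf is NaN under IEEE 754 arithmetic.
total = math.inf + (-math.inf)

print(math.isnan(total))              # the sum itself is computed correctly
print(narrowing_check_passes(total))  # the check fails: NaN never equals itself
print(narrowing_check_passes(1.5))    # ordinary values pass the check
```

Because NaN compares unequal to everything, including itself, an equality-based round-trip check reports a loss of precision even though none occurred, which is why skipping the check for identical types avoids the problem.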
Nadav Har'El	5b792dde68	Merge 'Extend aws_sigv4 code to suite S3 client needs' from Pavel Emelyanov The AWS signature-generating code was moved from alternator some time ago as is. Now it's clear in which places it should be extended to work for the S3 client as well. The enhancements are - Support UNSIGNED-PAYLOAD to omit calculating checksums for request body - Include full URL path into the signature, not just hard-coded "/" string - Don't check datestamp expiration if not asked for This is a part of #13493 Closes #13535 * github.com:scylladb/scylladb: utils/aws: Brush up the aws_sigv4.hh header utils/aws: Export timepoint formatter utils/aws: Omit datestamp expiration checks when not needed utils/aws: Add canonical-uri argument utils/aws: Support unsigned-payload signatures	2023-04-18 16:33:52 +03:00
Pavel Emelyanov	9628d07adb	Put storage_service.hh on a diet By removing unneeded headers inclusions. At the cost of few more forward declarations and a couple of extra includes in other .cc files. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13552	2023-04-18 14:53:17 +03:00
Nadav Har'El	78555ba7f1	test/cql-pytest: add tests for data casts and inf in sums This patch adds tests to reproduce issue #13551. The issue, discovered by a dtest (cql_cast_test.py), claimed that either cast() or sum(cast()) from varint type broke. So we add two tests in cql-pytest: 1. A new test file, test_cast_data.py, for testing data casts (a CAST (...) as ... in a SELECT), starting with testing casts from varint to other types. The test uncovers a lot of interesting cases (it is heavily commented to explain these cases) but nothing there is wrong and all tests pass on Scylla. 2. An xfailing test for sum() aggregate of +Inf and -Inf. It turns out that this caused #13551. In Cassandra and older Scylla, the sum returned a NaN. In Scylla today, it generates a misleading error message. As usual, the tests were run on both Cassandra (4.1.1) and Scylla. Refs #13551. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-04-18 13:38:42 +03:00
Anna Stuchlik	3d25edf539	doc: remove the sequential repair option from docs Fixes https://github.com/scylladb/scylladb/issues/12132 The sequential repair mode is not supported. This commit removes the incorrect information from the documentation. Closes #13544	2023-04-18 09:45:48 +03:00
Tomasz Grabiec	a8f8f9f0ea	Merge 'raft topology: store `shard_count` and `ignore_msb` in topology' from Kamil Braun Add new columns to the `system.topology` table: `shard_count` and `ignore_msb`. When a node bootstraps or restarts and observes that the values stored in `topology` are different than the local values, it updates them. This is done in the `update_topology_with_local_metadata` function (the 'metadata' here being the two values). Additional flag persisted in `system.scylla_local` is used to safely avoid performing read barriers when the values didn't change on node restart. A comment in `update_topology_with_local_metadata` explains why this flag is needed. An example use case where `shard_count` and `ignore_msb` are needed is creating CDC generations. Fixes: #13508 Closes #13521 * github.com:scylladb/scylladb: raft topology: update `release_version` in topology on restart raft topology: store `shard_count` and `ignore_msb` in topology	2023-04-18 01:18:50 +02:00
Anna Stuchlik	da7a75fe7e	doc: remove in-memory tables from OSS docs Related: https://github.com/scylladb/scylladb/issues/13119 This commit removes the information about in-memory tables from the Open Source documentation, as it is an Enterprise-only feature. Closes #13496	2023-04-17 16:00:09 +03:00
Botond Dénes	de67978211	Update tools/jmx submodule * tools/jmx 826da61d...e9bfaabd (1): > metrics: revert 'metrics: EstimatedHistogram::getValues() returns bucketOffsets'	2023-04-17 15:42:11 +03:00
Avi Kivity	7724223134	Merge 'utils: big_decimal: optimize big_decimal::compare() and use <=> operator' from Kefu Chai in this series, we use <=> operator to replace `big_decimal::compare()` for better readability. also, we trade the chained ternary expression with a more verbose if-else statement for better performance and readability. Closes #13478 * github.com:scylladb/scylladb: utils: big_decimal: replace compare() with <=> operator utils: big_decimal: optimize big_decimal::compare()	2023-04-17 14:33:53 +03:00
Avi Kivity	7a42927a3d	treewide: stop using 'using namespace std' in namespace scope Such namespace-wide imports can create conflicts between names that are the same in seastar and std, such as {std,seastar}::future and {std,seastar}::format, since we also have 'using namespace seastar'. Replace the namespace imports with explicit qualification, or with specific name imports. Closes #13528	2023-04-17 14:08:37 +03:00
Botond Dénes	38c14a556a	Merge 'A couple of s3/client fixes found when testing over AWS S3' from Pavel Emelyanov This is a part of PR #13493 that contains found fixes for the client code itself. The original PR has some questions to resolve, so it's worth merging the fixes separately. Closes #13534 * github.com:scylladb/scylladb: s3/client: Add comments about multipart upload completion message s3/client: Fix succeeded/failed part upload final checking s3/client: Fix parts to start from 1	2023-04-17 13:33:12 +03:00
Botond Dénes	b8e47569e6	Merge 'doc: extend the information about the recommended RF on the Tracing page' from Anna Stuchlik Fixes https://github.com/scylladb/scylla-doc-issues/issues/823. This PR extends the note on the Tracing page to explain what is meant by setting the RF to ALL and adds a link for reference. Closes #12418 * github.com:scylladb/scylladb: docs: add an explanation to recommendation in the Note box doc: extend the information about the recommended RF on the Tracing page	2023-04-17 13:28:19 +03:00
Anna Stuchlik	2d2d92cf18	docs: add an explanation to recommendation in the Note box	2023-04-17 11:39:06 +02:00
Kamil Braun	a4159cc281	raft topology: update `release_version` in topology on restart Check on node start if local value of `release_version` changed. If it did, update it in `system.topology` like we do with `shard_count` and `ignore_msb`.	2023-04-17 10:52:05 +02:00
Kamil Braun	f9051dccaa	raft topology: store `shard_count` and `ignore_msb` in topology Add new columns to the `system.topology` table: `shard_count` and `ignore_msb`. When a node bootstraps or restarts and observes that the values stored in `topology` are different than the local values, it updates them. This is done in the `update_topology_with_local_metadata` function (the 'metadata' here being the two values). Additional flag persisted in `system.scylla_local` is used to safely avoid performing read barriers when the values didn't change on node restart. A comment in `update_topology_with_local_metadata` explains why this flag is needed. An example use case where `shard_count` and `ignore_msb` are needed is creating CDC generations. Fixes: #13508	2023-04-17 10:45:30 +02:00
Pavel Emelyanov	d09d6adbf4	utils/aws: Brush up the aws_sigv4.hh header Add lost pragma-once directive. Remove the hashers.hh inclusion. It was carried in when the whole code was detached from alternator (`f5de0582c8`), but this header is not needed in the header, only in the .cc file which uses sha256_hasher. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:16:45 +03:00
Pavel Emelyanov	792490e095	utils/aws: Export timepoint formatter The format of timestamp for AWS requests is defined in documentation, there's already the code that prepares it in this form. This patch exports this method so that S3 client could use it in next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:14:45 +03:00
Pavel Emelyanov	706b60a0b0	utils/aws: Omit datestamp expiration checks when not needed The signing code is used in two ways -- by alternator to verify the arrived signed request and by S3 client to prepare the signed request. In the former case date expiration check is performed, but for the latter this is not required, because date stamp is most likely now (or close to it). So this patch makes the orig_datestamp argument optional, meaning that expiration checks can be omitted. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:14:45 +03:00
Pavel Emelyanov	c5ccef078a	utils/aws: Add canonical-uri argument Current signing code hard-codes the "/" as the URL, likely this just works for alternator. For S3 client the URL would include bucket and object name and should thus become the argument, not constant. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:14:45 +03:00
Pavel Emelyanov	8eabe9c4ef	utils/aws: Support unsigned-payload signatures For S3, signing the whole request payload can be too resource-consuming. Fortunately, payload signing is only enforced if used with plain http, but with real S3 we're going to use signed requests over https only (see next patch why). That said, the patch turns body-content into optional reference (i.e. -- a pointer) so that the signing code could inject the UNSIGNED-PAYLOAD mark instead of the payload signature and omit heavy payload signing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:14:45 +03:00
Pavel Emelyanov	7c7a3416c5	s3/client: Add comments about multipart upload completion message The message length is pre-calculated in advance to provide correct content-length request header. This math is not obvious and deserves a comment. Also, the final message preparation code is also implicitly checking if any part failed to upload. There's a comment in the upload_sink's upload_part() method about it, but the finalization place deserves one too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:08:34 +03:00
Pavel Emelyanov	3f86bed600	s3/client: Fix succeeded/failed part upload final checking When all parts upload complete the final message is prepared and sent out to the server. The preparation code is also responsible for checking if all parts uploaded OK by checking the part etag to be non-empty. In that check a misprint crept in -- the whole list is checked to be empty, not the individual etag itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 11:08:15 +03:00
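The misprint described in the commit above is easy to illustrate. This is a minimal Python sketch under assumed names (`all_parts_uploaded_buggy` and `all_parts_uploaded` are hypothetical and do not appear in the S3 client code): a failed part is represented by an empty etag, so the check has to look at each etag rather than at the list as a whole.

```python
def all_parts_uploaded_buggy(part_etags: list[str]) -> bool:
    # Misprint: tests whether the whole list is empty,
    # so a single failed part goes unnoticed.
    return len(part_etags) != 0


def all_parts_uploaded(part_etags: list[str]) -> bool:
    # Intended check: every individual etag must be non-empty.
    return all(etag != "" for etag in part_etags)


etags = ["etag-1", "", "etag-3"]        # the second part failed to upload
print(all_parts_uploaded_buggy(etags))  # True, the failure is missed
print(all_parts_uploaded(etags))        # False, the failure is detected
```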
Botond Dénes	6c889213bf	Merge 'Topology add node exception safety' from Benny Halevy Currently if index_node throws when trying to add an already indexed node, pop_node might unindex the existing node instead of the new one. Instead, with this change, unindex_node looks up the node by its pointer and removes it from the index map only if it's found there, so as to clean up safely after index_node throws (at any stage). Add a unit test to verify that. In addition, added a unit test to reproduce #13502 and test the fix. Closes #13512 * github.com:scylladb/scylladb: test: locator_topology: add test_update_node topology: add_node, unindex_node: make exception safe	2023-04-17 11:02:15 +03:00
Pavel Emelyanov	79379760e6	s3/client: Fix parts to start from 1 The docs say that part numbers should start from 1, while the code follows the tradition and starts from 0. Minio is conveniently incompatible in this sense, so the test had been passing so far. On real S3, part number 0 ends up with a failed request. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-17 10:43:12 +03:00
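The off-by-one described in the commit above comes down to where the part counter starts. A minimal Python sketch, with `number_parts` as a hypothetical helper that is not part of the Scylla client:

```python
def number_parts(chunks):
    """Pair each chunk with its part number, counting from 1 rather than
    from the usual 0, as the commit says the S3 docs require."""
    return [(part_number, chunk)
            for part_number, chunk in enumerate(chunks, start=1)]


print(number_parts([b"first", b"second"]))
```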
Botond Dénes	4c37dc5507	Merge 'keys: specialize fmt::formatter<partition_key> and friends' from Kefu Chai this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Closes #13513 * github.com:scylladb/scylladb: keys: consolidate the formatter for partition_keys keys: specialize fmt::formatter<partition_key> and friends	2023-04-17 10:27:31 +03:00
Benny Halevy	58129fad92	locator/topology: call seastar::current_backtrace only when log_level is enabled `seastar::current_backtrace()` can be quite heavy. When we pass it to a log message in relatively detailed log_level (debug/trace), we pay the price of `current_backtrace` every time, but we rarely print the message. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-16 14:22:06 +03:00
Benny Halevy	490a0ae89b	schema_tables: call seastar::current_backtrace only when log_level is enabled `seastar::current_backtrace()` can be quite heavy. When we pass it to a log message in relatively detailed log_level (debug/trace), we pay the price of `current_backtrace` every time, but we rarely print the message. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-16 14:22:06 +03:00
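The pattern in the two commits above (compute an expensive log argument only when the level is enabled) can be sketched with Python's standard `logging` module. This is an analogy, not the Seastar logger API; `current_backtrace`, `log_with_backtrace` and the call counter are hypothetical names used only for illustration.

```python
import logging
import traceback

logger = logging.getLogger("topology")

backtrace_calls = 0  # counts how often the expensive helper actually runs


def current_backtrace() -> str:
    """Expensive helper: formats the current call stack."""
    global backtrace_calls
    backtrace_calls += 1
    return "".join(traceback.format_stack())


def log_with_backtrace(message: str) -> None:
    # Guard the expensive argument: the backtrace is only collected
    # when the debug level is enabled for this logger.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("%s at %s", message, current_backtrace())


logger.setLevel(logging.INFO)
log_with_backtrace("node updated")   # debug disabled: no backtrace collected
logger.setLevel(logging.DEBUG)
log_with_backtrace("node updated")   # debug enabled: backtrace collected once
```

Without the guard, the argument would be evaluated on every call even though the message is rarely printed.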
Kefu Chai	6bb32efac0	utils: big_decimal: replace compare() with <=> operator now that we are using C++20, it'd be more convenient if we can use the <=> operator for comparing. the compiler creates the 6 other operators for us if the <=> operator is defined. so the code is more compacted. in this change, `big_decimal::compare()` is replaced with `operator<=>`, and its caller is updated accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-15 12:52:30 +08:00
Kefu Chai	e991e6087e	utils: big_decimal: optimize big_decimal::compare() before this change, in the worst case, the underlying `number::compare()` gets called twice, as it is used by Boost::multiprecision to implement the comparing operators of `number`. but since we can have the result in one go, there is no need to perform the comparison multiple times. so, in this change, we just call `number::compare()` explicitly, and use it to implement `compare()`. this should save a call of `number::compare()`. also, the chained ternary expression is replaced using if-else statement for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-15 12:52:30 +08:00
Pavel Emelyanov	c501163f95	Merge 'reader_permit: give better names to active* states' from Botond Dénes The names of these states have been the source of confusion ever since they were introduced. Give them names which better reflects their true meaning and gives less room for misinterpretation. The changes are: * active/unused -> active * active/used -> active/need_cpu * active/blocked -> active/await Hopefully the new names do a better job at conveying what these states really mean: * active - a regular admitted permit, which is active (as opposed to an inactive permit). * active/need_cpu - an active permit which was marked as needing CPU for the read to make progress. This permit prevents admission of new permits while it is in this state. * active/await - a former active/need_cpu permit, which has to wait on I/O or a remote shard. While in this state, it doesn't block the admission of new permits (pending other criteria such as resource availability). Closes #13482 * github.com:scylladb/scylladb: docs/dev/reader-concurrency-semaphore.md: expand on how the semaphore works reader_permit: give better names to active* states	2023-04-14 20:39:05 +03:00
Pavel Emelyanov	4e7f4b9303	Merge 'scripts/open-coredump.sh: allow user to plug in scylla package' from Botond Dénes Lately we have observed that some builds are missing the package_url in the build metadata. This is usually caused by changes in how build metadata is stored on the servers and the s3 reloc server failing to dig them out of the metadata files. A user can usually still obtain the package url but currently there is no way to plug in user-obtained scylla package into the script's workflow. This PR fixes this by allowing the user to provide the package as `$ARTIFACT_DIR/scylla.package` (in unpacked form). Closes #13519 * github.com:scylladb/scylladb: scripts/open-coredump.sh: allow bypassing the package downloading scripts/open-coredump.sh: check presence of mandatory field in build json object scripts/open-coredump.sh: more consistent error messaging	2023-04-14 20:35:06 +03:00
Benny Halevy	e18eb71fa3	test: locator_topology: add test_update_node Reproduces issue fixed in PR #13502 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-14 17:51:07 +03:00
Benny Halevy	e29994b2aa	topology: add_node, unindex_node: make exception safe Currently, if index_node throws when trying to add an already indexed node, pop_node might unindex the existing node instead of the new one. Instead, with this change, unindex_node looks up the node by its pointer and removes it from the index map only if it's found there, so as to clean up safely after index_node throws (at any stage). Add a unit test to verify that. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-14 17:51:05 +03:00
Tomasz Grabiec	952b455310	Merge ' tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes scylla-sstable currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a CQL format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates from. Yet in many cases the sstable is inspected on the same host where it originates from. In these cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool checks the usual locations of the scylla data dir, the scylla configuration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a schema.cql is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like quarantine, staging etc are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13448 * github.com:scylladb/scylladb: test/cql-pytest: test_tools.py: add tests for schema loading test/cql-pytest: add no_autocompaction_context docs: scylla-sstable.rst: remove accidentally added copy-pasta docs: scylla-sstable.rst: remove paragraph with schema limitations docs: scylla-sstable.rst: update schema section test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-04-14 16:46:26 +02:00
Botond Dénes	edc75f51ff	docs/dev/reader-concurrency-semaphore.md: expand on how the semaphore works Greatly expand on the details of how the semaphore works. Organize the content into thematic chapters to improve navigation. Improve formatting while at it.	2023-04-14 08:51:24 -04:00
Botond Dénes	943ae7fc69	reader_permit: give better names to active* states The names of these states have been the source of confusion ever since they were introduced. Give them names which better reflects their true meaning and gives less room for misinterpretation. The changes are: * active/unused -> active * active/used -> active/need_cpu * active/blocked -> active/await Hopefully the new names do a better job at conveying what these states really mean: * active - a regular admitted permit, which is active (as opposed to an inactive permit). * active/need_cpu - an active permit which was marked as needing CPU for the read to make progress. This permit prevents admission of new permits while it is in this state. * active/await - a former active/need_cpu permit, which has to wait on I/O or a remote shard. While in this state, it doesn't block the admission of new permits (pending other criteria such as resource availability).	2023-04-14 08:40:46 -04:00
Botond Dénes	cae79ef2c3	scripts/open-coredump.sh: allow bypassing the package downloading By allowing the user to plug a manually downloaded package. Consequently the "package_url" field of the build metadata is checked only if there is no user-provided extracted package. This allows working around builds for which the metadata server returns no "package_url", by allowing the user to locate and download the package themselves, providing it to the script by simply extracting it as $ARTIFACT_DIR/scylla.package.	2023-04-14 07:48:21 -04:00
Kamil Braun	200123624f	Merge 'test: reproducers for store mutation with schema change and host down' from Alecco Reproducers for https://github.com/scylladb/scylladb/issues/10770. (Already fixed in `15ebd59071`) Includes necessary improvements and fixes to `pylib`. Closes #12699 * github.com:scylladb/scylladb: test/pytest: reproducers for store mutation... test: pylib: Add a way to create cql connections with particular coordinators test/pylib: get gossiper alive endpoints test/topology: default replication factor 3 test/pylib: configurable replication factor	2023-04-14 13:47:51 +02:00
Botond Dénes	45fbdbe5f7	scripts/open-coredump.sh: check presence of mandatory field in build json object Mandatory fields missing in the build json object lead to obscure, unrelated error messages down the road. Avoid this by checking that all required fields are present and printing an error message if any is missing.	2023-04-14 07:33:46 -04:00
Botond Dénes	4df5ec4080	scripts/open-coredump.sh: more consistent error messaging Start all error messages with "error: ..." and log them to stderr.	2023-04-14 07:24:14 -04:00
Botond Dénes	38d6635afd	Update tools/java submodule * tools/java eddef023...c9be8583 (1): > README.md: drop cqlsh from README.md	2023-04-14 11:53:16 +03:00
Botond Dénes	7586491e1e	Update tools/jmx/ submodule * tools/jmx/ 57c16938...826da61d (4): > install.sh: do not create /usr/scylla/jmx in nonroot mode > install.sh: remove "echo done" > reloc-pkg: rename symlinks/scylla-jmx to select-java > install.sh: select java executable at runtime	2023-04-14 11:47:54 +03:00
Kefu Chai	c580e30ec7	cql3: expr: return more accurate error message for invalidated token() args before this change, we just print out the addresses of the elements in `column_defs`, if the arguments passed to `token()` function are not valid. this is not quite helpful from the user's perspective, as the user would be more interested in the values. also, we could print a more accurate error message for each kind of error. in this change, following Cassandra 4.1's behavior, three cases are identified, and corresponding errors are returned respectively: * duplicated partition keys * wrong order of partition key * missing keys where, if the partition key order is wrong, instead of printing the keys specified by user, the correct order is printed in the error message for helping user to correct the `token()` function. for better performance, the checks are performed only if the keys do not match, based on the assumption that the error handling path is not likely to be executed. tests are added accordingly. they were also tested with Cassandra 4.1.1. Fixes #13468 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13470	2023-04-14 11:46:18 +03:00
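The three error cases listed in the commit above can be sketched as follows. This is a minimal Python illustration under stated assumptions: `check_token_args` is a hypothetical name, the order of the checks is a guess, and the error texts are placeholders rather than the messages Scylla or Cassandra actually return.

```python
def check_token_args(partition_key: list[str], args: list[str]) -> None:
    """Validate the column names passed to token() against the partition key."""
    # Fast path: the detailed checks run only when the keys do not match.
    if args == partition_key:
        return
    if len(set(args)) != len(args):
        raise ValueError("duplicated partition key column in token()")
    if set(args) == set(partition_key):
        # Report the correct order rather than echoing the user's input.
        expected = ", ".join(partition_key)
        raise ValueError(
            f"wrong order of partition key columns, expected: ({expected})")
    raise ValueError("missing partition key column(s) in token()")


check_token_args(["a", "b"], ["a", "b"])  # matching arguments: no error
try:
    check_token_args(["a", "b"], ["b", "a"])
except ValueError as error:
    print(error)
```

Putting the exact-match test first mirrors the commit's stated assumption that the error-handling path is unlikely to be executed.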
Botond Dénes	4eb1bb460a	Update tools/python3 submodule * tools/python3 d2f57dd9...30b8fc21 (1): > create-relocatable-package.py: fix timestamp of executable files	2023-04-14 11:39:17 +03:00
Raphael S. Carvalho	47b2a0a1f6	data_directory: Describe storage options of a keyspace Description of storage options is important for S3, as one needs to know if underlying storage is either local or remote, and if the latter, details about it. This relies on server-side desc statement. $ ./bin/cqlsh.py -e "describe keyspace1;" CREATE KEYSPACE keyspace1 WITH replication = { ... } AND storage = {'type': 'S3', 'bucket': 'sstables', 'endpoint': '127.0.0.1:9000'} AND durable_writes = true; Fixes #13507. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13510	2023-04-14 11:34:35 +03:00
Benny Halevy	054667d5b6	storage_service: node_ops_ctl: send_to_all: print correct set of nodes in nodes_down error message nodes_failed are printed by mistake, instead of nodes_down Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13509	2023-04-14 11:31:20 +03:00
Botond Dénes	289ff821c9	Merge 'Remove global proxy usage from view builder's value_getter' from Pavel Emelyanov There's a legacy safety check in view code that needs to find a base table from its schema ID. To do it it calls for global storage proxy instance. The comment says that this code can be removed once computes_column feature is known by everyone. I'm not sure if that's the case, so here's more complicated yet less incompatible way to stop using global proxy instance. Closes #13504 * github.com:scylladb/scylladb: view: Remove unused view_ptr reference view: Carry backing-secondary-index bit via view builder view: Keep backing-seconday-index bool on value_getter table: Add const index manager sgetter	2023-04-14 11:23:23 +03:00
Kefu Chai	60ff230d54	create-relocatable-package.py: use f-string in `dcce0c96a9`, we should have used f-string for printing the return code of gzip subprocess. but the "f" prefix was missed. so, in this change, it is added. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13500	2023-04-14 08:29:33 +03:00
Raphael S. Carvalho	a47bac931c	Move TWCS option from table into TWCS itself enable_optimized_twcs_queries is specific to TWCS, therefore it belongs to TWCS, not replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13489	2023-04-14 08:28:16 +03:00
Anna Stuchlik	989a75b2f7	doc: update the metrics between 5.2 and 2023.1 Related: https://github.com/scylladb/scylla-enterprise/issues/2794 This commit adds the information about the metric changes in version 2023.1 compared to version 5.2. This commit is part of the 5.2-to-2023.1 upgrade guide and must be backported to branch-5.2. Closes #13506	2023-04-14 08:23:53 +03:00
Kefu Chai	85b21ba049	keys: consolidate the formatter for partition_keys since there are two places formatting `with_schema_wrapper`, it'd be desirable if we can consolidate them. so, in this change, the formatting code is extracted into a helper, so we only have a single place for formatting the `with_schema_wrapper`s. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-14 13:21:30 +08:00
Kefu Chai	3738fcbe05	keys: specialize fmt::formatter<partition_key> and friends this is a part of a series to migrate from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print following classes without the help of `operator<<`. - partition_key_view - partition_key - partition_key::with_schema_wrapper - key_with_schema - clustering_key_prefix - clustering_key_prefix::with_schema_wrapper the corresponding `operator<<()` are dropped in this change, as all their callers are now using fmtlib for formatting. the helper of `print_key()` is removed, as its only caller is `operator<<(std::ostream&, const clustering_key_prefix::with_schema_wrapper&)`. the reason why all these operators are replaced in one go is that we have a template function of `key_to_str()` in `db/large_data_handler.cc`. this template function is actually the caller of operator<< of `partition_key::with_schema_wrapper` and `clustering_key_prefix::with_schema_wrapper`. so, in order to drop either of these two operator<<, we need to remove both of them, so that we can switch over to `fmt::to_string()` in this template function. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-14 13:21:30 +08:00
Botond Dénes	1da02706dd	Merge 'Discard SSTable bloom filter on load-and-stream' from Raphael "Raph" Carvalho Load-and-stream reads the entire content from SSTables, therefore it can afford to discard the bloom filter that might otherwise consume a significant amount of memory. Bloom filters are only needed by compaction and other replica::table operations that might want to check the presence of keys in the SSTable files, like single-partition reads. It's not uncommon to see Data:Filter ratio of less than 100:1, meaning that for ~300G of data, filters will take ~3G. In addition to saving memory footprint, it also reduces operation time as load-and-stream no longer have to read, parse and build the filters from disk into memory. Closes #13486 * github.com:scylladb/scylladb: sstable_loader: Discard SSTable bloom filter on load-and-stream sstables: Allow SSTable loading to discard bloom filter sstables: Allow sstable_directory user to feed custom sstable open config sstables: Move sstable_open_info into open_info.hh	2023-04-14 06:18:54 +03:00
Alejo Sanchez	9597822214	test/pytest: reproducers for store mutation... with schema change and host down Reproducers for a failure during an lwt operation due to a missing column mapping in the schema history table. Issue #10770	2023-04-13 21:23:03 +02:00
Tomasz Grabiec	041ee3ffdd	test: pylib: Add a way to create cql connections with particular coordinators Usage: await manager.driver_connect(server=servers[0]) manager.cql.execute(f"...", execution_profile='whitelist')	2023-04-13 21:23:03 +02:00
Alejo Sanchez	62a945ccd5	test/pylib: get gossiper alive endpoints Helper to get list of gossiper alive endpoints from REST API. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-13 21:23:03 +02:00
Alejo Sanchez	08d754e13f	test/topology: default replication factor 3 For most tests there will be nodes down, increase replication factor to 3 to avoid having problems for partitions belonging to down nodes. Use replication factor 1 for raft upgrade tests.	2023-04-13 21:23:02 +02:00
Alejo Sanchez	3508a4e41e	test/pylib: configurable replication factor Make replication factor configurable for the RandomTables helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-04-13 21:23:02 +02:00
Benny Halevy	b71f229fc2	topology: node: update_node: do not override internal changed flag by state option Currently, opt_st overrides the internal `changed` flag by setting it with the opt_st changed status. Instead, it should use `|=` to keep it true if it is already so. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13502	2023-04-13 17:46:59 +02:00
Raphael S. Carvalho	fe6df3d270	sstable_loader: Discard SSTable bloom filter on load-and-stream Load-and-stream reads the entire content from SSTables, therefore it can afford to discard the bloom filter that might otherwise consume a significant amount of memory. Bloom filters are only needed by compaction and other replica::table operations that might want to check the presence of keys in the SSTable files, like single-partition reads. It's not uncommon to see Data:Filter ratio of less than 100:1, meaning that for ~300G of data, filters will take ~3G. In addition to saving memory footprint, it also reduces operation time as load-and-stream no longer have to read, parse and build the filters from disk into memory. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:22 -03:00
Raphael S. Carvalho	17261369ea	sstables: Allow SSTable loading to discard bloom filter If bloom filter is not loaded, it means that an always-present filter is used, which translates into the SSTable being opened on every single read. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:22 -03:00
Raphael S. Carvalho	1427a5ce98	sstables: Allow sstable_directory user to feed custom sstable open config This will be used by load-and-stream to load SSTables in its own customized way. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:34:16 -03:00
Raphael S. Carvalho	86516f4cef	sstables: Move sstable_open_info into open_info.hh So sstable_directory can access its definition without having to include sstables.hh. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-13 11:31:14 -03:00
Pavel Emelyanov	097cea11b2	view: Remove unused view_ptr reference After previous patch the value_getter::_view becomes unused and can be dropped. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-13 16:51:27 +03:00
Pavel Emelyanov	821c8b19a6	view: Carry backing-secondary-index bit via view builder When view builder constructs it populates itself with view updates. Later the updates may instantiate the value_getter-s which, in turn, would need to check if the view is backing secondary index. Good news is that when view builder constructs it has all the information at hand needed to evaluate this "backing" bit. It's then propagated down to value_getter via corresponding view_updates. The getter's _view field becomes unused after this change and is (void)-ed to make this patch compile. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-13 16:48:36 +03:00
Pavel Emelyanov	e8b5022343	view: Keep backing-seconday-index bool on value_getter The getter needs to check if the view is backing a secondary index. Currently it's done inside the handle_computed_column() method, but it's more convenient if this bit is known during construction, so move it there. There are no places that can change this property between the time view_getter is created and the time the method in question is called. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-13 16:45:59 +03:00
Pavel Emelyanov	0d9da46428	table: Add const index manager sgetter To be used by next patch that will call this helper inside non-mutable lambda Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-13 16:45:16 +03:00
Botond Dénes	bd57471e54	reader_concurrency_semaphore: don't evict inactive readers needlessly Inactive readers should only be evicted to free up resources for waiting readers. Evicting them when waiters are not admitted for any other reason than resources is wasteful and leads to extra load later on when these evicted readers have to be recreated and requeued. This patch changes the logic on both the registering path and the admission path to not evict inactive readers unless there are readers actually waiting on resources. A unit-test is also added, reproducing the overly-aggressive eviction and checking that it doesn't happen anymore. Fixes: #11803 Closes #13286	2023-04-13 15:20:18 +03:00
Pavel Emelyanov	b1501d4261	s3/client: Don't use designated initialization of sys stat struct It makes the compiler complain about mis-ordered initialization of st_nlink vs st_mode on different arches. Current code (st_nlink before st_mode) compiled fine on x86, but fails on ARM which wants st_mode to come before st_nlink. Changing the order would, apparently, break x86 build with similar message. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13499	2023-04-13 15:13:56 +03:00
Kefu Chai	87170bf07a	build: cmake: add more tests this change should add the remaining tests under boost/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13494	2023-04-13 14:57:00 +03:00
Botond Dénes	e103ef3bcb	Update seastar submodule * seastar 1204efbc...ed7a0f54 (46): > gate: s/intgernal/internal/ > reactor: set reactor::_stopping to true on all shards > condition-variable: replace the coroutine wakeup task with a promise > tutorial: explain the buffer_size_t param of generator coroutine > log: call log_level_map explicitly in constructor > future: de-variadicate make_ready_future() and similar helpers > timer-set,scollectd: remove unnecessary ";" > util/conversion: remove inclusion guards > foreign_ptr: destroy: use run_in_background > abort_source, abortable_fifo: use is_nothrow_invocable_r_v<> > alien: add type constraint for alien::run_on and alien::submit_to > alien: add noexcept specifier for lambda passed to run_on() > test: alien_test: test alien::run_on() also > test: alien_test: throw if unexpected things happens > future: make API level 6 mandatory > api-level: update IDE fallback > core/on_internal_error: always log error with backtrace > future: make API level 5 mandatory > websocket: fix frame parsing. > websocket: fix frame assembling. > when_all: drop code for API_LEVEL < 4 > future: drop internal call_then_impl > future: when_all_succeed(): make API level 4 mandatory > reactor: trade comment for type constraints > sstring: s/is_invocable_r/is_invocable_r_v/ > doc: compatibility: document API levels 5 and 6 > demos: file_demo: pass a string_view to_open_file_dma() > TLS: Add issuer/subject info to verification error message > test: fstream_test: drop unnecessary API_LEVEL check > manual_clock: advance: use run_in_background to expire_timers > reactor: add run_in_backround and close > websocket: shutdown input first. > websocket: use gate to guard background tasks. > websocket: remove trailing spaces. > websocket_demo: ignore sleep_aborted exception. > websocket_demo: fix coredump. 
> fstream: drop API level 2 (make_file_output_stream() returning non-future) > core/sstring: do not use ostream_formatter > metrics: use fmt::to_string() when creating a label > backtrace: fix size calculation in dl_iterate_phdr > Downgrade expected stall detector warning to info > fix: Add missing inline code blocks > spawn_test: fix /bin/cat stuck in reading input. > reactor: pass fd opened in blocking mode to spawned process > reactor: skip sigaction if handler has been registered before. > reactor: allow registering handler multiple times for a signal.	2023-04-13 14:28:30 +03:00
Kefu Chai	29ca0009a2	dist/debian: do not Depend on ${shlibs:Depends} the substvar of `${shlibs:Depends}` is set by dh_shlibdeps, which inspects the ELF images being packaged to figure out the shared library dependencies for packages. but since `f3c3b9183c`, we just override the `override_dh_shlibdeps` target in debian/rules with no-op. as we take care of the shared library dependencies by vendoring the runtime dependencies by ourselves using the relocatable package. so this variable is never set. that's why `dpkg-gencontrol` complains when processing `debian/control` and trying to materialize the substvars. in this change, the occurrences of `${shlibs:Depends}` are removed to silence the warnings from `dpkg-gencontrol`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13457	2023-04-13 08:34:05 +03:00
Raphael S. Carvalho	9760149e8d	compaction: Don't bump compaction shares during major execution Commit `49892a0`, back in 2018, bumps the compaction shares by 200 to guarantee a minimum base line. However, after commit `e3f561d`, major compaction runs in maintenance group meaning that bumping shares became completely irrelevant and only causes regular compaction to be unnecessarily more aggressive. Fixes #13487. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13488	2023-04-13 08:20:25 +03:00
Botond Dénes	50ee4033a9	Update tools/jmx submodule * tools/jmx 602329c9...57c16938 (1): > install.sh: replace tab with spaces	2023-04-12 13:28:23 +03:00
Botond Dénes	5d0c0ae0c4	Merge 'token_metadata: use topology nodes for endpoint_to_host_id map' from Benny Halevy Currently, token_metadata_impl maintains a "shadow" endpoint to host_id map on top of the maps in topology. This series first reimplements the functions that currently use this map to use topology instead. Then the important users of `get_endpoint_to_host_id_map_for_reading`, node_ops_ctl and view_builder, are converted to use a new `topology::for_each_node` function to process all nodes in topology directly, without going through `get_endpoint_to_host_id_map_for_reading`. Closes #13476 * github.com:scylladb/scylladb: view_builder: view_build_statuses: use topology::for_each_node storage_service: node_ops_ctl: refresh_sync_nodes: use topology::for_each_node topology: add for_each_node token_metadata: get endpoint to node map from topology	2023-04-12 10:33:02 +03:00
Botond Dénes	1440efa042	test/cql-pytest: test_tools.py: add tests for schema loading A set of comprehensive tests covering all the supported ways of providing the schema to scylla-sstable, either explicitly or implicitly (auto-detect).	2023-04-12 03:14:43 -04:00
Botond Dénes	76a7d3448f	test/cql-pytest: add no_autocompaction_context	2023-04-12 03:14:43 -04:00
Botond Dénes	b7a4304b69	docs: scylla-sstable.rst: remove accidentally added copy-pasta	2023-04-12 03:14:43 -04:00
Botond Dénes	1673f10f7a	docs: scylla-sstable.rst: remove paragraph with schema limitations The above file contained a paragraph explaining the limitations of `scylla-sstable.rst` w.r.t. automatically finding the schema. This no longer applies so remove it.	2023-04-12 03:14:43 -04:00
Botond Dénes	9f9beef8fd	docs: scylla-sstable.rst: update schema section With the recent changes to the ways schema can be provided to the tool.	2023-04-12 03:14:43 -04:00
Botond Dénes	222f624757	test/cql-pytest: nodetool.py: add flush_keyspace() It would have been better if `flush()` could have been called with a keyspace and optional table param, but changing it now is too much churn, so we add a dedicated method to flush a keyspace instead.	2023-04-12 03:14:43 -04:00
Botond Dénes	ffec1e5415	tools/scylla-sstable: reform schema loading mechanism So far, schema had to be provided via a schema.cql file, a file which contains the CQL definition of the table. This is flexible but annoying at the same time. Many times sstables the tool operates on are located in their table directory in a scylla data directory, where the schema tables are also available. To mitigate this, an alternative method to load the schema from memory was added which works for system tables. In this commit we extend this to work for all kind of tables: by auto-detecting where the scylla data directory is, and loading the schema tables from disk.	2023-04-12 03:14:43 -04:00
Botond Dénes	fd4c2f2077	tools/schema_loader: add load_schema_from_schema_tables() Allows loading the schema for the designated keyspace and table, from the system table sstables located on disk. The sstable files are opened read-only.	2023-04-12 03:14:43 -04:00
Botond Dénes	63b266a988	db/schema_tables: expose types schema	2023-04-12 02:43:53 -04:00
Botond Dénes	0c51f72ad6	Merge 'utils, mutation: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `tombstone` and `shadowable_tombstone` without the help of fmt::ostream. and their `operator<<(ostream,..)` are dropped, as there are no users of them anymore. Refs #13245 Closes #13474 * github.com:scylladb/scylladb: mutation: specialize fmt::formatter<tombstone> and fmt::formatter<shadowable_tombstone> utils: specialize fmt::formatter<optional<>>	2023-04-12 09:32:56 +03:00
Kefu Chai	ff202723c6	utils: big_decimal: specialize fmt::formatter<big_decimal> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `big_decimal` without the help of `operator<<`. this operator is dropped in this change, as all its callers now use fmtlib for formatting. we might need to use fmtlib to implement `big_decimal::to_string()`, and use `fmt::to_string()` instead, but let's leave it for a follow-up change. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13479	2023-04-12 09:20:50 +03:00
Botond Dénes	f82287a9af	Update tools/jmx/ submodule * tools/jmx/ b7ae52bc...602329c9 (1): > metrics: EstimatedHistogram::getValues() returns bucketOffsets	2023-04-12 09:17:57 +03:00
Botond Dénes	525b21042f	Merge 'Rewrite sstables keyspace compaction task' from Aleksandra Martyniuk Task manager task implementations of classes that cover rewrite sstables keyspace compaction, which can be started through the /storage_service/keyspace_compaction/ api. Top level task covers the whole compaction and creates child tasks on each shard. Closes #12714 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to test rewrite sstables compaction compaction: create task manager's task for rewrite sstables keyspace compaction on one shard compaction: create task manager's task for rewrite sstables keyspace compaction compaction: create rewrite_sstables_compaction_task_impl	2023-04-12 08:38:59 +03:00
Aleksandra Martyniuk	25cfffc3ae	compaction: rename local_offstrategy_keyspace_compaction_task_impl to shard_offstrategy_keyspace_compaction_task_impl Closes #13475	2023-04-12 08:38:25 +03:00
Kefu Chai	1cb95b8cff	mutation: specialize fmt::formatter<tombstone> and fmt::formatter<shadowable_tombstone> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `tombstone` and `shadowable_tombstone` without the help of `operator<<`. in this change, only `operator<<(ostream&, const shadowable_tombstone&)` is dropped, and all its callers are now using fmtlib for formatting the instances of `shadowable_tombstone` now. `operator<<(ostream&, const tombstone&)` is preserved. as it is still used by Boost::test for printing the operands in case the comparing tests fail. please note, before this change we were using a concrete string for indent. after this change, some of the places are changed to using fmtlib for indent. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-12 10:57:03 +08:00
Kefu Chai	c980bd54ad	utils: specialize fmt::formatter<optional<>> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `optional<T>` without the help of `operator<<()`. this change also enables us to ditch more `operator<<()`s in future. as we are relying on `operator<<(ostream&, const optional<T>&)` for printing instances of `optional<T>`, and `operator<<(ostream&, const optional<T>&)` in turn uses the `operator<<(ostream&, const T&)`. so, the new specialization of `fmt::formatter<optional<>>` will remove yet another caller of these operators. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-12 10:57:03 +08:00
Benny Halevy	535b71eba3	view_builder: view_build_statuses: use topology::for_each_node Instead of tmptr->get_endpoint_to_host_id_map_for_reading. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-11 18:14:51 +03:00
Benny Halevy	d89fb02d24	storage_service: node_ops_ctl: refresh_sync_nodes: use topology::for_each_node Instead of tmptr->get_endpoint_to_host_id_map_for_reading. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-11 18:14:47 +03:00
Kefu Chai	59579d5876	utils: fragment_range: specialize fmt::formatter<FragmentedView> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print classes that fulfill the requirements of the `FragmentedView` concept without the help of the template function `to_hex()`. this function is dropped in this change, as all its callers now use fmtlib for formatting. the helper `fragment_to_hex()` is dropped as well, as its only caller is `to_hex()`. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13471	2023-04-11 16:09:38 +03:00
Benny Halevy	7b76369ffc	topology: add for_each_node To eventually replace token_metadata::get_endpoint_to_host_id_map_for_reading Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-11 15:55:39 +03:00
Benny Halevy	e635aa30d6	token_metadata: get endpoint to node map from topology Don't maintain a "shadow" endpoint_to_host_id_map in token_metadata_impl. Instead, get the nodes_by_endpoint map from topology and use it to build the endpoint_to_host_id_map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-11 15:48:30 +03:00
Botond Dénes	f1bbf705f9	Merge 'Cleanup sstables in resharding and other compaction types' from Benny Halevy This series extends sstable cleanup to resharding and other (offstrategy, major, and regular) compaction types so to: * cleanup uploaded sstables (#11933) * cleanup staging sstables after they are moved back to the main directory and become eligible for compaction (#9559) When perform_cleanup is called, all sstables are scanned, and those that require cleanup are marked as such, and are added for tracking to table_state::cleanup_sstable_set. They are removed from that set once released by compaction. Along with that sstables set, we keep the owned_ranges_ptr used by cleanup in the table_state to allow other compaction types (offstrategy, major, or regular) to cleanup those sstables that are marked as require_cleanup and that were skipped by cleanup compaction for either being in the maintenance set (requiring offstrategy compaction) or in staging. Resharding is using a more straightforward mechanism of passing the owned token ranges when resharding uploaded sstables and using it to detect sstable that require cleanup, now done as piggybacked on resharding compaction. 
Closes #12422 * github.com:scylladb/scylladb: table: discard_sstables: update_sstable_cleanup_state when deleting sstables compaction_manager: compact_sstables: retrieve owned ranges if required sstables: add a printer for shared_sstable compaction_manager: keep owned_ranges_ptr in compaction_state compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup compaction: refactor compaction_state out of compaction_manager compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state compaction_manager: refactor get_candidates compaction_manager: get_candidates: mark as const table, compaction_manager: add requires_cleanup sstable_set: add for_each_sstable_until distributed_loader: reshard: update sstable cleanup state table, compaction_manager: add update_sstable_cleanup_state compaction_manager: needs_cleanup: delete unused schema param compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges distributed_loader: reshard: consider sstables for cleanup distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard distributed_loader: reshard: add optional owned_ranges_ptr param distributed_loader: reshard: get a ref to table_state distributed_loader: reshard: capture creator by ref distributed_loader: reshard: reserve num_jobs buckets compaction: move owned ranges filtering to base class compaction: move owned_ranges into descriptor	2023-04-11 14:52:29 +03:00
Botond Dénes	38c98b370f	Update tools/jmx/ submodule * tools/jmx/ 48e16998...b7ae52bc (1): > install.sh: do not fail if jre-11 is not installed	2023-04-11 14:51:31 +03:00
Kefu Chai	dcce0c96a9	create-relocatable-package.py: error out if pigz fails before this change, we don't error out even if pigz fails. but there is chance that pigz fails to create the gzip'ed relocatable tarball either due to environmental issues or some other problems, and we are not aware of this until packaging scripts like `reloc/build_rpm.sh` tries to ungzip this corrupted gzip file. in this change, if pigz's status code is not 0, the status code is printed, and create-relocatable-package.py will return 1. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13459	2023-04-11 14:29:25 +03:00
Aleksandra Martyniuk	e170fa1c99	test: extend test_compaction_task.py to test rewrite sstables compaction	2023-04-11 13:07:22 +02:00
Aleksandra Martyniuk	a93f044efa	compaction: create task manager's task for rewrite sstables keyspace compaction on one shard Implementation of task_manager's task that covers rewrite sstables keyspace compaction on one shard.	2023-04-11 13:07:17 +02:00
Botond Dénes	a8e59d9fb2	Merge 'Metrics relabel from file' from Amnon Heiman This series adds an option to read the relabel config from file. Most of Scylla's metrics are reported per-shard, sometimes they are also reported per scheduling groups or per tables. With modern hardware, this can quickly grow to a large number of metrics that overload Scylla and the collecting server. One of the main issues around metrics reduction is that many of the metrics are only helpful in certain situations. For example, Scylla monitoring only looks at a subset of the metrics. So in large deployments it would be helpful to scrape only those. An option to do that, would be to mark all dashboards related metrics with a label value, and then Prometheus will request only metrics with that label value. There are two main limitations to scraping by label values: 1. some of the metrics we want to report are in seastar, so we'll need to label them somehow (we cannot just add random labels to seastar metrics) 2. things change, new metrics are introduced and we may want them, it's not practical to re-compile and wait for a new release whenever we want to change a label just for monitoring. It will be best to have the option to add metrics freely and choose at runtime what to report. This series makes use of Seastar API to perform metrics manipulation dynamically. It includes adding, removing, and changing labels, enabling and disabling metrics, and enabling and disabling the skip_when_empty option. After this series the configuration could be used with: ```--relabel-config-file conf.yaml``` The general logic and format follows Prometheus metrics_relabel_config configuration.
Where the configuration file looks like: ``` $ cat conf.yaml relabel_configs: - source_labels: [shard] action: drop target_label: shard regex: (2) - source_labels: [shard] action: replace target_label: level replacement: $1 regex: (.3) ``` Closes #12687 github.com:scylladb/scylladb: main: Load metrics relabel config from a file if it exists Add relabel from file support.	2023-04-11 12:47:09 +03:00
Aleksandra Martyniuk	c4098df4ec	compaction: create task manager's task for rewrite sstables keyspace compaction Implementation of task_manager's task covering rewrite sstables keyspace compaction that can be started through storage_service api.	2023-04-11 11:04:21 +02:00
Aleksandra Martyniuk	814254adfd	compaction: create rewrite_sstables_compaction_task_impl rewrite_sstables_compaction_task_impl serves as a base class of all concrete rewrite sstables compaction task classes.	2023-04-11 11:03:09 +02:00
Botond Dénes	dba1d36aa6	Merge 'alternator: fix isolation of concurrent modifications to tags' from Nadav Har'El Alternator's implementation of TagResource, UntagResource and UpdateTimeToLive (the latter uses tags to store the TTL configuration) was unsafe for concurrent modifications - some of these modifications may be lost. This short series fixes the bug, and also adds (in the last patch) a test that reproduces the bug and verifies that it's fixed. The cause of the incorrect isolation was that we separately read the old tags and wrote the modified tags. In this series we introduce a new function, `modify_tags()` which can do both under one lock, so concurrent tag operations are serialized and therefore isolated as expected. Fixes #6389. Closes #13150 * github.com:scylladb/scylladb: test/alternator: test concurrent TagResource / UntagResource db/tags: drop unsafe update_tags() utility function alternator: isolate concurrent modification to tags db/tags: add safe modify_tags() utility functions migration_manager: expose access to storage_proxy	2023-04-11 11:17:23 +03:00
Anna Stuchlik	2921059ebb	doc: add a disclaimer about unsupported upgrade Fixes https://github.com/scylladb/scylla-enterprise/issues/2805 This commit adds the disclaimer that an upgrade by replacing the cluster nodes with nodes with a different release is not supported. Closes #13445	2023-04-11 10:47:39 +03:00
Kefu Chai	86b66a9875	build: cmake: drop test_table.CC this change mirrors the corresponding change in `configure.py` in `4b5b6a9010` . Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13461	2023-04-11 09:42:58 +03:00
Nadav Har'El	79114c5030	cql-pytest: translate Cassandra's tests for DELETE operations This is a translation of Cassandra's CQL unit test source file validation/operations/DeleteTest.java into our cql-pytest framework. There are 51 tests, and they did not reproduce any previously-unknown bug, but did provide additional reproducers for three known issues: Refs #4244 Add support for mixing token, multi- and single-column restrictions Refs #12474 DELETE prints misleading error message suggesting ALLOW FILTERING would work Refs #13250 one-element multi-column restriction should be handled like a single-column restriction Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13436	2023-04-11 09:10:11 +03:00
Botond Dénes	355583066e	Merge 'Reduce memory footprint of SSTable index summary' from Raphael "Raph" Carvalho SSTable summary is one of the components fully loaded into memory that may have a significant footprint. This series reduces the summary footprint by reducing the amount of token information that we need to keep in memory for each summary entry. Of course, the benefit of this size optimization is proportional to the amount of summary entries, which in turn is proportional to the number of partitions in a SSTable. Therefore we can say that this optimization will benefit the most tables which have tons of small-sized partitions, which will result in big summaries. Results: ``` BEFORE [1000000 pkeys] data size: 4035888890, summary -> memory footprint: 5843232, entries: 88158 [10000000 pkeys] data size: 40368888890, summary -> memory footprint: 55787128, entries: 844925 AFTER [1000000 pkeys] data size: 4035888890, summary -> memory footprint: 4351536, entries: 88158 [10000000 pkeys] data size: 40368888890, summary -> memory footprint: 42211984, entries: 844925 ``` That shows a 25% reduction in footprint, for both 1 and 10 million pkeys. Closes #13447 * github.com:scylladb/scylladb: sstables: Store raw token into summary entries sstables: Don't store token data into summary's memory pool	2023-04-11 08:29:11 +03:00
Botond Dénes	05b381bfa2	Merge 'Simple S3 storage for sstables' from Pavel Emelyanov The PR adds sstables storage backend that keeps all component files as S3 objects and system.sstables_registry ownership table that keeps track of what sstables objects belong to local node and their names. When a keyspace is configured with 'STORAGE = { 'type': 'S3' }' the respective class table object eventually gets the storage_options instance pointing to the target S3 endpoint and bucket. All the sstables created for that table attach the S3 storage implementation that maintains components' files as S3 objects. Writing to and reading from components is handled by the S3 client facilities from utils/. Changing the sstable state, which is -- moving between normal, staging and quarantine states -- is not yet implemented, but would eventually happen by updating entries in the sstables registry. To keep track of which node owns which objects, to provide bucket-wide uniqueness of object names and to maintain sstable state the storage driver keeps records in the system.sstables_registry ownership table. The table maps sstable location and generation to the object format, version, status-state () and (!) unique identifier (some time soon this identifier is supposed to be replaced with UUID sstables generations). The component object name is thus s3://bucket/uuid/component_basename. The registry is also used on boot. The distributed loader picks up sstables from all the tables found in schema and for S3-backed keyspaces it lists entries in the registry to a) identify those and b) get their unique S3-side identifiers to open by name. () About sstable's status and state. The state field is the part of today's sstable path on disk -- staging, quarantine, normal (root table data dir), etc. Since S3 doesn't have the renaming facility, moving sstable between those states is only possible by updating the entry in the registry. 
This is not yet implemented in this set (#13017). The status field tracks the sstable's transition through its creation-deletion. It first starts with 'creating' status which corresponds to today's TemporaryTOC file. After being created and written to, the sstable moves into 'sealed' state which corresponds to today's normal sstable with the TOC file. To delete sstable atomically it first moves into 'removing' state which is equivalent to being in the deletion-log for the on-disk sstable. Once removed from the bucket, the entry is removed from the registry. To play with: 1. Start minio (installed by install-dependencies.sh) ``` export MINIO_ROOT_USER=${root_user} export MINIO_ROOT_PASSWORD=${root_pass} mkdir -p ${root_directory} minio server ${root_directory} ``` 2. Configure minio CLI, create anonymous bucket ``` mc config host rm local mc config host add local http://127.0.0.1:9000 ${root_user} ${root_pass} mc mb local/sstables mc anonymous set public local/sstables ``` 3. Start Scylla with object-storage feature enabled ``` scylla ... --experimental-features=keyspace-storage-options --workdir ${as_usual}``` 4. Create KS with S3 storage ``` create keyspace ... storage = { 'type': 'S3', 'endpoint': '127.0.0.1:9000', 'bucket': 'sstables' };``` The S3 client has a logger named "s3"; it's useful to turn it on with `trace` verbosity. 
Closes #12523 * github.com:scylladb/scylladb: test: Add object-storage test distributed_loader: Print storage type when populating sstable_directory: Add ownership table components lister sstable_directory: Make components_lister and API sstable_directory: Create components lister based on storage options sstables: Add S3 storage implementation system_keyspace: Add ownership table system_keyspace: Plug to user sstables manager too sstable: Make storage instance based on storage options sstable_directory: Keep storage_options aboard sstable: Virtualize the helper that gets on-disk stats for sstable sstable, storage: Virtualize data sink making for small components sstable, storage: Virtualize data sink making for Data and Index sstable/writer: Shuffle writer::init_file_writers() sstable: Make storage an API utils: Add S3 readable file impl for random reads utils: Add S3 data sink for multipart upload utils: Add S3 client with basic ops cql-pytest: Add option to run scylla over stable directory test.py: Equip it with minio server sstables: Detach write_toc() helper	2023-04-11 08:17:25 +03:00
Benny Halevy	96660b2ef7	table: discard_sstables: update_sstable_cleanup_state when deleting sstables We need to remove the deleted sstables from update_sstable_cleanup_state otherwise their data and index files will remain opened and their storage space won't be reclaimed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:37:56 +03:00
Benny Halevy	4db961ecac	compaction_manager: compact_sstables: retrieve owned ranges if required If any of the sstables to-be-compacted requires cleanup, retrieve the owned_ranges_ptr from the table_state. With that, staging sstables will eventually be cleaned up via regular compaction. Refs #9559 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:36:10 +03:00
Benny Halevy	9105f9800c	sstables: add a printer for shared_sstable Refactor the printing logic in compaction::formatted_sstables_list out to sstables::to_string(const shared_sstable&, bool include_origin) and operator<<(const shared_sstable) on top of it. So that we can easily print std::vector<shared_sstable> from compaction_manager in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:31:35 +03:00
Benny Halevy	d87925d9fc	compaction_manager: keep owned_ranges_ptr in compaction_state When perform_cleanup adds sstables to sstables_requiring_cleanup, also save the owned_ranges_ptr in the compaction_state so it could be used by other compaction types like regular, reshape, or major compaction. When the exhausted sstables are released, check if sstables_requiring_cleanup is empty, and if it is, clear also the owned_ranges_ptr. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:30:53 +03:00
Benny Halevy	c2bf0e0b72	compaction_manager: perform_cleanup: keep sstables in compaction_state::sstables_requiring_cleanup As a first step towards parallel cleanup by (regular) compaction and cleanup compaction, filter all sstables in perform_cleanup and keep the set of sstables in the compaction_state. Erase from that set when the sstables are unregistered from compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:30:39 +03:00
Benny Halevy	b3192b9f16	compaction: refactor compaction_state out of compaction_manager To use it both from compaction_manager and compaction_descriptor in a following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:28:16 +03:00
Benny Halevy	73280c0a15	compaction: refactor compaction_fwd.hh out of compaction_descriptor.hh So it can be used in the next patch that will refactor compaction_state out of class compaction_manager. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:19:04 +03:00
Benny Halevy	690697961c	compaction_manager: compacting_sstable_registration: keep a ref to the compaction_state To be used for managing sstables requiring cleanup. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:18:02 +03:00
Benny Halevy	cac60a09ac	compaction_manager: refactor get_candidates Allow getting candidates for compaction from an arbitrary range of sstable, not only the in_strategy_sstables. To be used by perform_cleanup to mark all sstables that require cleanup, even if they can't be compacted at this time. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:16:57 +03:00
Benny Halevy	bbfe839a73	compaction_manager: get_candidates: mark as const Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:16:12 +03:00
Benny Halevy	6ebafe74b9	table, compaction_manager: add requires_cleanup Returns true iff any of the sstables in the set requires cleanup. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:14:36 +03:00
Benny Halevy	d765686491	sstable_set: add for_each_sstable_until Calls a function on all sstables or until the function returns stop_iteration::yes. Change the sstable_set_impl interface to expose only for_each_sstable_until and let sstable_set::for_each_sstable use that, wrapping the void-returning function passed to it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:11:58 +03:00
Benny Halevy	db7fa9f3be	distributed_loader: reshard: update sstable cleanup state Since the sstables are loaded from foreign open info we should mark them for cleanup if needed (and owned_ranges_ptr is provided). This will allow a later patch to enable filtering for cleanup only for sstable sets containing sstables that require cleanup. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:11:00 +03:00
Benny Halevy	d0690b64c1	table, compaction_manager: add update_sstable_cleanup_state update_sstable_cleanup_state calls needs_cleanup and inserts (or erases) the sstable into the respective compaction_state.sstables_requiring_cleanup set. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:10:55 +03:00
Benny Halevy	1baca96de1	compaction_manager: needs_cleanup: delete unused schema param It isn't needed. The sstable already has a schema. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:03:53 +03:00
Benny Halevy	ac9f8486ba	compaction_manager: perform_cleanup: disallow empty sorted_owened_ranges I'm not sure why this was originally supported, maybe for upgrade sstables where we may want to rewrite the sstables without filtering any tokens, but perform_sstable_upgrade is now following a different code path and uses `rewrite_sstables` directly, without piggybacking on cleanup. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:03:03 +03:00
Benny Halevy	ecbd112979	distributed_loader: reshard: consider sstables for cleanup When called from `process_upload_dir` we pass a list of owned tokens to `reshard`. When they are available, run resharding, with implicit cleanup, also on unshared sstables that need cleanup. Fixes #11933 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 23:01:38 +03:00
Benny Halevy	3ccbb28f2a	distributed_loader: process_upload_dir: pass owned_ranges_ptr to reshard To facilitate implicit cleanup of sstables via resharding. Refs #11933 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:59:38 +03:00
Benny Halevy	aa4b18f8fb	distributed_loader: reshard: add optional owned_ranges_ptr param For passing owned_ranges_ptr from distributed_loader::process_upload_dir. Refs #11933 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:57:41 +03:00
Benny Halevy	f540af930b	distributed_loader: reshard: get a ref to table_state We don't reference the table itself, only as_table_state. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:57:11 +03:00
Benny Halevy	c6b7fcc26f	distributed_loader: reshard: capture creator by ref Now that reshard is a coroutine, creator is preserved in the coroutine frame until completion so we can simply capture it by reference now. Note that previously it was moved into the compaction descriptor, but the capture wasn't mutable so it was copied anyhow and this change doesn't introduce a regression. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:55:35 +03:00
Benny Halevy	7c9d16ff96	distributed_loader: reshard: reserve num_jobs buckets We know in advance how many buckets we need. We still need to emplace the first bucket upfront. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:55:35 +03:00
Benny Halevy	0c6ce5af74	compaction: move owned ranges filtering to base class Move the token filtering logic down from cleanup_compaction to regular_compaction and class compaction so it can be reused by other compaction types. Create a _owned_ranges_checker in class compaction when _owned_ranges is engaged, and use it in compaction::setup to filter partitions based on the owned ranges. Ref scylladb/scylladb#12998 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:55:09 +03:00
Benny Halevy	09df04c919	compaction: move owned_ranges into descriptor Move the owned_ranges_ptr, currently used only by cleanup and upgrade compactions, to the generic compaction descriptor so we apply cleanup in other compaction types. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-10 22:52:12 +03:00
Pavel Emelyanov	fd817e199c	Merge 'auth: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `auth::auth_authentication_options` and `auth::resource_kind` without the help of fmt::ostream. and their `operator<<(ostream,..)` are dropped, as there are no users of them anymore. Refs #13245 Closes #13460 * github.com:scylladb/scylladb: auth: remove unused operator<<(.., resource_kind) auth: specialize fmt::formatter<resource_kind> auth: remove unused operator<<(.., authentication_option) auth: specialize fmt::formatter<authentication_option>	2023-04-10 17:05:09 +03:00
Pavel Emelyanov	21ef5bcc22	test: Add object-storage test The test - starts scylla (over stable directory) - creates S3-backed keyspace (minio is up and running by test.py already) - creates table in that keyspace and populates it with several rows - flushes the keyspace to make sstables hit the storage - checks that the ownership table is populated properly - restarts scylla - makes sure old entries exist Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	8b9e9671de	distributed_loader: Print storage type when populating On boot it's very useful to know which storage a table comes from, so add the respective info to existing log messages. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	f04c6cdf9a	sstable_directory: Add ownership table components lister When sstables are stored on object storage, they are "registered" in the system.sstables_registry ownership table. The sstable_directory is supposed to list sstables from this table, so here's the respective components lister. The lister is created by sstables_manager; by the time it's requested, the system keyspace is already plugged. The lister only handles "sealed" sstables. Dangling ones are still ignored, this is to be fixed later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	8bd9f7accf	sstable_directory: Make components_lister and API Now the lister is filesystem-specific. There will soon come another one for S3, so the sstable_directory should be prepared for that by making the lister an abstract class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	5f7f0117e1	sstable_directory: Create components lister based on storage options The directory's lister is storage-specific and should be created differently for different storage options. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	950ee0efe8	sstables: Add S3 storage implementation The driver puts all components into s3://bucket/uuid/component_name objects where 'bucket' is the keyspace options configuration parameter, and the 'uuid' is the value obtained from the ownership table. E.g. s3://test_bucket/d0a743b0-ad38-11ed-85b5-39b6b0998182/Data.db The life-time is straightforward. Until sealed, the sstable has 'creating' status in the table, then it's updated to be 'sealed'. Prior to removing the objects the status is set to 'deleting' thus allowing the distributed loader to pick up the dangling objects and re-load (not yet implemented). Finally, the entry is deleted from the table. It needs the PR #12648 not to generate empty ks/cf directories on the local filesystem. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:29 +03:00
Pavel Emelyanov	08e9046d07	system_keyspace: Add ownership table The schema is CREATE TABLE system.sstables ( location text, generation bigint, format text, status text, uuid uuid, version text, PRIMARY KEY (location, generation) ) A sample entry looks like: location \| generation \| format \| status \| uuid \| version ---------------------------------------------------------------------+------------+--------+--------+--------------------------------------+--------- /data/object_storage_ks/test_table-d096a1e0ad3811ed85b539b6b0998182 \| 2 \| big \| sealed \| d0a743b0-ad38-11ed-85b5-39b6b0998182 \| me The uuid field points to the "folder" on the storage where the sstable components are. Like this: s3 `- test_bucket `- f7548f00-a64d-11ed-865a-0c1fbc116bb3 `- Data.db - Index.db - Filter.db - ... It's not very nice that the whole /var/lib/... path is in fact used as location, it needs the PR #12707 to fix this place. Also, the "status" part is not yet fully functional, it only supports three options: - creating -- the same as TemporaryTOC file exists on disk - sealed -- default state - deleting -- the analogy for the deletion log on disk The latter needs support from the distributed_loader, which is not yet there. In fact, distributed_loader also needs to be patched to actually select entries from this table on load. Also it needs the mentioned PR #12707 to support staging and quarantine sstables. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:44:28 +03:00
Pavel Emelyanov	e34b86dd61	system_keyspace: Plug to user sstables manager too The sharded<sys_ks> instances are plugged to large data handler and compaction manager to maintain the circular dependency between these components via the interposing database instance. Do the same for user sstables manager, because S3 driver will need to update the local ownership table. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	4bb885b759	sstable: Make storage instance based on storage options This patch adds storage options lw-ptr to sstables_manager::make_sstable and makes the storage instance creation depend on the options. For local it just creates the filesystem storage instance, for S3 -- throws, but next patch will fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	df026e2cb5	sstable_directory: Keep storage_options aboard The class in question will need to know the table's storage it will need to list sstables from. For that -- construct it with the storage options taken from table. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	c060f3a52f	sstable: Virtualize the helper that gets on-disk stats for sstable When opening an existing (or just sealed) sstable its components are stat()-ed to get the on-disk sizes and a bit more. Stat-ing a file by name on S3 is not (yet) implemented and doing it file-by-file can be quite terrible. So add a method to return sstable stats in a storage-specific manner. For S3 this can be implemented by getting the info from the ownership table (in the future). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	0ddd27cb29	sstable, storage: Virtualize data sink making for small components This time sstable needs to create a data sink for a component without having the file at hand. That's pretty much the same as in previous patch, but the method declaration differs slightly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	ac1e56c9d9	sstable, storage: Virtualize data sink making for Data and Index Add the make_data_or_index_sink() virtual method and its implementation for filesystem_storage. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	1d4fcce5dd	sstable/writer: Shuffle writer::init_file_writers() The method needs to create two data sinks -- for Data and for Index files -- and then wrap them with more stuff (compression, checksums, streams, etc.). With S3 backend using file-output-stream won't work, because S3 storage cannot provide writable file API (it has data_sink instead). This patch extracts file_data_sink creation so that it could be virtualized with storage API later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	525a261a4e	sstable: Make storage an API Currently sstable carries a filesystem_storage instance on board. Next patches will make it possible to use some other storage with different data accessing methods. This patch makes sstable carry abstract storage interface and make the existing filesystem_storage implement it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	033fa107f8	utils: Add S3 readable file impl for random reads Sometimes an sstable is used for random read, sometimes -- for streamed read using the input stream. For both cases the file API can be provided, because S3 API allows random reads of arbitrary lengths. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	a4a64149a6	utils: Add S3 data sink for multipart upload Putting a large object into S3 using plain PUT is a bad choice -- one needs to collect the whole object in memory, then send it as a content-length request with plain body. Multipart upload puts less stress on memory, but it has its limitation -- each part should be at least 5Mb in size. For that reason using file API doesn't work -- file IO API operates with external memory buffers and the file impl would only have raw pointers to it. In order to collect 5Mb of chunk in RAM the impl would have to copy the memory which is not good. Unlike the file API, data_sink API is more flexible, as it has temporary buffers at hand and can cache them in zero-copy manner. Having said that, the S3 data_sink implementation is like this: * put(buffer): move the buffer into local cache, once the local cache grows above 5Mb send out the part * flush: send out whatever is in cache, then send upload completion request * close: check that the upload finished (in flush), abort the upload otherwise User of the API may (actually should) wrap the sink with output_stream and use it as any other output_stream. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	3745b5c715	utils: Add S3 client with basic ops Those include -- HEAD to get size, PUT to upload object in one go, GET to read the object as contiguous buffer and DELETE to drop one. The client uses http client from seastar and just implements the S3 protocol using it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	ced8a07d09	cql-pytest: Add option to run scylla over stable directory The facilities in run.py script allow launching scylla over temporary directory, waiting for it to come alive, killing, etc. The limitation of those is that the work-dir created for scylla is tightly coupled with its pid. The object-storage test in next patches will need to check that the sstables are preserved on scylla restart and this hard binding of workdir to pid won't work. This patch generalizes the scylla run/abort helpers to accept an external directory to work on and adds a call to restart scylla process over existing directory. And one small related change here -- log file is opened in O_APPEND mode so that restarted scylla process continues writing into the old file. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	6dbe41d277	test.py: Equip it with minio server When test.py starts it activates a minio server inside test-dir and configures an anonymous bucket for test cases to run on Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:01 +03:00
Pavel Emelyanov	93c8b4b46b	sstables: Detach write_toc() helper When sstable is opened it generates a certain content into TOC file. In filesystem storage this first gets into TemporaryTOC one. Future S3 driver will need the same to put into TOC object. Not to produce duplicate code detach the content generation into a helper. Next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-10 16:43:00 +03:00
Raphael S. Carvalho	01466be7b9	sstables: Store raw token into summary entries Scylla stores a dht::token into each summary entry, for convenience. But that costs us 16 bytes for each summary entry. That's because dht::token has a kind field in addition to data, both 64 bits. With 1kk partitions, each averaging 4k bytes, summary may end up with ~90k summary entries. So dht::token only will add ~1.5M to the memory footprint of summary. We know summary samples index keys, therefore all tokens in all summary entries cannot have any token kind other than 'key'. Therefore, we can save 8 bytes for each summary entry by storing a 64-bit raw token and converting it back into token whenever needed. Memory footprint of summary entries in a summary goes from sizeof(summary_entry) * entries.size(): 1771520 to sizeof(summary_entry) * entries.size(): 1417216 which is explained by the 8 bytes reduction per summary entry. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-10 10:26:04 -03:00
Raphael S. Carvalho	6b5cd9ac7b	sstables: Don't store token data into summary's memory pool summary has a memory pool, which is implemented as a set of contiguous buffers of exponentially increasing size, with the max size of 128k. This pool served for both storing keys of summary entries and their respective tokens. The summary entry itself just stores a string_view, which points to the actual data in the memory pool. Since the series `31593e1451`, which removed token_view, summary_entry stores the actual token, not just the view. Therefore, memory is being wasted, as SSTable loader / writer is unnecessarily storing the token data into the pool. With 11k summary entries, the footprint drops from 756004 to 624932, a reduction of roughly 17%. Of course, the reduction depends on factors like key size, where the key size can significantly outweigh this waste. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-10 09:59:11 -03:00
Tomasz Grabiec	64a87f4257	Merge 'Standardize node ops sync_nodes selection' from Benny Halevy Use token_metadata get_endpoint_to_host_id_map_for_reading to get all normal token owners for all node operations, rather than using gossip for some operation and token_metadata for others. Fixes #12862 Closes #13256 * github.com:scylladb/scylladb: storage_service: node ops: standardize sync_nodes selection storage_service: get_ignore_dead_nodes_for_replace: make static and rename to parse_node_list	2023-04-10 13:14:55 +02:00
Benny Halevy	cc42f00232	view: view_builder: start: demote sleep_aborted log error This is not really an error, so print it in debug log_level rather than error log_level. Fixes #13374 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #13462	2023-04-09 22:49:06 +03:00
Nadav Har'El	d26bb8c12d	Merge 'tree: migrate from std::regex to boost::regex' from Botond Dénes Except for where usage of `std::regex` is required by 3rd party library interfaces. As demonstrated countless times, std::regex's practice of using recursion for pattern matching can result in stack overflow, especially on AARCH64. The most recent incident happened after merging https://github.com/scylladb/scylladb/pull/13075, which (indirectly) uses `sstables::make_entry_descriptor()` to test whether a certain path is a valid scylla table path in a trial-and-error manner. This resulted in stacks blowing up in AARCH64. To prevent this, use the already tried and tested method of switching from `std::regex` to `boost::regex`. Don't wait until each of the `std::regex` sites explode, replace them all preemptively. Refs: https://github.com/scylladb/scylladb/issues/13404 Closes #13452 * github.com:scylladb/scylladb: test: s/std::regex/boost::regex/ utils: s/std::regex/boost::regex/ db/commitlog: s/std::regex/boost::regex/ types: s/std::regex/boost::regex/ index: s/std::regex/boost::regex/ duration.cc: s/std::regex/boost::regex/ cql3: s/std::regex/boost::regex/ thrift: s/std::regex/boost::regex/ sstables: use s/std::regex/boost::regex/	2023-04-09 18:47:41 +03:00
Kefu Chai	7a05cc3a06	thrift: initiaize _config first to avoid dangling reference in `c642ca9e73`, a reference to the parameter `config` passed to the `thrift_server` 's constructor is passed down to `create_handler_factory()`, which keeps it so it can create connection handler on demand. but unfortunately, - the `config` parameter is a temporary variable - the `config` parameter is moved away in the constructor after `create_handler_factory()` is called hence we have a dangling reference when the factory created by `create_handler_factory()` tries to dereference the reference when handling a new incoming connection. in this change, - the definitions of `_config` and `_handler_factory` member variables are transposed, so that the former is initialized first. - `_handler_factory` now keeps a reference to `_config`'s member variable, so that the weak reference it holds is always valid. Fixes #13455 Branches: none Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13456	2023-04-09 11:34:34 +03:00
Amnon Heiman	928727a57d	main: Load metrics relabel config from a file if it exists This patch reads the relabel config from a file if it exists. A problem with the file or metrics would stop Scylla from starting. This is on purpose, as it's a configuration problem that should be addressed. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-04-09 09:10:07 +03:00
Amnon Heiman	990545f616	Add relabel from file support. This patch adds a configuration with an optional file name for relabeling metrics. It also adds a function that accepts a file name and loads the relabel config from a file. An example for such a file: ``` $cat conf.yml relabel_configs: - source_labels: [shard] action: drop target_label: shard regex: (2) - source_labels: [shard] action: replace target_label: level replacement: $1 regex: (.*3) ``` update_relabel_config_from_file throws an exception on failure, it's up to the caller to decide what to do in such cases.	2023-04-09 09:10:02 +03:00
Kefu Chai	9d5fbe226e	auth: remove unused operator<<(.., resource_kind) since the only user of operator<<(..., resource_kind) is now `auth_resource_test`, let's just move it into this test. and there is no need to keep this operator in the header file where `resource_kind` is defined. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-07 20:32:28 +08:00
Kefu Chai	ca50a8d6c7	auth: specialize fmt::formatter<resource_kind> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `auth::resource_kind` without the help of fmt::ostream. its `operator<<(ostream,..)` is reimplemented using fmtlib accordingly to ease the review. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-07 18:59:13 +08:00
Kefu Chai	ca0ca92e68	auth: remove unused operator<<(.., authentication_option) since we already have fmt::formatter<authentication_option>, and there are no existing users of `operator<<(ostream&, authentication_option)`, let's just drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-07 18:15:35 +08:00
Kefu Chai	ba0f9036ec	auth: specialize fmt::formatter<authentication_option> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `auth::auth_authentication_options` without the help of fmt::ostream. its `operator<<(ostream,..)` is reimplemented using fmtlib accordingly to ease the review. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-07 18:15:25 +08:00
Botond Dénes	452cb1a712	test: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:51:32 -04:00
Botond Dénes	985e33a768	utils: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:51:28 -04:00
Botond Dénes	52e66e38e7	db/commitlog: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:51:24 -04:00
Botond Dénes	712889c99f	types: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is mechanical for the most part. escape() needs some special treatment, looks like boost::regex wants a double-escaped backslash.	2023-04-06 09:50:45 -04:00
Botond Dénes	cf188f40b9	index: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:41 -04:00
Botond Dénes	4a0188ea6a	duration.cc: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:37 -04:00
Botond Dénes	de402878e4	cql3: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:32 -04:00
Botond Dénes	c0b72f70d4	thrift: s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:27 -04:00
Botond Dénes	ba031ad181	sstables: use s/std::regex/boost::regex/ The former is prone to producing stack-overflow as it uses recursion in its match implementation. The migration is entirely mechanical.	2023-04-06 09:50:12 -04:00
Botond Dénes	c65bd01174	Merge 'Debloat system_keyspace.hh (and a bit of .cc)' from Pavel Emelyanov The system_keyspace.hh now includes raft stuff, topology changes stuff, task_manager stuff, etc. It's going to include tablets.hh (but maybe not). Anything that deals with system keyspace, and includes system_keyspace.hh, would transitively pull these too. This header is becoming a central hub for all the features. This PR removes all the headers from system_keyspace.hh that correspond to other "subsystems" keeping only generic mutations/querying and seastar ones. Closes #13450 * github.com:scylladb/scylladb: system_keyspace.hh: Remove unneeded headers system_keyspace: Move topology_mutation_builder to storage_service system_keyspace: Move group0_upgrade_state conversions to group0 code	2023-04-06 16:39:20 +03:00
Kamil Braun	c2a2996c2b	docs: cleaning up after failed membership change After a failed topology operation, like bootstrap / decommission / removenode, the cluster might contain a garbage entry in either token ring or group 0. This entry can be cleaned-up by executing removenode on any other node, pointing to the node that failed to bootstrap or leave the cluster. Document this procedure, including a method of finding the host ID of a garbage entry. Add references in other documents. Fixes: #13122 Closes #13186	2023-04-06 13:48:37 +02:00
Botond Dénes	0a46a574e6	Merge 'Topology: introduce nodes' from Benny Halevy As a first step towards using host_id to identify nodes instead of ip addresses this series introduces a node abstraction, kept in topology, indexed by both host_id and endpoint. The revised interface also allows callers to handle cases where nodes are not found in the topology more gracefully by introducing `find_node()` functions that look up nodes by host_id or inet_address and also get a `must_exist` parameter that, if false (the default parameter value) would return nullptr if the node is not found. If true, `find_node` throws an internal error, since this indicates a violation of an internal assumption that the node must exist in the topology. Callers that may handle missing nodes, should use the more permissive flavor and handle the !find_node() case gracefully. Closes #11987 * github.com:scylladb/scylladb: topology: add node state topology: remove dead code locator: add class node topology: rename update_endpoint to add_or_update_endpoint topology: define get_{rack,datacenter} inline shared_token_metadata: mutate_token_metadata: replicate to all shards locator: endpoint_dc_rack: refactor default_location locator: endpoint_dc_rack: define default operator== test: storage_proxy_test: provide valid endpoint_dc_rack	2023-04-06 13:47:22 +03:00
Pavel Emelyanov	18333b4225	system_keyspace.hh: Remove unneeded headers Now this header can replace lots of used types with plain forward declarations Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-06 12:37:00 +03:00
Pavel Emelyanov	1af373cf0a	system_keyspace: Move topology_mutation_builder to storage_service The latter is the only user of the class. This keeps system keyspace code free from unrelated logic and from raft::server_id type. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-06 12:36:02 +03:00
Pavel Emelyanov	45de375126	system_keyspace: Move group0_upgrade_state conversions to group0 code In order to keep system keyspace free from group0 logic and from the service::group0_upgrade_state type Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-04-06 12:35:07 +03:00
Kefu Chai	0d4ffe1d69	scripts/refresh-submodules.sh: include all commits in summary before this change, we use `git submodule summary ${submodule}` for collecting the titles of commits in between current HEAD and origin/master. normally, this works just fine. but it fails to collect all commits if the origin/master happens to reference a merge commit. for instance, if we have a history like: 1. merge foo 2. bar 3. foo 4. baz <--- submodule is pointing here. `git submodule summary` would just print out the titles of commits 1 and 3. so, in this change, instead of relying on `git submodule summary`, we just collect the commits using `git log`. but we preserve the output format used by `git submodule summary` to be consistent with the previous commits bumping up the submodules. please note, in this change instead of matching the output of `git submodule summary`, we use `git merge-base --is-ancestor HEAD origin/master` to check if we are going to create a fast-forward change, this is less fragile. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13366	2023-04-06 11:27:14 +03:00
Botond Dénes	9a02315c6b	Merge 'Compaction reevaluation bug fixes' from Raphael "Raph" Carvalho A problem in compaction reevaluation can cause the SSTable set to be left uncompacted for an indefinite amount of time, potentially causing space and read amplification to be suboptimal. Two reevaluation problems are being fixed, one after off-strategy compaction ended, and another in compaction manager which intends to periodically reevaluate a need for compaction. Fixes https://github.com/scylladb/scylladb/issues/13429. Fixes https://github.com/scylladb/scylladb/issues/13430. Closes #13431 * github.com:scylladb/scylladb: compaction: Make compaction reevaluation actually periodic replica: Reevaluate regular compaction on off-strategy completion	2023-04-05 13:51:21 +03:00
Tomasz Grabiec	9802bb6564	Merge 'Remove explicit flush() from sstable component writer' from Pavel Emelyanov Writing into sstable component output stream should be done with care. In particular -- flushing can happen only once right before closing the stream. Flushing the stream in between several writes is not going to work, because file stream would step on unaligned IO and S3 upload stream would send completion message to the server and would lose any subsequent write. Most of the file_writer users already obey that and flush the writer once right before closing it. The do_write_simple() is extra careful about exceptions handling, but it's an overkill (see first patch). It's better to make file_writer API explicitly lack the ability to flush itself by flushing the stream when closing the writer. Closes #13338 * github.com:scylladb/scylladb: sstables: Move writer flush into close (and remove it) sstables: Relax exception handling in do_write_simple	2023-04-05 12:09:31 +02:00
Tomasz Grabiec	bbabf07f69	Merge 'test/boost/multishard_mutation_query: use random schema' from Botond Dénes This test currently uses `test/lib/test_table.hh` to generate data for its test cases. This data generation facility is used by no other tests. Worse, it is redundant as we already have a random data generator with fixed schema, in `test/lib/mutation_source_test.hh`. So in this series, we migrate the test cases in said test file to random schema and its random data generation facilities. These are used by several other test cases and using random schema allows us to cover a wider (quasi-infinite) number of possibilities. After migrating all tests away from it, `test/lib/test_table.hh` is removed. This series also reduces the runtime of `fuzzy_test` drastically. It should now run in a few minutes or even in seconds (depending on the machine). Fixes: #12944 Closes #12574 * github.com:scylladb/scylladb: test/lib: rm test_table.hh test/boos/multishard_mutation_query_test: migrate other tests to random schema test/boost/multishard_mutation_query_test: use ks keyspace test/boost/multishard_mutation_query_test: improve test pager test/boost/multishard_mutation_query_test: refactor fuzzy_test test/boost: add multishard_mutation_query_test more memory types/user: add get_name() accessor test/lib/random_schema: add create_with_cql() test/lib/random_schema: fix udt handling test/lib/random_schema: type_generator(): also generate frozen types test/lib/random_schema: type_generator(): make static column generation conditional test/lib/random_schema: type_generator(): don't generate duration_type for keys test/lib/random_schema: generate_random_mutations(): add overload with seed test/lib/random_schema: generate_random_mutations(): respect range tombstone count param test/lib/random_schema: generate_random_mutations(): add yields test/lib/random_schema: generate_random_mutations(): fix indentation test/lib/random_schema: generate_random_mutations(): coroutinize method 
test/lib/random_schema: generate_random_mutations(): expand comment	2023-04-05 10:32:58 +02:00
Michał Chojnowski	df0905357e	mutation_partition_v2: add sentinel to the tracker after adding it to the tree Every tracker insertion has to have a corresponding removal or eviction, (otherwise the number of rows in the tracker will be misaccounted). If we add the row to the tracker before adding it to the tree, and the tree insertion fails (with bad_alloc), this contract will be violated. Fix that. Note: the problem is currently irrelevant because an exception during sentinel insertion will abort the program anyway. Closes #13336	2023-04-05 09:52:44 +02:00
Raphael S. Carvalho	457c772c9c	replica: Make compaction_group responsible for deleting off-strategy compaction input Compaction group is responsible for deleting SSTables of "in-strategy" compactions, i.e. regular, major, cleanup, etc. Both in-strategy and off-strategy compaction have their completion handled using the same compaction group interface, which is compaction_group::table_state::on_compaction_completion(..., sstables::offstrategy offstrategy) So it's important to bring symmetry there, by moving the responsibility of deleting off-strategy input, from manager to group. Another important advantage is that off-strategy deletion is now throttled and gated, allowing for better control, e.g. table waiting for deletion on shutdown. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13432	2023-04-05 08:37:48 +03:00
Botond Dénes	f7421aab2c	Merge 'cmake: sync with `configure.py` (16/n)' from Kefu Chai this is the 15th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals: - to enable developer to use CMake for building scylla. so they can use tools (CLion for instance) with CMake integration for better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. also, i just found that the scylla executable built with cmake building system segfault in master HEAD. like ``` AddressSanitizer:DEADLYSIGNAL ================================================================= ==3974496==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7ffd48549f70 sp 0x7ffd48549728 T0) ==3974496==Hint: pc points to the zero page. ==3974496==The signal is caused by a READ memory access. ==3974496==Hint: address points to the zero page. #0 0x0 (<unknown module>) #1 0x14e785a5 in wasmtime_runtime::traphandlers::unix::trap_handler::h1f510afc2968497f /home/kefu/.cargo/registry/src/mirrors.sjtug.sjtu.edu.cn-7a04d2510079875b/wasmtime-runtime-5.0.1/src/traphandlers/unix.rs:159:9 #2 0x7f3462e5eb9f (/lib64/libc.so.6+0x3db9f) (BuildId: 6107835fa7d4725691b2b7f6aaee7abe09f493b2) AddressSanitizer can not provide additional info. SUMMARY: AddressSanitizer: SEGV (<unknown module>) ==3974496==ABORTING Aborting on shard 0. Backtrace: 0xd16c38a 0x13c5aab0 0x13b9821e 0x13c2fdc7 /lib64/libc.so.6+0x3db9f /lib64/libc.so.6+0x8eb93 /lib64/libc.so.6+0x3daed /lib64/libc.so.6+0x2687e 0xd1e5f8a 0xd1e3d34 0xd1ca059 0xd1c5e29 0xd1c5605 0x14e785a5 /lib64/libc.so.6+0x3db9f ``` decoded: ``` __interceptor_backtrace at ??:? 
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /home/kefu/dev/scylladb/seastar/include/seastar/util/backtrace.hh:60 seastar::backtrace_buffer::append_backtrace() at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:778 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:808 seastar::print_with_backtrace(char const, bool) at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:820 (inlined by) seastar::sigabrt_action() at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:3882 (inlined by) operator() at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:3858 (inlined by) __invoke at /home/kefu/dev/scylladb/seastar/src/core/reactor.cc:3854 /lib64/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=6107835fa7d4725691b2b7f6aaee7abe09f493b2, for GNU/Linux 3.2.0, not stripped __GI___sigaction at :? __pthread_kill_implementation at ??:? __GI_raise at :? __GI_abort at :? __sanitizer::Abort() at ??:? __sanitizer::Die() at ??:? __asan::ScopedInErrorReport::~ScopedInErrorReport() at ??:? __asan::ReportDeadlySignal(__sanitizer::SignalContext const&) at ??:? __asan::AsanOnDeadlySignal(int, void, void) at ??:? wasmtime_runtime::traphandlers::unix::trap_handler at /home/kefu/.cargo/registry/src/mirrors.sjtug.sjtu.edu.cn-7a04d2510079875b/wasmtime-runtime-5.0.1/src/traphandlers/unix.rs:159 __GI___sigaction at :? ``` this led me to this change. but unfortunately, this changeset does not address the segfault. will continue the investigation in my free cycles. 
Closes #13434 github.com:scylladb/scylladb: build: cmake: include cxx.h with relative path build: cmake: set stack frame limits build: cmake: pass -fvisibility=hidden to compiler build: cmake: use -O0 on aarch64, otherwise -Og	2023-04-05 06:57:23 +03:00
Yaron Kaikov	c80ab78741	doc: update supported os for 2022.1 ubuntu22.04 is already supported on both `5.0` and `2022.1` updating the table Closes #13340	2023-04-05 06:43:58 +03:00
Pavel Emelyanov	f5de0582c8	alternator,util: Move aws4-hmac-sha256 signature generator to util S3 client cannot perform anonymous multipart uploads into any real S3 buckets regardless of their configuration. Since multipart upload is an essential part of the sstables backend, we need to implement the authorisation support for the client early. (side note): with minio anonymous multipart upload works, with aws s3 anonymous PUT and DELETE can be configured, it's exactly the combination of aws + multipart upload that does need authorization. Fortunately, the signature generation and signature checking code is symmetrical and we have the checking option already in alternator :) So what this patch does is just move the alternator::get_signature() helper into utils/. A sad side effect of that is all tests now need to link with gnutls :( that is used to compute the hash value itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13428	2023-04-04 18:24:48 +03:00
Nadav Har'El	aeabfcb93f	Merge 'Revert scylla sstable schema improvements' from Botond Dénes This PR reverts the scylla sstable schema loading improvements as they fail in CI every other run. I am already working on fixes for these but I am not sure I understand all the failures so it is best to revert and re-post the series later. Fixes: #13404 Fixes: #13410 Closes #13419 * github.com:scylladb/scylladb: Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" Revert "tools/schema_loader: don't require results from optional schema tables"	2023-04-04 18:22:14 +03:00
Anna Stuchlik	447ce58da5	doc: update Raft doc for versions 5.2 and 2023.1 Fixes https://github.com/scylladb/scylladb/issues/13345 Fixes https://github.com/scylladb/scylladb/issues/13421 This commit updates the Raft documentation page to be up to date in versions 5.2 and 2023.1. - Irrelevant information about previous releases is removed. - Some information is clarified. - Mentions of version 5.2 are either removed (if possible) or version 2023.1 is added. Closes #13426	2023-04-04 15:15:56 +02:00
Raphael S. Carvalho	156ac0a67a	compaction: Make compaction reevaluation actually periodic The manager intended to periodically reevaluate compaction need for each registered table. But it's not working as intended. The reevaluation is one-off. This means that compaction was not kicking in later for a table with little to no write activity that had data expiring 1 hour from now. Also make sure that reevaluation happens within the compaction scheduling group. Fixes #13430. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-04 09:16:19 -03:00
Raphael S. Carvalho	2652b41606	replica: Reevaluate regular compaction on off-strategy completion When off-strategy compaction completes, regular compaction is not triggered. If off-strategy output causes the table's SSTable set to not conform to the strategy goal, it means that read and space amplification will be suboptimal until the next compaction kicks in, which can take an indefinite amount of time (e.g. when active memtable is flushed). Let's reevaluate compaction on main SSTable set when off-strategy ends. Fixes #13429. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-04-04 09:16:16 -03:00
Kefu Chai	dceb364c5c	build: cmake: include cxx.h with relative path before this change, the wasm binding source files includes the cxxbridge header file of `cxx.h` with its full path. to better mirror the behavior of configure.py, let's just include this header file with relative path. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Kefu Chai	ecd5bf98d9	build: cmake: set stack frame limits * transpose include(mode.common) and include (mode.${build_mode}), so the former can reference the value defined by the latter. * set stack_usage_threshold for supported build modes. please note, this compiler option (-Wstack-usage=<bytes>) is only supported by GCC so far. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Kefu Chai	6cc8800c85	build: cmake: pass -fvisibility=hidden to compiler this mirrors the behavior of `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Kefu Chai	066e9567ee	build: cmake: use -O0 on aarch64, otherwise -Og this addresses an oversight in `b234c839e4`, which is supposed to mirror the behavior of `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-04-04 15:33:20 +08:00
Anna Stuchlik	595325c11b	doc: add upgrade guide from 5.2 to 2023.1 Related: https://github.com/scylladb/scylla-enterprise/issues/2770 This commit adds the upgrade guide from ScyllaDB Open Source 5.2 to ScyllaDB Enterprise 2023.1. This commit does not cover metric updates (the metrics file has no content, which needs to be added in another PR). As this is an upgrade guide, this commit must be merged to master and backported to branch-5.2 and branch-2023.1 in scylla-enterprise.git. Closes #13294	2023-04-04 08:24:00 +03:00
Botond Dénes	8167f11a23	Merge 'Move compaction manager tasks out of compaction manager' from Aleksandra Martyniuk Task manager compaction tasks that cover compaction group compaction need access to compaction_manager::tasks. To avoid circular dependency and be able to rely on forward declaration, task needs to be moved out of compaction manager. To avoid naming confusion compaction_manager::task is renamed. Closes #13226 * github.com:scylladb/scylladb: compaction: use compaction namespace in compaction_manager.cc compaction: rename compaction::task compaction: move compaction_manager::task out of compaction manager compaction: move sstable_task definition to source file	2023-04-03 15:40:42 +03:00
Botond Dénes	54c0a387a2	Revert "Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes" This reverts commit `32fff17e19`, reversing changes made to `164afe14ad`. This series proved to be problematic, the new test introduced by it failing quite often. Revert it until the problems are tracked down and fixed.	2023-04-03 13:54:00 +03:00
Botond Dénes	04b1219694	Revert "tools/schema_loader: don't require results from optional schema tables" This reverts commit `c15f53f971`. Said commit is based on a commit which we want to revert because its unit test is flaky.	2023-04-03 13:53:06 +03:00
Petr Gusev	09636b20f3	scylla_cluster.py: optimize node logs reading There are two occasions in scylla_cluster where we read the node logs, and in both of them we read the entire file in memory. This is not efficient and may cause an OOM. In the first case we need the last line of the log file, so we seek at the end and move backwards looking for a new line symbol. In the second case we look through the log file to find the expected_error. The readlines() method returns a Python list object, which means it reads the entire file in memory. It's sufficient to just remove it since iterating over the file instance already yields lines lazily one by one. This is a follow-up for #13134. Closes #13399	2023-04-03 12:28:08 +02:00
Marcin Maliszkiewicz	99f8d7dcbe	db: view: use deferred_close for closing staging_sstable_reader When consume_in_thread throws the reader should still be closed. Related https://github.com/scylladb/scylla-enterprise/issues/2661 Closes #13398 Refs: scylladb/scylla-enterprise#2661 Fixes: #13413	2023-04-03 09:02:55 +03:00
Botond Dénes	ca062d1fba	Merge ' mutation: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `position_in_partition` and `partition_region` without using ostream<<. also, this change removes `operator<<(ostream, const position_in_partition_view&)` , `operator<<(ostream, const partition_region&)` along with their callers. Refs #13245 Closes #13391 * github.com:scylladb/scylladb: mutation: drop operator<< for position_in_partition and friends partition_snapshot_row_cursor: do not use operator<< when printing position mutation: specialize fmt::formatter<position_in_partition> mutation: specialize fmt::formatter<partition_region>	2023-04-03 08:34:55 +03:00
Kefu Chai	6c37829224	wasm: add noexcept specifier for alien::run_on() as alien::run_on() requires the function to be noexcept, let's make this explicit. also, this paves the road to the type constraint added to `alien::run_on()`. the type constraint will enforce this requirement on the function passed to `alien::run_on()`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13375	2023-04-03 08:19:00 +03:00
Botond Dénes	36e53d571c	Merge 'Treewide use-after-move bug fixes' from Raphael "Raph" Carvalho That's courtesy of `153813d3b8`, which annotates Seastar smart pointer classes with Clang's consumed attributes, to help Clang to statically spot use-after-move bugs. Closes #13386 * github.com:scylladb/scylladb: replica: Fix use-after-move in table::make_streaming_reader index/built_indexes_virtual_reader.hh: Fix use-after-move db/view/build_progress_virtual_reader: Fix use-after-move sstables: Fix use-after-move when making reader in reverse mode	2023-04-03 06:57:54 +03:00
Benny Halevy	c17df1759e	topology: add node state Add a simple node state model with: `joining`, `normal`, `leaving`, and `left` states to help manage nodes during replace with the same ip address. Later on, this could also help prevent nodes that were decommissioned, removed, or replaced from rejoining the cluster. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:18:31 +03:00
Benny Halevy	027f188a97	topology: remove dead code Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:13:04 +03:00
Benny Halevy	f3d5df5448	locator: add class node And keep per node information (idx, host_id, endpoint, dc_rack, is_pending) in node objects, indexed by topology on several indices like: idx, host_id, endpoint, current/pending, per dc, per dc/rack. The node index is a shorthand identifier for the node. node* and index are valid while the respective topology instance is valid. To be used, the caller must hold on to the topology / token_metadata object (e.g. via a token_metadata_ptr or effective_replication_map) Refs #6403 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> topology: add node idx Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:13:02 +03:00
Benny Halevy	006e02410f	topology: rename update_endpoint to add_or_update_endpoint To reflect what it does. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:08:03 +03:00
Benny Halevy	df1c92649e	topology: define get_{rack,datacenter} inline Define get_location() that gets the location for the local node, and use either this entry point or get_location(inet_address) to get the respective dc or rack. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:07:49 +03:00
Benny Halevy	fd1a2591b5	shared_token_metadata: mutate_token_metadata: replicate to all shards storage_service::replicate_to_all_cores has a sophisticated way to mutate the token_metadata and effective_replication_map on shard 0 and clone those to all other shards, applying the changes only if mutate and clone succeeded on all shards, so we don't end up with only some of the shards holding the mutated copy if an error happened mid-way (and then we would need to roll back the change for exception safety). shared_token_metadata::mutate_token_metadata is currently only called from a unit test that needs to mutate the token metadata only on shard 0, but a following patch will require doing that on all shards. This change adds this capability by enforcing the call to be on shard 0, mutating the token_metadata into a temporary pending copy and cloning it on all other shards. Only then, when all shards succeeded, set the modified token_metadata on all shards. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:07:17 +03:00
Benny Halevy	9cce01a12c	locator: endpoint_dc_rack: refactor default_location Refactor the thread_local default_location out of topology::get_location so it can be used elsewhere. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:06:53 +03:00
Benny Halevy	5ba5371631	locator: endpoint_dc_rack: define default operator== and get rid of the ad-hoc implementation in network_topology_strategy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 20:06:52 +03:00
Benny Halevy	5874a0d0ca	test: storage_proxy_test: provide valid endpoint_dc_rack Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 19:13:05 +03:00
Benny Halevy	ca61d88764	storage_service: node ops: standardize sync_nodes selection Use token_metadata get_endpoint_to_host_id_map_for_reading to get all normal token owners for all node operations, rather than using gossip for some operation and token_metadata for others. Fixes #12862 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-04-02 09:17:07 +03:00
Raphael S. Carvalho	d2d151ae5b	Fix use-after-move when initializing row cache with dummy entry Courtesy of clang-tidy: row_cache.cc:1191:28: warning: 'entry' used after it was moved [bugprone-use-after-move] _partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{_schema}); ^ row_cache.cc:1191:60: note: move occurred here _partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{_schema}); ^ row_cache.cc:1191:28: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated _partitions.insert(entry.position().token().raw(), std::move(entry), dht::ring_position_comparator{*_schema}); The use-after-move is UB, as for it to happen, depends on evaluation order. We haven't hit it yet as clang is left-to-right. Fixes #13400. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13401	2023-03-31 19:46:53 +03:00
Botond Dénes	c15f53f971	tools/schema_loader: don't require results from optional schema tables When loading a schema from disk, only the `tables` and `columns` tables are required to have an entry to the loaded schema. All the others are optional. Yet the schema loader expects all the tables to have a corresponding entry, which leads to errors when trying to load a schema which doesn't. Relax the loader to only require existing entries in the two mandatory tables and not the others. Closes #13393	2023-03-31 16:35:42 +02:00
Kefu Chai	c24a9600af	docs: dev: correct a typo s/By expending/By expanding/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13392	2023-03-31 17:19:08 +03:00
Raphael S. Carvalho	04932a66d3	replica: Fix use-after-move in table::make_streaming_reader Variant used by streaming/stream_transfer_task.cc: , reader(cf.make_streaming_reader(cf.schema(), std::move(permit_), prs)) as full slice is retrieved after schema is moved (clang evaluates left-to-right), the stream transfer task can be potentially working on a stale slice for a particular set of partitions. static report: In file included from replica/dirty_memory_manager.cc:6: replica/database.hh:706:83: error: invalid invocation of method 'operator->' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed] return make_streaming_reader(std::move(schema), std::move(permit), range, schema->full_slice()); Fixes #13397. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:44:46 -03:00
Raphael S. Carvalho	f8df3c72d4	index/built_indexes_virtual_reader.hh: Fix use-after-move static report: ./index/built_indexes_virtual_reader.hh:228:40: warning: invalid invocation of method 'operator->' on object 's' while it is in the 'consumed' state [-Wconsumed] _db.find_column_family(s->ks_name(), system_keyspace::v3::BUILT_VIEWS), Fixes #13396. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:41:44 -03:00
Raphael S. Carvalho	1ecba373d6	db/view/build_progress_virtual_reader: Fix use-after-move use-after-free in ctor, which potentially leads to a failure when locating table from moved schema object. static report In file included from db/system_keyspace.cc:51: ./db/view/build_progress_virtual_reader.hh:202:40: warning: invalid invocation of method 'operator->' on object 's' while it is in the 'consumed' state [-Wconsumed] _db.find_column_family(s->ks_name(), system_keyspace::v3::SCYLLA_VIEWS_BUILDS_IN_PROGRESS), Fixes #13395. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:40:30 -03:00
Raphael S. Carvalho	213eaab246	sstables: Fix use-after-move when making reader in reverse mode static report: sstables/mx/reader.cc:1705:58: error: invalid invocation of method 'operator' on object 'schema' while it is in the 'consumed' state [-Werror,-Wconsumed] legacy_reverse_slice_to_native_reverse_slice(schema, slice.get()), pc, std::move(trace_state), fwd, fwd_mr, monitor); Fixes #13394. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-31 08:39:11 -03:00
Kefu Chai	6e956c5358	mutation: drop operator<< for position_in_partition and friends now that all their callers are removed, let's just drop these operators. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Kefu Chai	76dde9fd50	partition_snapshot_row_cursor: do not use operator<< when printing position in order to prepare for dropping the `operator<<()` for `position_in_partition_view`, let's use fmtlib to print `position()`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Kefu Chai	4ec4859179	mutation: specialize fmt::formatter<position_in_partition> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print - position_in_partition - position_in_partition_view - position_in_partition_view::printer without the help of fmt::ostream. their `operator<<(ostream,..)` are reimplemented using fmtlib accordingly to ease the review. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Kefu Chai	500eeeb12c	mutation: specialize fmt::formatter<partition_region> this is a part of a series to migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `partition_region` with the help of fmt::ostream. to help with the review process, the corresponding `to_string()` is dropped, and its callers now switch over to `fmt::to_string()` in this change as well. to use `fmt::to_string()` helps with consolidating all places to use fmtlib for printing/formatting. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-31 19:03:14 +08:00
Tomasz Grabiec	99cb948eac	direct_failure_detector: Avoid throwing exceptions in the success path sleep_abortable() is aborted on success, which causes sleep_aborted exception to be thrown. This causes scylla to throw every 100ms for each pinged node. Throwing may reduce performance if happens often. Also, it spams the logs if --logger-log-level exception=trace is enabled. Avoid by swallowing the exception on cancellation. Fixes #13278. Closes #13279	2023-03-31 12:40:43 +02:00
Alejo Sanchez	81b40c10de	test/pylib: RandomTables.add_column with value column When adding extra columns in a test, make them value column. Name them with the "v_" prefix and use the value column number counter. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13271	2023-03-31 11:19:49 +02:00
Alejo Sanchez	e3b462507d	test/pylib: topology: support clusters of initial size 0 To allow tests with custom clusters, allow configuration of initial cluster size of 0. Add a proof-of-concept test to be removed later. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13342	2023-03-31 11:17:58 +02:00
Benny Halevy	56be654edc	storage_service: get_ignore_dead_nodes_for_replace: make static and rename to parse_node_list Let the caller pass the string to parse to the function rather than the function itself get to it via _db.local().get_config() so it could be used as a general purpose function. Make it static now that it doesn't require an instance. Rename to `parse_node_list` as that's what the function does. It doesn't care if the nodes are to be ignored or something else (e.g. removed), they only need to be in token_metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-31 10:20:17 +03:00
Kefu Chai	e107b31d23	test: sstable: remove unused class in sstable test generation_for_sharded_test is not used by any of these sstable tests, so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13388	2023-03-31 08:02:22 +03:00
Botond Dénes	f777916055	Merge 'Offstrategy keyspace compaction task' from Aleksandra Martyniuk Task manager task implementations of classes that cover offstrategy keyspace compaction which can be started through /storage_service/keyspace_compaction/ api. Top level task covers the whole compaction and creates child tasks on each shard. Closes #12713 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to test offstrategy compaction compaction: create task manager's task for offstrategy keyspace compaction on one shard compaction: create task manager's task for offstrategy keyspace compaction compaction: create offstrategy_compaction_task_impl	2023-03-31 07:09:17 +03:00
Pavel Emelyanov	7d6ab5c84d	code: Remove some headers from query_processor.hh The forward_service.hh and raft_group0_client.hh can be replaced with forward declarations. Few other files need their previously indirectly included headers back. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13384	2023-03-31 07:08:41 +03:00
Tomasz Grabiec	4d6443e030	Merge 'Schema commitlog separate dir' from Gusev Petr The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in `commitlog::descriptor::descriptor`, which is logged with the `WARN` level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new `schema_commitlog_directory` parameter to move the schema commitlog to another disk drive. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867 Closes #13263 * github.com:scylladb/scylladb: commitlog: use separate directory for schema commitlog schema commitlog: fix commitlog_total_space_in_mb initialization	2023-03-30 23:48:58 +02:00
Petr Gusev	0152c000bb	commitlog: use separate directory for schema commitlog The commitlog api originally implied that the commitlog_directory would contain files from a single commitlog instance. This is checked in segment_manager::list_descriptors, if it encounters a file with an unknown prefix, an exception occurs in commitlog::descriptor::descriptor, which is logged with the WARN level. A new schema commitlog was added recently, which shares the filesystem directory with the main commitlog. This causes warnings to be emitted on each boot. This patch solves the warnings problem by moving the schema commitlog to a separate directory. In addition, the user can employ the new schema_commitlog_directory parameter to move the schema commitlog to another disk drive. By default, the schema commitlog directory is nested in the commitlog_directory. This can help avoid problems during an upgrade if the commitlog_directory in the custom scylla.yaml is located on a separate disk partition. This is expected to be released in 5.3. As #13134 (raft tables->schema commitlog) is also scheduled for 5.3, and it already requires a clean rolling restart (no cl segments to replay), we don't need to specifically handle upgrade here. Fixes: #11867	2023-03-30 21:55:50 +04:00
Petr Gusev	f31bd26971	schema commitlog: fix commitlog_total_space_in_mb initialization It seems there was a typo here, which caused commitlog_total_space_in_mb to always be zero and the schema commitlog to be effectively unlimited in size.	2023-03-30 21:55:50 +04:00
Botond Dénes	207dcbb8fa	Merge 'sstables: prepare for uuid-based generation_type' from Benny Halevy Preparing for #10459, this series defines sstables::generation_type::int_t as `int64_t` at the moment and uses that instead of naked `int64_t` variables so it can be changed in the future to hold e.g. a `std::variant<int64_t, sstables::generation_id>`. sstables::new_generation was defined to generate new, unique generations. Currently it is based on incrementing a counter, but it can be extended in the future to manufacture UUIDs. The unit tests are cleaned up in this series to minimize their dependency on numeric generations. Basically, they should be used for loading sstables with hard coded generation numbers stored under `test/resource/sstables`. For all the rest, the tests should use the existing mechanisms and those introduced in this series such as generation_factory, sst_factory and smart make_sstable methods in sstable_test_env and table_for_tests to generate new sstables with a unique generation, and use the abstract sst->generation() method to get their generation if needed, without resorting to the actual value it may hold. Closes #12994 * github.com:scylladb/scylladb: everywhere: use sstables::generation_type test: sstable_test_env: use make_new_generation sstable_directory::components_lister::process: fixup indentation sstables: make highest_generation_seen return optional generation replica: table: add make_new_generation function replica: table: move sstable generation related functions out of line test: sstables: use generation_type::int_t sstables: generation_type: define int_t	2023-03-30 17:05:07 +03:00
Pavel Emelyanov	92318fdeae	Merge 'Initialize Wasm together with query_processor' from Wojciech Mitros The wasm engine is moved from replica::database to the query_processor. The wasm instance cache and compilation thread runner were already there, but now they're also initialized in the query_processor constructor. By moving the initialization to the constructor, we can now be certain that all wasm-related objects (wasm instance cache, compilation thread runner, and wasm engine, which was already passed in the constructor) are initialized when we try to use them because we have to use the query processor to access them anyway. The change is also motivated by the fact that we're planning to take Wasm UDFs out of experimental, after which they should stop getting special treatment. Closes #13311 * github.com:scylladb/scylladb: wasm: move wasm initialization to query_processor constructor wasm: return wasm instance cache as a reference instead of a pointer wasm: move wasm engine to query_processor	2023-03-30 14:30:23 +03:00
Nadav Har'El	59ab9aac44	Merge 'functions: reframe aggregate functions in terms of scalar functions' from Avi Kivity Currently, aggregate functions are implemented in a stateful manner. The accumulator is stored internally in an aggregate_function::aggregate, requiring each query to instantiate new instances (see aggregate_function_selector's constructor, and note how it's called from selector::new_instance()). This makes aggregates hard to use in expressions, since expressions are stateless (with state only provided to evaluate()). To facilitate migration towards stateless expressions, we define a stateless_aggregate_function (modeled after user-defined aggregates, which are already stateless). This new struct defines the aggregate in terms of three scalar functions: one to aggregate a new input into an accumulator (provided in the first parameter), one to finalize an accumulator into a result, and one to reduce two accumulators for parallelized aggregation. All existing native aggregate functions are converted to the new model, and the old interface is removed. This series does not yet convert selectors to expressions, but it does remove one of the obstacles. Performance evaluation: I created a table with a million ints on a single-node cluster, and ran the avg() function on them. I measured the number of instructions executed with `perf stat -p $(pgrep scylla) -e instructions` while the query was running. The query executed from cache, memtables were flushed beforehand. The instruction count per row increased from roughly 49k to roughly 52k, indicating 3k extra instructions per row. While 3k instructions to execute a function is huge, it is currently dwarfed by other overhead (and will be even less important in a cluster where CL>1 will cause non-coordinator code to run multiple times). 
Closes #13105 * github.com:scylladb/scylladb: cql3/selection, forward_service: use stateless_aggregate_function directly db: functions: fold stateless_aggregate_function_adapter into aggregate_function cql3: functions: simplify accumulator_for template cql3: functions: base user-defined aggregates on stateless aggregates cql3: functions: drop native_aggregate_function cql3: functions: reimplement count(column) statelessly cql3: functions: reimplement avg() statelessly cql3: functions: reimplement sum() statelessly cql3: functions: change wide accumulator type to varint cql3: functions: unreverse types for min/max cql3: functions: rename make_{min,max}_dynamic_function cql3: functions: reimplement min/max statelessly cql3: functions: reimplement count(*) statelessly cql3: functions: simplify creating native functions even more cql3: functions: add helpers for automating marshalling for scalar functions types: fix big_decimal constructor from literal 0 cql3: functions: add helper class for internal scalar functions db: functions: add stateless aggregate functions db, cql3: move scalar_function from cql3/functions to db/functions	2023-03-30 13:58:47 +03:00
Aleksandra Martyniuk	306d44568f	test: extend test_compaction_task.py to test offstrategy compaction	2023-03-30 10:52:27 +02:00
Aleksandra Martyniuk	8afa54d4f6	compaction: create task manager's task for offstrategy keyspace compaction on one shard Implementation of task_manager's task that covers local offstrategy keyspace compaction.	2023-03-30 10:49:09 +02:00
Aleksandra Martyniuk	73860b7c9d	compaction: create task manager's task for offstrategy keyspace compaction Implementation of task_manager's task covering offstrategy keyspace compaction that can be started through storage_service api.	2023-03-30 10:44:56 +02:00
Aleksandra Martyniuk	e8ef8a51d5	compaction: create offstrategy_compaction_task_impl offstrategy_compaction_task_impl serves as a base class of all concrete offstrategy compaction task classes.	2023-03-30 10:28:17 +02:00
Nadav Har'El	32fff17e19	Merge 'tool/scylla-sstable: more flexibility in obtaining the schema' from Botond Dénes `scylla-sstable` currently has two ways to obtain the schema: * via a `schema.cql` file. * load schema definition from memory (only works for system tables). This meant that for most cases it was necessary to export the schema into a `CQL` format and write it to a file. This is very flexible. The sstable can be inspected anywhere, it doesn't have to be on the same host where it originates from. Yet in many cases the sstable is inspected on the same host where it originates from. In these cases, the schema is readily available in the schema tables on disk and it is plain annoying to have to export it into a file, just to quickly inspect an sstable file. This series solves this annoyance by providing a mechanism to load schemas from the on-disk schema tables. Furthermore, an auto-detect mechanism is provided to detect the location of these schema tables based on the path of the sstable, but if that fails, the tool checks the usual locations of the scylla data dir, the scylla configuration file and even looks for environment variables that tell the location of these. The old methods are still supported. In fact, if a `schema.cql` is present in the working directory of the tool, it is preferred over any other method, allowing for an easy force-override. If the auto-detection magic fails, an error is printed to the console, advising the user to turn on debug level logging to see what went wrong. A comprehensive test is added which checks all the different schema loading mechanisms. The documentation is also updated to reflect the changes. This change breaks the backward-compatibility of the command-line API of the tool, as `--system-schema` is now just a flag, the keyspace and table names are supplied separately via the new `--keyspace` and `--table` options. 
I don't think this will break anybody's workflow as this tool is still lightly used, exactly because of the annoying way the schema has to be provided. Hopefully after this series, this will change. Example: ``` $ ./build/dev/scylla sstable dump-data /var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine/me-1-big-Data.db {"sstables":{"/var/lib/scylla/data/ks/tbl2-d55ba230b9a811ed9ae8495671e9e4f8/quarantine//me-1-big-Data.db":[{"key":{"token":"-3485513579396041028","raw":"000400000000","value":"0"},"clustering_elements":[{"type":"clustering-row","key":{"raw":"","value":""},"marker":{"timestamp":1677837047297728},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1677837047297728,"value":"0"}}}]}]}} ``` As seen above, subdirectories like `quarantine`, `staging` etc. are also supported. Fixes: https://github.com/scylladb/scylladb/issues/10126 Closes #13075 * github.com:scylladb/scylladb: docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section test/cql-pytest: test_tools.py: add test for schema loading test/cql-pytest: nodetool.py: add flush_keyspace() tools/scylla-sstable: reform schema loading mechanism tools/schema_loader: add load_schema_from_schema_tables() db/schema_tables: expose types schema	2023-03-30 09:35:59 +03:00
Pavel Emelyanov	886a1392a8	sstables: Move writer flush into close (and remove it) Writing into sstable component output stream should be done with care. In particular -- flushing can happen only once right before closing the stream. Flushing the stream in between several writes is not going to work, because file stream would step on unaligned IO and S3 upload stream would send completion message to the server and would lose any subsequent write. Having said that, it's better to remove the flush() ability from the component writer not to tempt the developers. refs: #13320 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-30 09:34:04 +03:00
Pavel Emelyanov	77169e2647	sstables: Relax exception handling in do_write_simple This effectively reverts `000514e7cc` (sstable: close file_writer if an exception in thrown) because it became obsoleted by `60873d2360` (sstable: file_writer: auto-close in destructor). The change is in fact idempotent. Before the patch writer was closed regardless of write/flush failing or not. After the patch writer will close itself in destructor for sure. Before the patch an exception from write/flush was caught, then close was called and regardless of close failed or not the former exception was re-thrown. After the patch an exception from write/flush will result in writer destruction that would ignore close exception (if any). Before the patch throwing close after successful write/flush re-threw the close exception. After the patch writer will be closed "by hand" and any exception will be reported. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-30 09:32:56 +03:00
Botond Dénes	164afe14ad	Merge 'compound_compat: replace operator<<(..) with fmt formatter ' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `composite` and `composite_view` without using ostream<<. also, this change removes `operator<<(ostream, const composite&)` , `operator<<(ostream, const composite_view&)` along with their callers. Refs #13245 Closes #13360 * github.com:scylladb/scylladb: compound_compat: remove operator<<(ostream, composite) compound_compat: remove operator<<(ostream, composite_view) sstables: do not use operator<< to print composite_view compound_compat.hh: specialize fmt::formatter<composite> compound_compat.hh: specialize fmt::formatter<composite_view> compound_compat.hh: specialize fmt::formatter<component_view>	2023-03-30 08:47:17 +03:00
Botond Dénes	972b24a969	Merge 'Break the proxy -> database -> [views] -> proxy loop' from Pavel Emelyanov ... and drop usage of global storage proxy from several places of mutate_MV(). This is the last dependency loop around storage proxy left as well as the last user of the global storage proxy. The trouble is that while proxy naturally depends on database, the database SUDDENLY requires proxy to push view updates from the guts of database::do_apply(). Similar loop existed in a form of database -> { large_data_handler, compaction manager } -> system keyspace -> database and it was cut in `917fdb9e53` (Cut database-system_keyspace circular dependency) by introducing a soft dependency link from l. d. handler / compaction manager to system keyspace. The similar solution is proposed here. The database instance gets a soft dependency (shared_ptr) to view_update_generator instance. On start the link is nullptr and pushing view updates is not possible until view_update_generator starts and plugs itself into the database. The plugging happens naturally, because v.u.generator needs proxy as explicit dependency and, thus, can reach database via proxy. This (seems to) work because tables that need view updates don't start being mutated until late enough, as late as v.u.generator starts. As a nice side effect this allows removing a bunch of global storage proxy usages from mutate_MV() which opens a pretty short way towards de-globalizing proxy (after it only qctx, tracing and schema registry will be left). 
Closes #13367 * github.com:scylladb/scylladb: view: Drop global storage_proxy usage from mutate_MV() view: Make mutate_MV() method of view_update_generator table: Carry v.u.generator down to populate_views() table: Carry v.u.generator down to do_push_view_replica_updates() view: Keep v.u.generator shared pointer on view_builder::consumer view: Capture v.u.generator on view_updating_consumer lambda view: Plug view update generator to database view: Add view_builder -> view_update_generator dependency view: Add view_update_generator -> sharded<storage_proxy> dependency	2023-03-30 08:29:29 +03:00
Takuya ASADA	160c184d0b	scylla_kernel_check: suppress verbose iotune messages Stop printing verbose iotune messages during the check; just print the error message. Fixes #13373. Closes #13362	2023-03-30 07:30:07 +03:00
Pavel Emelyanov	9a66174a94	Merge 'config: make query timeouts live update-able' from Kefu Chai in this change, the following query timeout config options are marked live update-able: - range_request_timeout_in_ms - read_request_timeout_in_ms - counter_write_request_timeout_in_ms - cas_contention_timeout_in_ms - truncate_request_timeout_in_ms - write_request_timeout_in_ms - request_timeout_in_ms as per https://github.com/scylladb/scylladb/issues/10172, > Many users would like to set the driver timers based on server timers. > For example: expire a read timeout before or after the server read time > out. with this change, we are able to set the timeouts on the fly. these timeout options specify how long coordinator waits for the completion of different kinds of operations. but these options are cached by the servers consuming them, so in this series, helpers are added to update the cached values when the options get modified. also, since the observers are not copyable, sharded_parameter is used to initialize the config when creating these sharded servers. Fixes #12232 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12531 * github.com:scylladb/scylladb: timeout_config: remove unused make_timeout_config() client_state: split the param list of ctor into multi lines redis,thrift,transport: make timeout_config live-updateable config: mark query timeouts live update-able transport: mark cql_server::timeout_config() const auth: remove unused forward declaration redis: drop unused member function transport: drop unused member function thrift: keep a reference of timeout_config in handler_factory redis,thrift,transport: initialize _config with std::move(config) redis,thrift,transport: pass config via sharded_parameter utils: config_file: add a space after `=`	2023-03-29 19:38:26 +03:00
Kefu Chai	4670ba90e5	scripts: remove git-archive-all since we don't build the rpm/deb packages from source tarball anymore, instead we build the rpm/deb packages from precompiled relocatable package. there is no need to keep git-archive-all in the repo. in this change, the git-archive-all script and its license file are removed. they were added for building rpm packages from source tarball in `f87add31a7`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13372	2023-03-29 18:59:23 +03:00
Avi Kivity	472b155d76	Merge 'Allow each compaction group to have its own compaction strategy state' from Raphael "Raph" Carvalho This is important for multiple compaction groups, as they cannot share state that must span a single SSTable set. The solution is about: 1) Decoupling compaction strategy from its state; making compaction_strategy a pure stateless entity 2) Each compaction group storing its own compaction strategy state 3) Compaction group feeds its state into compaction strategy whenever needed Closes #13351 * github.com:scylladb/scylladb: compaction: TWCS: wire up compaction_strategy_state compaction: LCS: wire up compaction_strategy_state compaction: Expose compaction_strategy_state through table_state replica: Add compaction_strategy_state to compaction group compaction: Introduce compaction_strategy_state compaction: add table_state param to compaction_strategy::notify_completion() compaction: LCS: extract state into a separate struct compaction: TWCS: prepare for stateless strategy compaction: TWCS: extract state into a separate struct compaction: add const-qualifier to a few compaction_strategy methods	2023-03-29 18:57:11 +03:00
Pavel Emelyanov	cc262d814b	view: Drop global storage_proxy usage from mutate_MV() Now the mutate_MV is the method of v.u.generator which has reference to the sharded<storage_proxy>. Few helper static wrappers are patched to get the needed proxy or database reference from the mutate_MV call. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 18:48:14 +03:00
Pavel Emelyanov	7cabdc54a6	view: Make mutate_MV() method of view_update_generator Nowadays it's a static helper, but internally it depends on storage proxy, so it grabs its global instance. Making it a method of view update generator makes it possible to use the proxy dependency from the generator. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 18:48:14 +03:00
Pavel Emelyanov	e78e64a920	table: Carry v.u.generator down to populate_views() The method is called by view_builder::consumer when building a view and the consumer already has stable dependency reference on the view updates generator. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 18:48:13 +03:00
Botond Dénes	bae62f899d	mutation/mutation_compactor: consume_partition_end(): reset _stop The purpose of `_stop` is to remember whether the consumption of the last partition was interrupted or it was consumed fully. In the former case, the compactor allows retrieving the compaction state for the given partition, so that its compaction can be resumed at a later point in time. Currently, `_stop` is set to `stop_iteration::yes` whenever the return value of any of the `consume()` methods is also `stop_iteration::yes`. Meaning, if the consuming of the partition is interrupted, this is remembered in `_stop`. However, a partition whose consumption was interrupted is not always continued later. Sometimes consumption of a partition is interrupted because the partition is not interesting and the downstream consumer wants to stop it. In these cases the compactor should not return an engaged optional from `detach_state()`, because there is no state to detach, the state should be thrown away. This was incorrectly handled so far and is fixed in this patch, by overwriting `_stop` in `consume_partition_end()` with whatever the downstream consumer returns. Meaning if they want to skip the partition, then `_stop` is reset to `stop_iteration::no` and `detach_state()` will return a disengaged optional as it should in this case. Fixes: #12629 Closes #13365	2023-03-29 17:48:45 +03:00
Aleksandra Martyniuk	0ceee3e4b3	compaction: use compaction namespace in compaction_manager.cc	2023-03-29 15:28:14 +02:00
Takuya ASADA	497dd7380f	create-relocatable-package.py: stop using filter function on tools We introduced exclude_submodules at `19da4a5b8f` to exclude tools/java and tools/jmx since they have their own relocatable packages, so we don't want to package same files twice. However, most of the files under tools/ are not needed for installation, we just need tools/scyllatop. So what we really need to do is "ar.reloc_add('tools/scyllatop')", not excluding files from tools/. related with #13183 Closes #13215	2023-03-29 16:23:43 +03:00
Aleksandra Martyniuk	d7d570e39d	compaction: rename compaction::task To avoid confusion with task manager tasks compaction::task is renamed to compaction::compaction_task_executor. All inheriting classes are modified similarly.	2023-03-29 15:23:18 +02:00
Aleksandra Martyniuk	f24391fbe4	compaction: move compaction_manager::task out of compaction manager compaction_manager::task needs to be accessed from task manager compaction tasks. Thus, compaction_manager::task and all inheriting classes are moved from compaction manager to compaction namespace.	2023-03-29 15:21:24 +02:00
Wojciech Mitros	cfd2a4588d	wasm: move wasm initialization to query_processor constructor By moving the initialization to the constructor, we can now be certain that all wasm-related objects (wasm instance cache, compilation thread runner, and wasm engine, which was already passed in the constructor) are initialized when we try to use them because we have to use the query processor to access them anyway. The change is also motivated by the fact that we're planning to take Wasm UDFs out of experimental, after which they should stop getting special treatment.	2023-03-29 14:55:36 +02:00
Aleksandra Martyniuk	37cafec9d5	compaction: move sstable_task definition to source file	2023-03-29 14:53:43 +02:00
Botond Dénes	72772d5072	Merge 'auth: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `authenticated_user` without using ostream<<. also, this change removes all existing callers of `operator<<(ostream, const authenticated_user&)`. Refs #13245 Closes #13359 * github.com:scylladb/scylladb: auth: drop operator<<(ostream, authenticated_user) cql3: do not use operator<< to print authenticated_user auth: specialize fmt::formatter<authenticated_user>	2023-03-29 15:24:07 +03:00
Kefu Chai	0b7c345bec	timeout_config: remove unused make_timeout_config() it is replaced by the ctor of updateable_timeout_config, so it does not have any callers now. let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:17:45 +08:00
Kefu Chai	98b9cbbc92	client_state: split the param list of ctor into multi lines it is 215-chars long, so let's break it into multiple lines for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:17:45 +08:00
Kefu Chai	ebf5e138e8	redis,thrift,transport: make timeout_config live-updateable * timeout_config - add `updated_timeout_config` which represents always-updated options backed by `utils::updateable_value<>`. this class is used by servers which need to access the latest timeout related options. the existing `timeout_config` is more like a snapshot of the `updated_timeout_config`. it is used in the use case where we don't need the most updated options or we update the options manually on demand. * redis, thrift, transport: s/timeout_config/updated_timeout_config/ when appropriate. use the improved version of timeout_config where we need to have the access to the most-updated version of the timeout options. Fixes #10172 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:17:45 +08:00
Kefu Chai	11cea36c12	docs: dev: write mathematical expressions in LaTeX for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13341	2023-03-29 15:07:14 +03:00
Kefu Chai	f789d8d3cd	config: mark query timeouts live update-able in this change, the following query timeout config options are marked live update-able: - range_request_timeout_in_ms - read_request_timeout_in_ms - counter_write_request_timeout_in_ms - cas_contention_timeout_in_ms - truncate_request_timeout_in_ms - write_request_timeout_in_ms - request_timeout_in_ms as per https://github.com/scylladb/scylladb/issues/10172, > Many users would like to set the driver timers based on server timers. > For example: expire a read timeout before or after the server read time > out. with this change, these options are marked live-updateable, but since they are cached by their consumers locally, we will have another commit to update the local copies when these options get updated. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	1cc28679bc	transport: mark cql_server::timeout_config() const this function returns a const reference to member variable, so we can mark it with the `const` specifier for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	ca83dc0101	auth: remove unused forward declaration `timeout_config` is not used by auth/common.hh. presumably, this class is not a public interface exposed by auth, as it is not inherently related to auth. timeout_config is a shared setting across related services, specifically, redis_server, thrift and cql_server. so, in this change, let's drop this forward declaration. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	9a159445f0	redis: drop unused member function now that `redis_server::connection::timeout_config()` and `redis_server::timeout_config()` are used nowhere, let's drop them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	d72ab78ffd	transport: drop unused member function since `cql_server::connection::timeout_config()` is used nowhere, let's just drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	fec35b97ad	thrift: keep a reference of timeout_config in handler_factory this change should keep the timeout settings of handler_factory sync'ed with the ones used by `thrift_server`. so far, the `timeout_config` instance in `thrift_server` is not live-updateable, but in a follow-up change, we will make it so. so, this change prepares the handler_factory for a live-updateable timeout_config. instead of keeping a snapshot of the timeout_config, keep a reference of it in handler_factory. the reference points to `thrift_server::_config`. so despite that `thrift_server::_handler_factory` is a shared_ptr, the member variable won't outlive its container, as the only reason to have it as a shared_ptr is to appease the ctor of `CassandraAsyncProcessorFactory`. and the constructed `_processor_factory` is also a member variable of `thrift_server`, so we won't take the risk of a dangling reference held by `handler_factory`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	c642ca9e73	redis,thrift,transport: initialize _config with std::move(config) instead of copying the `config` parameter, move away from it. this change also prepares for a non-copyable config. if the class of `config` is not copyable, we will not be able to initialize the member variable by copying from the given `config` parameter. after the live-updateable config change, the `_config` member variable will contain instances of utils::observer<>, which is not copyable, but is move-constructible, hence in this change, we just move away from the given `config`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:02 +08:00
Kefu Chai	e0ac2eb770	redis,thrift,transport: pass config via sharded_parameter * pass config via sharded_parameter * initialize config using designated initializer this change paves the road to servers with live-updateable timeout options. before this change, the servers initialize a domain specific combo config, like `redis_server_config`, with the same instance of a timeout_config, and pass the combo config as a ctor parameter to construct each sharded service instance. but this design assumes the value semantics of the config class, that is, it should be copyable. but if we want to use utils::updateable_value<> to get updated option values, we would have to postpone the instantiation of the config until the sharded service is about to be initialized. so, in this change, instead of taking a domain specific config created beforehand, all services constructed with a `timeout_config` will take a `sharded_parameter()` for creating the config. also, take this opportunity to initialize the config using designated initializers, for two reasons: * less repetition this way. we don't have to repeat the variable name of the config being initialized for each member variable. * prepare for some member variables which do not have a default constructor. this applies to the timeout_config's updater which will not have a default constructor, as it should be initialized by db::config and a reference to the timeout_config to be updated. we will update the `timeout_config` side in a follow-up commit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 20:06:00 +08:00
Kefu Chai	99bf8bc0f4	bytes, gms: s/format_to/fmt::format_to/ to disambiguate `fmt::format_to()` from `std::format_to()`. turns out, we have `using namespace std` somewhere in the source tree, and with libstdc++ shipped by GCC-13, we have `std::format_to()`, so without exactly which one to use, compiler complains like ``` /optimized_clang/stage-1-X86/build/bin/clang++ -MD -MT build/dev/mutation/mutation.o -MF build/dev/mutation/mutation.o.d -I/optimized_clang/scylla-X86/seastar/include -I/optimized_clang/scylla-X86/build/dev/seastar/gen/include -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Werror=unused-result -fstack-clash-protection -DSEASTAR_API_LEVEL=6 -DSEASTAR_BUILD_SHARED_LIBS -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_TYPE_ERASE_MORE -DFMT_SHARED -I/usr/include/p11-kit-1 -ffile-prefix-map=/optimized_clang/scylla-X86=. -march=westmere -DDEVEL -DSEASTAR_ENABLE_ALLOC_FAILURE_INJECTION -DSCYLLA_ENABLE_ERROR_INJECTION -O2 -DSCYLLA_BUILD_MODE=dev -iquote. -iquote build/dev/gen --std=gnu++20 -ffile-prefix-map=/optimized_clang/scylla-X86=. 
-march=westmere -DBOOST_TEST_DYN_LINK -DNOMINMAX -DNOMINMAX -fvisibility=hidden -Wall -Werror -Wno-mismatched-tags -Wno-tautological-compare -Wno-parentheses-equality -Wno-c++11-narrowing -Wno-missing-braces -Wno-ignored-attributes -Wno-overloaded-virtual -Wno-unused-command-line-argument -Wno-unsupported-friend -Wno-delete-non-abstract-non-virtual-dtor -Wno-braced-scalar-init -Wno-implicit-int-float-conversion -Wno-delete-abstract-non-virtual-dtor -Wno-psabi -Wno-narrowing -Wno-nonnull -Wno-uninitialized -Wno-error=deprecated-declarations -DXXH_PRIVATE_API -DSEASTAR_TESTING_MAIN -DFMT_DEPRECATED_OSTREAM -c -o build/dev/mutation/mutation.o mutation/mutation.cc In file included from mutation/mutation.cc:9: In file included from mutation/mutation.hh:13: In file included from mutation/mutation_partition.hh:21: In file included from ./schema/schema_fwd.hh:13: In file included from ./utils/UUID.hh:22: ./bytes.hh:116:21: error: call to 'format_to' is ambiguous format_to(out, "{}{:02x}", _delimiter, std::byte(v[i])); ^~~~~~~~~ ./bytes.hh:134:43: note: in instantiation of function template specialization 'fmt::formatter<fmt_hex>::format<fmt::basic_format_context<fmt::appender, char>>' requested here return fmt::formatter<::fmt_hex>::format(::fmt_hex(bytes_view(s)), ctx); ^ /usr/include/fmt/core.h:813:64: note: in instantiation of function template specialization 'fmt::formatter<seastar::basic_sstring<signed char, unsigned int, 31, false>>::format<fmt::basic_format_context<fmt::appender, char>>' requested here -> decltype(typename Context::template formatter_type<T>().format( ^ /usr/include/fmt/core.h:824:10: note: while substituting deduced template arguments into function template 'has_const_formatter_impl' [with Context = fmt::basic_format_context<fmt::appender, char>, T = seastar::basic_sstring<signed char, unsigned int, 31, false>] return has_const_formatter_impl<Context>(static_cast<T*>(nullptr)); ``` to address this FTBFS, let's be more explicit by adding "fmt::" to 
specify which `format_to()` to use. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13361	2023-03-29 14:47:28 +03:00
Kefu Chai	ea2badb25f	utils: config_file: add a space after `=` for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 19:22:21 +08:00
Pavel Emelyanov	a95d3446fd	table: Carry v.u.generator down to do_push_view_replica_updates() The latter is the place where mutate_MV is called and it needs the view updates generator nearby. The call-stack starts at database::do_apply(). As was described in one of the previous patches, applying mutations that need to update views happens late enough, so if the view updates generator is not plugged into the database yet, it's OK to bail out with an exception. If it's plugged, it's carried over thus keeping the generator instance alive and waited for on its stop. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:12:01 +03:00
Pavel Emelyanov	ddc8c8b019	view: Keep v.u.generator shared pointer on view_builder::consumer This is another mutations consumer that pushes view updates forward and thus also needs the view updates generator pointer. It gets one from the view builder that already has the dependency on generator. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:11:30 +03:00
Pavel Emelyanov	2652dffd89	view: Capture v.u.generator on view_updating_consumer lambda The consumer is in fact pushing the updates and _that_'s the component that would really need the view_update_generator at hand. The consumer is created from the generator itself so no troubles getting the pointer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:10:55 +03:00
Pavel Emelyanov	d5557ef0e2	view: Plug view update generator to database The database is a low-level service and currently the view update generator implicitly depends on it via storage proxy. However, database does need to push view updates with the help of mutate_MV helper, thus adding a dependency loop. This patch exploits the fact that view updates start being pushed late enough: by that time all other services, including proxy and view update generator, seem to be up and running. This allows a "weak dependency" from database to view update generator, like there's one from database to system keyspace already. So in this patch the v.u.g. puts the shared-from-this pointer onto the database at the time it starts. On stop it removes this pointer after database is drained and (hopefully) all view updates are pushed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:09:49 +03:00
Pavel Emelyanov	3455b1aed8	view: Add view_builder -> view_update_generator dependency The builder will need generator for view_builder::consumer in one of the next patches. The builder is a standalone service that is one of the last to start, and no other services need the builder as a dependency. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:08:47 +03:00
Pavel Emelyanov	3fd12d6a0e	view: Add view_update_generator -> sharded<storage_proxy> dependency The generator will be responsible for spreading view updates with the help of mutate_MV helper. The latter needs storage proxy to operate, so the generator gets this dependency in advance. There's no need to change start/stop order at the moment, generator already starts after and stops before proxy. Also, services that have generator as dependency are not required by proxy (even indirectly) so no circular dependency is produced at this point. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:08:47 +03:00
Kefu Chai	c307c60d04	scripts: correct a typo in comment s/refreh/refresh/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13357	2023-03-29 13:44:47 +03:00
Kefu Chai	55a8b50bbd	release: correct a typo in comment s/to levels of indirection/two levels of indirection/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13358	2023-03-29 13:42:38 +03:00
Kefu Chai	dfb55975fc	Update tools/jmx submodule this helps to use OpenJDK 11 instead of OpenJDK 8 for running scylla-jmx, in hope to alleviate the pain of the crashes found in the JRE shipped along with OpenJDK 8, as it is aged, and only security fixes are included now. * tools/jmx 88d9bdc...48e1699 (3): > Merge 'dist/redhat: support jre 11 instead of jre 8' from Kefu Chai > install.sh: point java to /usr/bin/java > Merge 'use OpenJDK 11 instead of OpenJDK 8' from Kefu Chai Refs https://github.com/scylladb/scylla-jmx/issues/194 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13356	2023-03-29 13:00:40 +03:00
Kefu Chai	57f51603dc	compound_compat: remove operator<<(ostream, composite) since we don't have any callers of this operator, let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:13:59 +08:00
Kefu Chai	212641abda	compound_compat: remove operator<<(ostream, composite_view) since we don't have any callers of this operator, let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:13:59 +08:00
Kefu Chai	cdb972222e	sstables: do not use operator<< to print composite_view this change removes the last two callers of `operator<<(ostream&, const composite_view&)`, it paves the road to remove this operator. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:13:59 +08:00
Kefu Chai	1ef8f63b4e	compound_compat.hh: specialize fmt::formatter<composite> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `composite` with the help of fmt::ostream. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:13:59 +08:00
Kefu Chai	28cabd0a1f	compound_compat.hh: specialize fmt::formatter<composite_view> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `composite::composite_view` with the help of fmt::ostream. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:13:59 +08:00
Kefu Chai	15eac8c4cd	compound_compat.hh: specialize fmt::formatter<component_view> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `composite::component_view` with the help of fmt::ostream. in this change, '#' is used to add the 0x prefix, as fmtlib allows us to add a '0x' prefix using the '#' format specifier when printing numbers with 'x' as the type specifier. see https://fmt.dev/latest/syntax.html Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:13:10 +08:00
Kefu Chai	5a9b4c02e3	auth: drop operator<<(ostream, authenticated_user) since we don't have any callers of this operator, let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:02:29 +08:00
Kefu Chai	85c89debe6	cql3: do not use operator<< to print authenticated_user this change removes the last two callers of `operator<<(ostream&, const authenticated_user&)`, it paves the road to remove this operator. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:02:29 +08:00
Kefu Chai	a7037ae0f4	auth: specialize fmt::formatter<authenticated_user> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `auth::authenticated_user` with the help of fmt::ostream. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-29 16:02:29 +08:00
David Garcia	f45c4983db	docs: update theme 1.4 Closes #13346	2023-03-29 06:56:27 +03:00
Avi Kivity	6977df5539	cql3/selection, forward_service: use stateless_aggregate_function directly Now that stateless_aggregate_function is directly exposed by aggregate_function, we can use it directly, avoiding the intermediary aggregate_function::aggregate, which is removed.	2023-03-28 23:49:34 +03:00
Avi Kivity	58eb21aa5d	db: functions: fold stateless_aggregate_function_adapter into aggregate_function Now that all aggregate functions are derived from stateless_aggregate_function_adapter, we can just fold its functionality into the base class. This exposes stateless_aggregate_function to all users of aggregate_function, so they can begin to benefit from the transformation, though this patch doesn't touch those users. The aggregate_function base class is partially devirtualized since there is just a single implementation now.	2023-03-28 23:47:11 +03:00
Avi Kivity	68529896aa	cql3: functions: simplify accumulator_for template The accumulator_for template is used to select the accumulator type for aggregates. After refactoring, all that is needed from it is to select the native type, so remove all the excess code.	2023-03-28 23:47:11 +03:00
Avi Kivity	4ea3136026	cql3: functions: base user-defined aggregates on stateless aggregates Since the model for stateless aggregates was taken from user defined aggregates, the conversion is trivial.	2023-03-28 23:47:11 +03:00
Avi Kivity	f2715b289a	cql3: functions: drop native_aggregate_function Now that all aggregates are implemented statelessly, native_aggregate_function no longer has subclasses, so drop it.	2023-03-28 23:47:11 +03:00
Avi Kivity	6bceb25982	cql3: functions: reimplement count(column) statelessly Note that we don't use the automarshalling helper for the aggregation function, since it doesn't work for compound types.	2023-03-28 23:47:11 +03:00
Avi Kivity	4f2cdace9a	cql3: functions: reimplement avg() statelessly	2023-03-28 23:47:11 +03:00
Avi Kivity	b0a8fd3287	cql3: functions: reimplement sum() statelessly	2023-03-28 23:47:11 +03:00
Avi Kivity	d21d11466a	cql3: functions: change wide accumulator type to varint Currently, we use __int128, but this has no direct counterpart in CQL, so we can't express the accumulator type as part of a CQL scalar function. Switch to varint which is a superset, although slower.	2023-03-28 23:47:11 +03:00
Avi Kivity	3252dc0172	cql3: functions: unreverse types for min/max Currently it works without this, but later unreversing will be removed from another part of the stack, causing min/max on reversed types to return incorrect results. Anticipate that and unreverse the types during construction.	2023-03-28 23:47:09 +03:00
Avi Kivity	ed466b7e68	cql3: functions: rename make_{min,max}_dynamic_function There's no longer a statically-typed variant, so no need to distinguish the dynamically-typed one.	2023-03-28 23:37:49 +03:00
Wojciech Mitros	c9b701b516	wasm: return wasm instance cache as a reference instead of a pointer In an incoming change, the wasm instance cache will be modified to be owned by the query_processor - it will hold an optional instead of a raw pointer to the cache, so we should stop returning the raw pointer from the getter as well. Consequently, the cache is also stored as a reference in wasm::cache, as it gets the reference from the query_processor. For consistency with the wasm engine and the wasm alien thread runner, the name of the getter is also modified to follow the same pattern.	2023-03-28 18:18:48 +02:00
Wojciech Mitros	60c99b4c47	wasm: move wasm engine to query_processor The wasm engine is used for compiling and executing Wasm UDFs, so the query_processor is a more appropriate location for it than replica::database, especially because the wasm instance cache and the wasm alien thread runner are already there. This patch also reduces the number of wasm engines to 1, shared by all shards, as recommended by the wasmtime developers.	2023-03-28 17:41:30 +02:00
Calle Wilund	6525209983	alternator/rest api tests: Remove name assumption and rely on actual scylla info Fixes #13332 The tests use the discriminator "system" as prefix to assume keyspaces are marked "internal" inside scylla. This is not true in enterprise universe (replicated key provider). It maybe/probably should, but that train is sailing right now. Fix by removing one assert (not correct) and use actual API info in the alternator test. Closes #13333	2023-03-28 15:41:23 +03:00
Raphael S. Carvalho	989afbf83b	compaction: TWCS: wire up compaction_strategy_state TWCS no longer keeps internal state, and will now rely on state managed by each compaction group through compaction::table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-28 08:48:15 -03:00
Raphael S. Carvalho	233fe6d3dc	compaction: LCS: wire up compaction_strategy_state LCS no longer keeps internal state, and will now rely on state managed by each compaction group through compaction::table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-28 08:48:15 -03:00
Raphael S. Carvalho	2186a75e9b	compaction: Expose compaction_strategy_state through table_state That will allow compaction_strategy to access the compaction group state through compaction::table_state, which is the interface at which replica talks to the compaction layer. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-28 08:48:10 -03:00
Botond Dénes	b6c022a142	Merge 'cmake: sync with `configure.py` (15/n)' from Kefu Chai this is the 15th changeset of a series which tries to give an overhaul to the CMake build system. this series has two goals: - to enable developers to use CMake for building scylla, so they can use tools (CLion for instance) with CMake integration for better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. this changeset includes the following changes: - build: cmake: add two missing tests - build: cmake: port more cxxflags from configure.py Closes #13262 * github.com:scylladb/scylladb: build: cmake: add missing source files to idl and service build: cmake: port more cxxflags from configure.py build: cmake: add two missing tests	2023-03-28 09:16:38 +03:00
Botond Dénes	88c5b2618c	Merge 'Get rid of global variable "load_prio_keyspaces" (step 1)' from Calle Wilund The concept is needed by enterprise functionality, but in the hunt for globals this sticks out and should be removed. This is also partially prompted by the need to handle the keyspaces in the above set specially on shutdown as well as startup. I.e. we need to ensure all user keyspaces are flushed/closed earlier than these. I.e. treat as "system" keyspace for this purpose. These changes add an "extension internal" keyspace set instead, which for now (until enterprise branches are updated) also includes the "load_prio" set. However, it changes distributed loader to use the extension API interface instead, as well as adds shutdown special treatment to replica::database. Closes #13335 * github.com:scylladb/scylladb: datasbase: Flush/close "extension internal" keyspaces after other user ks distributed_loader: Use extensions set of "extension internal" keyspaces db::extentions: Add "extensions internal" keyspace set	2023-03-28 08:35:10 +03:00
Kefu Chai	fcee7f7ac9	reloc: silence warning from readelf we've been seeing errors like ``` 10:39:36 gdb-add-index: [Was there no debuginfo? Was there already an index?] 10:39:36 readelf: /jenkins/workspace/scylla-master/next/scylla/build/dist/debug/redhat/BUILDROOT/scylla-5.3.0~dev-0.20230321.0f97d464d32b.x86_64/usr/lib/debug/opt/scylladb/libreloc/libc.so.6-5.3.0~dev-0.20230321.0f97d464d32b.x86_64.debug: Error: Unable to find program interpreter name ``` when strip.sh is processing *.debug elf images. this is caused by a known issue, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1012107 . and this error is not fatal. but it is very distracting when we are trying to find errors in jenkins logging messages. so, in this change, the stderr output from readelf is muted for higher signal-noise ratio in the build logging message. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13267	2023-03-28 08:29:37 +03:00
Anna Stuchlik	4435b8b6f1	doc: elaborate on Scylla admin REST API - V2 This is V2 of https://github.com/scylladb/scylladb/pull/11849 This commit adds more information about ScyllaDB's REST API, including an example for Docker and a screenshot of the Swagger UI. Co-authored-by: Tzach Livyatan <tzach.livyatan@gmail.com> Closes #13331	2023-03-28 08:27:09 +03:00
Botond Dénes	9a024f72c4	Merge 'thrift: return address in listen_addresses() only after server is ready' from Marcin Maliszkiewicz This is used for readiness API: /storage_service/rpc_server and the fix prevents from returning 'true' prematurely. Some improvement for readiness was added in `a51529dd15` but thrift implementation wasn't fully done. Fixes https://github.com/scylladb/scylladb/issues/12376 Closes #13319 * github.com:scylladb/scylladb: thrift: return address in listen_addresses() only after server is ready thrift: simplify do_start_server() with seastar:async	2023-03-28 08:26:16 +03:00
Botond Dénes	60240e6d91	Merge 'bytes, gms: replace operator<<(..) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `bytes` and `gms::inet_address` without using ostream<<. also, this change removes all existing callers of `operator<<(ostream, const bytes &)` and `operator<<(ostream, const gms::inet_address&)`. `gms::inet_address` related changes are included here in hope to demonstrate the usage of delimiter specifier of `fmt_hex` 's formatter. Refs #13245 Closes #13275 * github.com:scylladb/scylladb: gms/inet_address: implement operator<< using fmt::formatter treewide: use fmtlib to format gms::inet_address gms/inet_address: specialize fmt::formatter<gms::inet_address> bytes: implement formatting helpers using formatter bytes: specialize fmt::formatter<bytes> bytes: specialize fmt::formatter<fmt_hex> bytes: mark fmt_hex::v `const`	2023-03-28 08:25:41 +03:00
Botond Dénes	b22f8c6d13	Merge 'Adjust repair module to other task manager modules' conventions' from Aleksandra Martyniuk Files with task manager repair module and related classes are modified to be consistent with task manager compaction module. Closes #13231 * github.com:scylladb/scylladb: repair: rename repair_module repair: add repair namespace to repair/task_manager_module.hh repair: rename repair_task.hh	2023-03-28 08:24:42 +03:00
Raphael S. Carvalho	ee89ff24f2	replica: Add compaction_strategy_state to compaction group The state is not wired anywhere yet. It will replace the ones stored in compaction strategies themselves, thereby allowing each compaction group to have its own state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 15:46:14 -03:00
Raphael S. Carvalho	25f73a4181	compaction: Introduce compaction_strategy_state Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 15:46:11 -03:00
Raphael S. Carvalho	1ffe2f04ef	compaction: add table_state param to compaction_strategy::notify_completion() once compaction_strategy is made stateless, the state must be retrieved in notify_completion() through table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:40:02 -03:00
Raphael S. Carvalho	2ffaae97a4	compaction: LCS: extract state into a separate struct Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:40:02 -03:00
Raphael S. Carvalho	e2f38baa92	compaction: TWCS: prepare for stateless strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:40:01 -03:00
Raphael S. Carvalho	017f432b8f	compaction: TWCS: extract state into a separate struct This is a step towards decoupling compaction strategy (computation) and its state. Making the former stateless. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:38:47 -03:00
Calle Wilund	7af7c379a5	datasbase: Flush/close "extension internal" keyspaces after other user ks Refs #13334 Effectively treats keyspaces listed in "extension internal" as system keyspaces w.r.t. shutdown/drain. This ensures all user keyspaces are fully flushed before we disable these "internal" ones.	2023-03-27 15:15:49 +00:00
Calle Wilund	c3ec6a76c0	distributed_loader: Use extensions set of "extension internal" keyspaces Refs #13334 Working towards removing load_prio_keyspaces. Use the extensions interface to determine which keyspaces to initialize early.	2023-03-27 15:14:13 +00:00
Calle Wilund	7c8c020c0e	db::extentions: Add "extensions internal" keyspace set Refs #13334 To be populated early by extensions. Such a keyspace should be 1.) Started before user keyspaces 2.) Flushed/closed after user keyspaces 3.) For all other regards be considered "user".	2023-03-27 15:12:31 +00:00
Aleksandra Martyniuk	f10b862955	repair: rename repair_module	2023-03-27 16:33:39 +02:00
Aleksandra Martyniuk	8f935481cd	repair: add repair namespace to repair/task_manager_module.hh	2023-03-27 16:32:51 +02:00
Aleksandra Martyniuk	17e0e05f42	repair: rename repair_task.hh	2023-03-27 16:31:51 +02:00
Raphael S. Carvalho	232e71f2cf	compaction: add const-qualifier to a few compaction_strategy methods Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 11:13:10 -03:00
Botond Dénes	c7131a0574	Update tools/cqlsh/ submodule * tools/cqlsh b9a606f...8769c4c (11): > dist: redhat: provide only a single version > pylib/setup, requirement.txt: remove Six > setup: do not support python2 > install.sh: install files with correct permission in struct umask settings > Remove unneed LC_ALL=en_US.UTF-8 > Support using other driver (datastax or older scylla ones) > Fix RPM based downgrade command on scylla-cqlsh > gitignore: ignore pylib/cqlshlib/__pycache__ > dist/redhat: add a proper changelog entry > github actions: enable starting on tags > Add support for building docker image	2023-03-27 16:23:54 +03:00
Kefu Chai	a3cb5db542	gms/inet_address: implement operator<< using fmt::formatter less repetition this way. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 20:06:45 +08:00
Kefu Chai	8dbaef676d	treewide: use fmtlib to format gms::inet_address the goal of this change is to reduce the dependency on `operator<<(ostream&, const gms::inet_address&)`. this is not an exhaustive search-and-replace change: at some call sites we have other dependencies on not-yet-converted ostream printers, so we cannot fix them all. this change only updates some callers of `operator<<(ostream&, const gms::inet_address&)`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 20:06:45 +08:00
Kefu Chai	4ea6e06cac	gms/inet_address: specialize fmt::formatter<gms::inet_address> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `gms::inet_address` with the help of fmt::ostream. please note, the ':' delimiter is specified when printing the IPv6 address. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 20:06:45 +08:00
Kefu Chai	a606606ac4	bytes: implement formatting helpers using formatter some of these helpers print a byte array using `to_hex()`, which materializes a string instance and then drops it on the floor after printing it to the given ostream. this hurts the performance, so `fmt::print()` should be more performant in comparison to the implementations based on `to_hex()`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 20:06:45 +08:00
Kefu Chai	36dc2e3f28	bytes: specialize fmt::formatter<bytes> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print bytes with the help of fmt::ostream. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 20:06:45 +08:00
Kefu Chai	2f9dfba800	bytes: specialize fmt::formatter<fmt_hex> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print bytes_view with the help of fmt::ostream. because fmtlib has its own specialization for fmt::formatter<std::basic_string_view<T>>, we cannot just create a full specialization for std::basic_string_view<int8_t>, otherwise fmtlib would complain that > Mixing character types is disallowed. so we work around this using a delegate of fmt_hex. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 20:06:45 +08:00
Tomasz Grabiec	79ee38181c	Merge 'storage_service: wait for normal state handlers earlier in the boot procedure' from Kamil Braun The `wait_for_normal_state_handled_on_boot` function waits until `handle_state_normal` finishes for the given set of nodes. It was used in `run_bootstrap_ops` and `run_replace_ops` to wait until NORMAL states of existing nodes in the cluster are processed by the joining node before continuing the joining process. One reason to do it is because at the end of `handle_state_normal` the joining node might drop connections to the NORMAL nodes in order to reestablish new connections using correct encryption settings. In tests we observed that the connection drop was happening in the middle of repair/streaming, causing repair/streaming to abort. Unfortunately, calling `wait_for_normal_state_handled_on_boot` in `run_bootstrap_ops`/`run_replace_ops` is too late to fix all problems. Before either of these two functions, we create a new CDC generation and write the data to `system_distributed_everywhere.cdc_generation_descriptions_v2`. In tests, the connections were sometimes dropped while this write was in-flight. This would cause the write to never arrive to other nodes, and the joining node would timeout waiting for confirmations. To fix this, call `wait_for_normal_state_handled_on_boot` earlier in the boot procedure, before `make_new_generation` call which does the write. Fixes: #13302 Closes #13317 * github.com:scylladb/scylladb: storage_service: wait for normal state handlers earlier in the boot procedure storage_service: bootstrap: wait for normal tokens to arrive in all cases storage_service: extract get_nodes_to_sync_with helper storage_service: return unordered_set from get_ignore_dead_nodes_for_replace	2023-03-27 13:56:47 +02:00
Kamil Braun	cd282cf0ab	Merge 'Raft, use schema commit log' from Gusev Petr We need this so that we can have multi-partition mutations which are applied atomically. If they live on different shards, we can't guarantee atomic write to the commitlog. Fixes: #12642 Closes #13134 * github.com:scylladb/scylladb: test_raft_upgrade: add a test for schema commit log feature scylla_cluster.py: add start flag to server_add ServerInfo: drop host_id scylla_cluster.py: add config to server_add scylla_cluster.py: add expected_error to server_start scylla_cluster.py: ScyllaServer.start, refactor error reporting scylla_cluster.py: fix ScyllaServer.start, reset cmd if start failed raft: check if schema commitlog is initialized Refuse to boot if neither the schema commitlog feature nor force_schema_commit_log is set. For the upgrade procedure the user should wait until the schema commitlog feature is enabled before enabling consistent_cluster_management. raft: move raft initialization after init_system_keyspace database: rename before_schema_keyspace_init->maybe_init_schema_commitlog raft: use schema commitlog for raft tables init_system_keyspace: refactoring towards explicit load phases	2023-03-27 13:27:30 +02:00
Marcin Maliszkiewicz	339a8fe64d	thrift: return address in listen_addresses() only after server is ready listen_addresses() checks if _server variable is empty and after this patch we assign (move) the value only after server is ready. This is used for readiness API: /storage_service/rpc_server and the fix prevents from returning 'true' prematurely. Some improvement for readiness was added in `a51529dd15` but thrift implementation wasn't fully done. Fixes #12376	2023-03-27 13:20:53 +02:00
Marcin Maliszkiewicz	a38701b9d4	thrift: simplify do_start_server() with seastar:async Code is executed typically on startup only so overhead is very limited. Notably using async avoids managing tserver variable lifetime.	2023-03-27 13:12:10 +02:00
David Garcia	70ce1b2002	docs: Separate conf.py docs: update github actions docs: fix Makefile tabs Update docs-pr.yaml Update Makefile Closes #13323	2023-03-27 13:42:58 +03:00
Botond Dénes	89e58963ab	Update tools/python3/ submodule * tools/python3 279b6c1...d2f57dd (3): > dist: redhat: provide only a single version > SCYLLA-VERSION-GEN: use -gt when comparing values > SCYLLA-VERSION-GEN: remove unnecessary bashism	2023-03-27 12:00:27 +03:00
Botond Dénes	b5afdf56c3	Merge 'Cleanup keyspace compaction task' from Aleksandra Martyniuk Task manager task implementations of classes that cover cleanup keyspace compaction which can be started through /storage_service/keyspace_compaction/ api. Top level task covers the whole compaction and creates child tasks on each shard. Closes #12712 * github.com:scylladb/scylladb: test: extend test_compaction_task.py to test cleanup compaction compaction: create task manager's task for cleanup keyspace compaction on one shard compaction: create task manager's task for cleanup keyspace compaction api: add get_table_ids to get table ids from table infos compaction: create cleanup_compaction_task_impl	2023-03-27 11:52:51 +03:00
Kefu Chai	ed347c5051	bytes: mark fmt_hex::v `const` as fmt_hex is a helper class for formatting the underlying `bytes_view`, it does not mutate it, so mark the member variable const and mark the parameter in its constructor const. this change also helps us to use fmt_hex in the use case where the const semantics is expected. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-27 16:49:07 +08:00
Botond Dénes	ab61704c54	Merge 'mutation: replace operator<<(.., const range_tombstone&) with fmt formatter' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `range_tombstone` and `range_tombstone_change` without using ostream<<. also, this change removes all existing callers of `operator<<(ostream, const range_tombstone &)` and `operator<<(ostream, const range_tombstone_change &)`, and then removes these two `operator<<`s. Refs #13245 Closes #13260 * github.com:scylladb/scylladb: mutation: drop operator<<(ostream, const range_tombstone{_change,} &) mutation: use fmtlib to print range_tombstone{_change,} mutation: mutation_fragment_v2: specialize fmt::formatter<range_tombstone_change> mutation: range_tombstone: specialize fmt::formatter<range_tombstone>	2023-03-27 11:38:59 +03:00
Botond Dénes	bd42f5ee0b	Merge 'raft: includes used header and use <path/to/header> for include boost headers' from Kefu Chai at least, we need to access the declarations of exceptions, like `not_a_leader` and `dropped_entry`, so, instead of relying on other headers to do this job for us, we should include the header which includes the declarations. so, in this change "raft.h" is included explicitly. also, include boost headers using `<path/to/header>` instead of `"path/to/header"` for more consistency. Closes #13326 * github.com:scylladb/scylladb: raft: include boost header using <path/to/header> not "path/to/header" raft: include used header	2023-03-27 10:11:45 +03:00
Kefu Chai	96ba88f621	dist/debian: add libexec/scylla to source/include-binaries * scripts/create-relocatable-package.py: add a command to print out executables under libexec * dist/debian/debian_files_gen.py: call create-relocatable-package.py for a list of files under libexec and create source/include-binaries with the list. we repackage the precompiled binaries in the relocatable package into a debian source package using `./scylla/install.sh`, which edits the executable to use the specified dynamic library loader. but dpkg-source does not like this, as it wants to ensure that the files in the original tarball (*.orig.tar.gz) are identical to the files in the source package created by dpkg-source. so we have the following failure when running reloc/build_deb.sh ``` dpkg-source: error: cannot represent change to scylla/libexec/scylla: binary file contents changed dpkg-source: error: add scylla/libexec/scylla in debian/source/include-binaries if you want to store the modified binary in the debian tarball dpkg-source: error: unrepresentable changes to source dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 1 debuild: fatal error at line 1182: dpkg-buildpackage -rfakeroot -us -uc -ui failed ``` in this change, to address the build failure, as proposed by dpkg, the path to the patched/edited executable is added to `debian/source/include-binaries`. see the "Building" section in https://manpages.debian.org/bullseye/dpkg-dev/dpkg-source.1.en.html for more details. please search `adjust_bin()` in `scylladb/install.sh` for more details. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12722	2023-03-27 10:10:12 +03:00
Botond Dénes	4b5b6a9010	test/lib: rm test_table.hh No users left.	2023-03-27 02:00:44 -04:00
Botond Dénes	3a43574b39	test/boost/multishard_mutation_query_test: migrate other tests to random schema Create a local method called create_test_table that has the same signature as test::create_test_table, but uses random schema behind the scenes to generate the schema and the data, then migrate all the test cases to use it instead. To accommodate the randomness added by the random schema and random data, the unreliable querier cache population checks were replaced with more reliable lookup and miss checks, to prevent test flakiness. Querier cache population checks worked well with a fixed and simple schema and a fixed table population, but they don't work that well with random data. With this, there are no more uses of test_table.hh in this test and the include can be removed.	2023-03-27 02:00:44 -04:00
Botond Dénes	56a9968817	test/boost/multishard_mutation_query_test: use ks keyspace This keyspace exists by default and thus we don't have to create a new one for each test. Also use `get_name()` to pass the test case's name as table name, instead of hard-coding it. We already had some copy-pasta creep in: two tests used the same table name. This is not an error, as each test runs in its own env, but it is confusing to see another test case's name in the logs.	2023-03-27 02:00:44 -04:00
Botond Dénes	ad313d8eef	test/boost/multishard_mutation_query_test: improve test pager Propagate the page size to the result builder, so it can determine when a page is short and thus it is the last page, instead of asking for more pages until an empty one turns up. This will make tests more reliable when dealing with random datasets. Also change how the page counter is bumped: bump it after the current page is executed, at which point we know whether there will be a next page or not. This fixes an off-by-one seen in some cases.	2023-03-27 02:00:44 -04:00
Botond Dénes	3df70a9f3b	test/boost/multishard_mutation_query_test: refactor fuzzy_test Use the random_schema and its facilities to generate the schema and the dataset. This allows the test to provide much better coverage than the previous, fixed and simplistic schema did. Also reduce the test table population and the number of scans run on it so the test runs in a more reasonable time-frame. We run these tests all the time due to CI, so no need to try to do too much in a single run.	2023-03-27 02:00:43 -04:00
Botond Dénes	2cdda562f7	test/boost: add multishard_mutation_query_test more memory The tests in this file work with random schema and random data. Some seeds can generate large partitions and rows, so give the test some more headroom to work with.	2023-03-27 01:44:00 -04:00
Botond Dénes	00f06522c2	types/user: add get_name() accessor For the raw name (bytes).	2023-03-27 01:44:00 -04:00
Botond Dénes	99c9a71d93	test/lib/random_schema: add create_with_cql() Allowing the generated schema to be created as a CQL table, so that queries can be run against it.	2023-03-27 01:44:00 -04:00
Botond Dénes	10a44fee06	test/lib/random_schema: fix udt handling * generate lowercase names (upper-case seems to cause problems); * preserve dependency order between UDTs when dumping them from schema; * use built-in describe() to dump to CQL string; * drop single arg dump_udts() overload, which was not recursive, unlike the vector variant;	2023-03-27 01:44:00 -04:00
Botond Dénes	b2ddc60c10	test/lib/random_schema: type_generator(): also generate frozen types For regular and static columns, to introduce some further randomness. So far frozen types were generated only for primary key members and embedded types.	2023-03-27 01:44:00 -04:00
Botond Dénes	1cb4b1fc83	test/lib/random_schema: type_generator(): make static column generation conditional On the schema having clustering columns. Otherwise static column is illegal.	2023-03-27 01:44:00 -04:00
Botond Dénes	2a7cccd1a8	test/lib/random_schema: type_generator(): don't generate duration_type for keys And for any embedded type (collection, tuple members, etc.). It's not allowed, as I recently learned.	2023-03-27 01:44:00 -04:00
Botond Dénes	c9f54e539d	test/lib/random_schema: generate_random_mutations(): add overload with seed	2023-03-27 01:44:00 -04:00
Botond Dénes	394909869d	test/lib/random_schema: generate_random_mutations(): respect range tombstone count param Even though there is a parameter determining the number of range tombstones to be generated, the method disregards it and generates just 4. Fix that.	2023-03-27 01:43:59 -04:00
Botond Dénes	477b26f7af	test/lib/random_schema: generate_random_mutations(): add yields	2023-03-27 01:43:59 -04:00
Botond Dénes	fd8a50035a	test/lib/random_schema: generate_random_mutations(): fix indentation	2023-03-27 01:43:59 -04:00
Botond Dénes	71fdec7b42	test/lib/random_schema: generate_random_mutations(): coroutinize method	2023-03-27 01:43:59 -04:00
Botond Dénes	393aaddff0	test/lib/random_schema: generate_random_mutations(): expand comment Add note about mutation order and deduplication.	2023-03-27 01:43:59 -04:00
Avi Kivity	cd0b167d6c	Merge 'bloom_filter: cleanups' from Kefu Chai this series applies some random cleanups to bloom_filter. these cleanups were the side products when the author was working on #13314 . Closes #13315 * github.com:scylladb/scylladb: bloom_filter: mark internal help function static bloom_filter: add more constness to false positive rate tables bloom_filter: use vector::back() when appropriate	2023-03-26 19:43:37 +03:00
Kefu Chai	33f4012eeb	test: cql-pytest: test_describe: clamp bloom filter's fp rate before this change, we used `round(random.random(), 5)` for the value of the `bloom_filter_fp_chance` config option. there is a chance that this expression returns a number lower than or equal to 6.71e-05. but we do have a minimum for this option, which is defined by `utils::bloom_calculations::probs`, and the minimal false positive rate is 6.71e-05. we are observing test failures where we are using 0 for the option, and scylla rightly rejected it with the error message of ``` bloom_filter_fp_chance must be larger than 6.71e-05 and less than or equal to 1.0 (got 0) ```. so, in this change, to address the test failure, we always use a number greater than or equal to a number slightly greater than the minimum, to ensure that the randomly picked number is in the range of supported false positive rates. Fixes #13313 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13314	2023-03-26 19:41:22 +03:00
Botond Dénes	d5488dba69	reader_permit: set_trace_state(): emit trace message linking to previous page This method is called on the start of each page, updating the trace state stored on the permit to that of the current page. When doing so, emit a trace message, containing the session id of the previous page, so the per-page sessions can be stitched together later. Note that this message is only emitted if the cached read survived between the pages. Example: Tracing session: dcfc1570-ca3c-11ed-88d0-24443f03a8bb activity \| timestamp \| source \| source_elapsed \| client ---------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2023-03-24 08:10:27.271000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2023-03-24 08:10:27.271864 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2023-03-24 08:10:27.271958 \| 127.0.0.1 \| 94 \| 127.0.0.1 Creating read executor for token 3274692326281147944 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] \| 2023-03-24 08:10:27.271995 \| 127.0.0.1 \| 132 \| 127.0.0.1 read_data: querying locally [shard 0] \| 2023-03-24 08:10:27.271998 \| 127.0.0.1 \| 135 \| 127.0.0.1 Start querying singular range {{3274692326281147944, pk{00026b73}}} [shard 0] \| 2023-03-24 08:10:27.272003 \| 127.0.0.1 \| 140 \| 127.0.0.1 [reader concurrency semaphore] admitted immediately [shard 0] \| 2023-03-24 08:10:27.272006 \| 127.0.0.1 \| 143 \| 127.0.0.1 [reader concurrency semaphore] executing read [shard 0] \| 2023-03-24 08:10:27.272014 \| 127.0.0.1 \| 150 \| 127.0.0.1 Querying cache for range {{3274692326281147944, pk{00026b73}}} and slice {(-inf, +inf)} [shard 0] \| 2023-03-24 08:10:27.272022 \| 127.0.0.1 \| 159 \| 127.0.0.1 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 3 clustering row(s) (3 live, 0 dead) and 0 range 
tombstone(s) [shard 0] \| 2023-03-24 08:10:27.272076 \| 127.0.0.1 \| 212 \| 127.0.0.1 Caching querier with key ab928e0d-b815-46b7-9a02-1fa2d9549477 [shard 0] \| 2023-03-24 08:10:27.272084 \| 127.0.0.1 \| 221 \| 127.0.0.1 Querying is done [shard 0] \| 2023-03-24 08:10:27.272087 \| 127.0.0.1 \| 224 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2023-03-24 08:10:27.272106 \| 127.0.0.1 \| 242 \| 127.0.0.1 Request complete \| 2023-03-24 08:10:27.271259 \| 127.0.0.1 \| 259 \| 127.0.0.1 Tracing session: dd3092f0-ca3c-11ed-88d0-24443f03a8bb activity \| timestamp \| source \| source_elapsed \| client ---------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2023-03-24 08:10:27.615000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2023-03-24 08:10:27.615223 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2023-03-24 08:10:27.615310 \| 127.0.0.1 \| 87 \| 127.0.0.1 Creating read executor for token 3274692326281147944 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] \| 2023-03-24 08:10:27.615346 \| 127.0.0.1 \| 124 \| 127.0.0.1 read_data: querying locally [shard 0] \| 2023-03-24 08:10:27.615349 \| 127.0.0.1 \| 126 \| 127.0.0.1 Start querying singular range {{3274692326281147944, pk{00026b73}}} [shard 0] \| 2023-03-24 08:10:27.615352 \| 127.0.0.1 \| 130 \| 127.0.0.1 Found cached querier for key ab928e0d-b815-46b7-9a02-1fa2d9549477 and range(s) {{{3274692326281147944, pk{00026b73}}}} [shard 0] \| 2023-03-24 08:10:27.615358 \| 127.0.0.1 \| 135 \| 127.0.0.1 Reusing querier [shard 0] \| 2023-03-24 08:10:27.615362 \| 127.0.0.1 \| 139 \| 127.0.0.1 Continuing paged query, previous page's trace session is dcfc1570-ca3c-11ed-88d0-24443f03a8bb [shard 0] \| 2023-03-24 08:10:27.615364 \| 127.0.0.1 \| 141 \| 127.0.0.1 [reader concurrency semaphore] 
executing read [shard 0] \| 2023-03-24 08:10:27.615371 \| 127.0.0.1 \| 148 \| 127.0.0.1 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 1 clustering row(s) (1 live, 0 dead) and 0 range tombstone(s) [shard 0] \| 2023-03-24 08:10:27.615385 \| 127.0.0.1 \| 163 \| 127.0.0.1 Querying is done [shard 0] \| 2023-03-24 08:10:27.615583 \| 127.0.0.1 \| 360 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2023-03-24 08:10:27.615730 \| 127.0.0.1 \| 507 \| 127.0.0.1 Request complete \| 2023-03-24 08:10:27.615518 \| 127.0.0.1 \| 518 \| 127.0.0.1 See the message: Continuing paged query, previous page's trace session is dcfc1570-ca3c-11ed-88d0-24443f03a8bb [shard 0] \| 2023-03-24 08:10:27.615364 \| 127.0.0.1 \| 141 \| 127.0.0.1 This is a follow-up to #13255 Refs: #12781 Closes #13318	2023-03-26 18:41:21 +03:00
Avi Kivity	f937fad25a	Merge 'readers/multishard: shard_reader: fast-forward created reader to current range' from Botond Dénes When creating the reader, the lifecycle policy might return one that was saved on the last page and survived in the cache. This reader might have skipped some fast-forwarding ranges while sitting in the cache. To avoid using a reader reading a stale range (from the read's POV), check its read range and fast forward it if necessary. Fixes: https://github.com/scylladb/scylladb/issues/12916 Closes #12932 * github.com:scylladb/scylladb: readers/multishard: shard_reader: fast-forward created reader to current range readers/multishard: reader_lifecycle_policy: add get_read_range() test/boost/multishard_mutation_query_test: paging: handle range becoming wrapping	2023-03-26 18:39:50 +03:00
Wojciech Mitros	f0aa540e00	cql: renice the wasm compilation alien thread The Wasm compilation is a slow, low priority task, so it should not compete with reactor threads or the networking core. To achieve that, we increase the niceness of the thread by 10. An alternative solution would be to set the priority using pthread_setschedparam, but it's not currently feasible, because as long as we're using the SCHED_OTHER policy for our threads, we cannot select any other priority than 0. Closes #13307	2023-03-26 18:38:23 +03:00
Anna Stuchlik	1cfea1f13c	doc: remove incorrect info about BYPASS CACHE Fixes https://github.com/scylladb/scylladb/issues/13106 This commit removes the information that BYPASS CACHE is an Enterprise-only feature and replaces that info with the link to the BYPASS CACHE description. Closes #13316	2023-03-26 18:13:17 +03:00
Kefu Chai	e796525f23	types: remove unused header <iterator> was introduced back in `1cf02cb9d8`, but lexicographical_compare.hh was extracted out in `bdfc0aa748`, since we don't have any users of <iterator> in types.hh anymore, let's remove it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13327	2023-03-26 16:55:16 +03:00
Avi Kivity	eeff8cd075	Merge 'dist/redhat: enforce dependency on %{release} also' from Kefu Chai s/%{version}/%{version}-%{release}/ in `Requires:` sections. this enforces the runtime dependencies of exactly the same releases between scylla packages. Fixes #13222 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13229 * github.com:scylladb/scylladb: dist/redhat: split Requires section into multiple lines dist/redhat: enforce dependency on %{release} also	2023-03-26 16:50:10 +03:00
Avi Kivity	bfd70c192e	cql3: functions: reimplement min/max statelessly min() and max() had two implementations: one static (for each type in a select list) and one dynamic (for compound types). Since the dynamic implementation is sufficient, we only reimplement that. This means we don't use the automarshalling helpers, since we don't do any arithmetic on values apart from comparison, which is conveniently provided by abstract_type.	2023-03-26 15:18:22 +03:00
Avi Kivity	e6342d476b	cql3: functions: reimplement count(*) statelessly Note we have to explicitly decay lambdas to functions using unary operator +.	2023-03-26 15:18:22 +03:00
Avi Kivity	9291ec5ed1	cql3: functions: simplify creating native functions even more Add a helper function to consolidate the internal native function class and the automatic marshalling introduced in previous patches. Since a lambda must be decayed into a function pointer (in order to infer its signature), there are two overloads: one accepts a lambda and decays it into a function pointer, the second accepts a function pointer, infers its arguments, and constructs the function object.	2023-03-26 15:15:36 +03:00
Kefu Chai	3425184b2a	raft: include boost header using <path/to/header> not "path/to/header" for more consistency with the rest of the source tree. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-26 14:07:50 +08:00
Kefu Chai	0421d6d12f	raft: include used header at least, we need to access the declarations of exceptions, like `not_a_leader` and `dropped_entry`, so, instead of relying on other headers to do this job for us, we should include the header which includes the declarations. so, in this change "raft.h" is included explicitly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-26 14:07:50 +08:00
Kefu Chai	023e985a6c	build: cmake: add missing source files to idl and service they were added recently, but cmake failed to sync with configure.py. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-26 14:01:21 +08:00
Kefu Chai	e0ca80d21f	build: cmake: port more cxxflags from configure.py Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-26 14:01:21 +08:00
Kefu Chai	a5547ea11b	build: cmake: add two missing tests they are leftovers in `f113dac5bf` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-26 14:01:21 +08:00
Tzach Livyatan	46e6c639d9	docs: minor improvements to the Raft Handling Failures and recovery procedure sections Closes #13292	2023-03-24 18:17:36 +01:00
Botond Dénes	b6682ad607	docs/operating-scylla/admin-tools: scylla-sstable.rst: update schema section With the recent changes to the ways schema can be provided to the tool.	2023-03-24 11:41:40 -04:00
Botond Dénes	bc9341b84a	test/cql-pytest: test_tools.py: add test for schema loading A comprehensive test covering all the supported ways of providing the schema to scylla-sstable, either explicitly or implicitly (auto-detect).	2023-03-24 11:41:40 -04:00
Botond Dénes	afdfe34ca7	test/cql-pytest: nodetool.py: add flush_keyspace() It would have been better if `flush()` could have been called with a keyspace and optional table param, but changing it now is too much churn, so we add a dedicated method to flush a keyspace instead.	2023-03-24 11:41:40 -04:00
Botond Dénes	1f0ab699c3	tools/scylla-sstable: reform schema loading mechanism So far, schema had to be provided via a schema.cql file, a file which contains the CQL definition of the table. This is flexible but annoying at the same time. Many times the sstables the tool operates on are located in their table directory in a scylla data directory, where the schema tables are also available. To mitigate this, an alternative method to load the schema from memory was added which works for system tables. In this commit we extend this to work for all kinds of tables: by auto-detecting where the scylla data directory is, and loading the schema tables from disk.	2023-03-24 11:41:40 -04:00
Botond Dénes	c5b2fc2502	tools/schema_loader: add load_schema_from_schema_tables() Allows loading the schema for the designated keyspace and table, from the system table sstables located on disk. The sstable files are opened read-only.	2023-03-24 11:41:40 -04:00
Botond Dénes	19560419d2	Merge 'treewide: improve compatibility with gcc 13' from Avi Kivity An assortment of patches that reduce our incompatibilities with the upcoming gcc 13. Closes #13243 * github.com:scylladb/scylladb: transport: correctly format unknown opcode treewide: catch by reference test: raft: avoid confusing string compare utils, types, test: extract lexicographical compare utilities test: raft: fsm_test: disambiguate raft::configuration construction test: reader_concurrency_semaphore_test: handle all enum values repair: fix signed/unsigned compare repair: fix incorrect signed/unsigned compare treewide: avoid unused variables in if statements keys: disambiguate construction from initializer_list<bytes> cql3: expr: fix serialize_listlike() reference-to-temporary with gcc compaction: error on invalid scrub type treewide: prevent redefining names api: task_manager: fix signed/unsigned compare alternator: streams: fix signed/unsigned comparison test: fix some mismatched signed/unsigned comparisons	2023-03-24 15:16:05 +02:00
Botond Dénes	132d101dc7	db/schema_tables: expose types schema	2023-03-24 08:50:39 -04:00
Botond Dénes	14bff955e2	readers/multishard: shard_reader: fast-forward created reader to current range When creating the reader, the lifecycle policy might return one that was saved on the last page and survived in the cache. This reader might have skipped some fast-forwarding ranges while sitting in the cache. To avoid using a reader reading a stale range (from the read's POV), check its read range and fast forward it if necessary.	2023-03-24 08:43:03 -04:00
Botond Dénes	0aa03f85a3	readers/multishard: reader_lifecycle_policy: add get_read_range() Allows retrieving the current read-range for the reader on the given shard (where the method is called).	2023-03-24 08:40:11 -04:00
Botond Dénes	1c7a66cd2a	test/boost/multishard_mutation_query_test: paging: handle range becoming wrapping After each page, the read range is adjusted so it continues from/after the last read partition. Sometimes this can result in the range becoming wrapped like this: (pk, pk]. In this case, we can just drop this range and continue with the rest of the ranges (if there are multiple ones).	2023-03-24 08:40:11 -04:00
Tomasz Grabiec	c54a3d9c10	Merge 'Clean enabled features manipulations in system keyspace' from Pavel Emelyanov There was an attempt to cut feature-service -> system-keyspace dependency (#13172) which turned out to require more changes. Here's a preparation squeezing from this future work. This set - leaves only batch-enabling API in feature service - keeps the need for async context in feature service - narrows down system keyspace features API to only load and store records - relaxes features updating logic in sys.ks. - cosmetic Closes #13264 * github.com:scylladb/scylladb: feature_service: Indentation fix after previous patch feature_service: Move async context into enable() system_keyspace: Refactor local features load/save helpers feature_service: Mark supported_feature_set() const feature_service: Remove single feature enabling method boot: Enable features in batch gossiper: Enable features in batch	2023-03-24 13:12:49 +01:00
Petr Gusev	c1634ea5fa	test_raft_upgrade: add a test for schema commit log feature The test tries to start a node with consistent_cluster_management but without force_schema_commit_log. This is expected to fail, since the schema commitlog feature should be enabled by all the cluster nodes.	2023-03-24 16:08:17 +04:00
Petr Gusev	e407956e9f	scylla_cluster.py: add start flag to server_add Sometimes when creating a node it's useful to just install it and not start it. For example, we may want to try to start it later with an expected error. The ScyllaServer.install method has been made exception safe: if an exception occurs, it reverts to the original state. This avoids duplicating the try/except logic in two of its call sites.	2023-03-24 16:08:17 +04:00
Petr Gusev	794d0e4000	ServerInfo: drop host_id We are going to allow the ScyllaCluster.add_server function not to start the server if the caller has requested that with a special parameter. The host_id can only be obtained from a running node, so add_server won't be able to return it in this case. I've grepped the tests for host_id and there doesn't seem to be any reference to it in the code.	2023-03-24 16:08:17 +04:00
Petr Gusev	8e3392c64f	scylla_cluster.py: add config to server_add Sometimes when creating a node it's useful to pass a custom node config.	2023-03-24 16:08:17 +04:00
Petr Gusev	c1d0ee2bce	scylla_cluster.py: add expected_error to server_start Sometimes it's useful to check that the node has failed to start for a particular reason. If server_start can't find expected_error in the node's log or if the node has started without errors, it throws an exception.	2023-03-24 16:08:11 +04:00
Petr Gusev	a4411e9ec4	scylla_cluster.py: ScyllaServer.start, refactor error reporting Extract the function that encapsulates all the error reporting logic. We are going to use it in several other places to implement expected_error feature.	2023-03-24 15:54:52 +04:00
Petr Gusev	21b505e67c	scylla_cluster.py: fix ScyllaServer.start, reset cmd if start failed The ScyllaServer expects cmd to be None if the Scylla process is not running. Otherwise, if start failed and the test called update_config, the latter will try to send a signal to a non-existent process via cmd.	2023-03-24 15:54:52 +04:00
Petr Gusev	75a4ff2da9	raft: check if schema commitlog is initialized Refuse to boot if neither the schema commitlog feature nor force_schema_commit_log is set. For the upgrade procedure the user should wait until the schema commitlog feature is enabled before enabling consistent_cluster_management.	2023-03-24 15:54:52 +04:00
Petr Gusev	d8997a4993	raft: move raft initialization after init_system_keyspace Raft tables are loaded on the second call to init_system_keyspace, so it seems more logical to move initialization after it. This is not necessary right now since raft tables are not used in this initialization logic, but it may change in the future and cause troubles.	2023-03-24 15:54:52 +04:00
Petr Gusev	769732d095	database: rename before_schema_keyspace_init->maybe_init_schema_commitlog We are going to move the raft tables from the first load phase to the second. This means the second init_system_keyspace call will load raft tables along with the schema, making the name of this function imprecise.	2023-03-24 15:54:52 +04:00
Petr Gusev	273e70e1f9	raft: use schema commitlog for raft tables Fixes: #12642	2023-03-24 15:54:52 +04:00
Petr Gusev	5a5d664a5a	init_system_keyspace: refactoring towards explicit load phases We aim (#12642) to use the schema commit log for raft tables. Now they are loaded at the first call to init_system_keyspace in main.cc, but the schema commitlog is only initialized shortly before the second call. This is important, since the schema commitlog initialization (database::before_schema_keyspace_init) needs to access schema commitlog feature, which is loaded from system.scylla_local and therefore is only available after the first init_system_keyspace call. So the idea is to defer the loading of the raft tables until the second call to init_system_keyspace, just as it works for schema tables. For this we need a tool to mark which tables should be loaded in the first or second phase. To do this, in this patch we introduce system_table_load_phase enum. It's set in the schema_static_props for schema tables. It replaces the system_keyspace::table_selector in the signature of init_system_keyspace. The call site for populate_keyspace in init_system_keyspace was changed, table_selector.contains_keyspace was replaced with db.local().has_keyspace. This check prevents calling populate_keyspace(system_schema) on phase1, but allows for populate_keyspace(system) on phase2 (to init raft tables). On this second call some tables from system keyspace (e.g. system.local) may have already been populated on phase1. This check protects from double-populating them, since every populated cf is marked as ready_for_writes.	2023-03-24 15:54:46 +04:00
Anna Stuchlik	9e27f6b4b7	doc: update the Ubuntu version used in the image Starting from 5.2 and 2023.1 our images are based on Ubuntu:22.04. See https://github.com/scylladb/scylladb/issues/13138#issuecomment-1467737084 This commit adds that information to the docs. It should be merged and backported to branch-5.2. Closes #13301	2023-03-24 13:50:51 +02:00
Kamil Braun	0b19a614fa	storage_service: wait for normal state handlers earlier in the boot procedure The `wait_for_normal_state_handled_on_boot` function waits until `handle_state_normal` finishes for the given set of nodes. It was used in `run_bootstrap_ops` and `run_replace_ops` to wait until NORMAL states of existing nodes in the cluster are processed by the joining node before continuing the joining process. One reason to do it is because at the end of `handle_state_normal` the joining node might drop connections to the NORMAL nodes in order to reestablish new connections using correct encryption settings. In tests we observed that the connection drop was happening in the middle of repair/streaming, causing repair/streaming to abort. Unfortunately, calling `wait_for_normal_state_handled_on_boot` in `run_bootstrap_ops`/`run_replace_ops` is too late to fix all problems. Before either of these two functions, we create a new CDC generation and write the data to `system_distributed_everywhere.cdc_generation_descriptions_v2`. In tests, the connections were sometimes dropped while this write was in-flight. This would cause the write to never arrive at other nodes, and the joining node would time out waiting for confirmations. To fix this, call `wait_for_normal_state_handled_on_boot` earlier in the boot procedure, before the `make_new_generation` call which does the write. Fixes: #13302	2023-03-24 12:45:07 +01:00
Kamil Braun	451389970b	storage_service: bootstrap: wait for normal tokens to arrive in all cases `storage_service::bootstrap` waits until it receives normal tokens of other nodes before proceeding or it times out with an error. But it only did that for bootstrap operation, not for replace operation. Do it for replace as well.	2023-03-24 12:44:37 +01:00
Kamil Braun	c003b7017d	storage_service: extract get_nodes_to_sync_with helper	2023-03-24 12:44:37 +01:00
Kamil Braun	599393dcba	storage_service: return unordered_set from get_ignore_dead_nodes_for_replace	2023-03-24 12:44:37 +01:00
Anna Stuchlik	73b74e8cac	doc: remove Enterprise upgrade guides from OSS doc This commit removes the Enterprise upgrade guides from the Open Source documentation. The Enterprise upgrade guides should only be available in the Enterprise documentation, with the source files stored in scylla-enterprise.git. In addition, this commit: - adds the links to the Enterprise user guides in the Enterprise documentation at https://enterprise.docs.scylladb.com/ - adds the redirections for the removed pages to avoid breaking any links. This commit must be reverted in scylla-enterprise.git. Closes #13298	2023-03-24 10:57:03 +02:00
Kefu Chai	a7b4f84b6a	bloom_filter: mark internal help function static as `initialize_opt_k()` is not used outside of the translation unit, let's mark it `static`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-24 15:41:45 +08:00
Kefu Chai	1a82a7ac72	bloom_filter: add more constness to false positive rate tables we never mutate them, so mark them const for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-24 15:41:45 +08:00
Kefu Chai	7f4a3fdac8	bloom_filter: use vector::back() when appropriate no need to use `size - 1` for accessing the last element in a vector, let's just use `vector::back()` for more compact code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-24 15:41:45 +08:00
Jan Ciolek	a1c86786ca	db/view/view.cc: rate limit view update error messages When propagating a view update to a paired view replica fails, there is an error message. This message is printed for every mutation, which causes log spam when some node goes down. This isn't a fatal error - it's normal that a remote view replica goes down, it'll hopefully receive the updates later through hints. I'm unsure if the error message should be printed at all, but for now we can just rate limit it and that will improve the situation with log spamming. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #13175	2023-03-24 08:59:39 +02:00
Pavel Emelyanov	b0a5769d92	validation: Avoid throwing schema lookup The validate_column_family() tries to find a schema and throws if it doesn't exist. The latter is determined by the exception thrown by the database::find_schema(), but there's a throw-less way of doing it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13295	2023-03-24 08:43:48 +02:00
Kamil Braun	e8fb718e4a	Merge 'topology changes over raft' from Gleb Natapov The patch series introduces linearisable topology changes using raft protocol. The state machine driven by raft is described in "service: Introduce topology state machine". Some explanations about the implementation can be found in "storage_service: raft topology: implement topology management through raft". The code is not ready for production. There is not much in terms of error handling, and integration with the rest of the system has not even started. For full integration request fencing will need to be implemented and token_metadata has to be extended to support not just "pending" nodes but concepts of "read replica set" and "write replica set". The code may be far from being usable, but it is hidden behind the "experimental raft" flag and having it in tree will relieve me from constant rebase burden. * 'raft-topology-v6' of github.com:scylladb/scylla-dev: storage_service: fix indentation from previous patch storage_service: raft topology: implement topology management through raft service: raft: make group0_guard move assignable service: raft: wire up apply() and snapshot transfer for topology in group0 state machine storage_service: raft topology: introduce a function that applies topology cmd to local state machine storage_service: raft topology: introduce a raft monitor and topology coordinator fibers storage_service: raft topology: introduce snapshot transfer code for the topology table raft topology: add RAFT_TOPOLOGY_CMD verb that will be used by topology coordinator to communicated with nodes bootstrapper: Add get_random_bootstrap_tokens function service: raft: add support for topology_change command into raft_group0_client service: raft: introduce topology_change group0 command system_keyspace: add a table to persist topology change state machine's state service: Introduce topology state machine data structures storage_proxy: not consult topology on local table write	2023-03-23 15:59:45 +01:00
Gleb Natapov	5a908c3f46	storage_service: fix indentation from previous patch	2023-03-23 16:29:56 +02:00
Gleb Natapov	f3bd7e9b8c	storage_service: raft topology: implement topology management through raft The code here implements the state machine described in "service: Introduce topology state machine". A topology operation is requested by writing into the topology_request field through raft. After that, the topology_change_transition() function running on a leader is responsible for driving the operation to completion. There is not much in terms of error handling here yet. If something fails the code will just continue trying. topology_change_state_load(), which is (eventually) called on all nodes each time the state machine's state changes, is a glue between the raft view of the topology and the rest of the "legacy" system. The code there creates a token_metadata object from the raft view and fills in the peers table which is needed for drivers. The gossiper is almost completely cut off from the topology management, but the code still updates the node's state there to 'normal' and 'left' for some legacy functionality to continue working. Note that handlers for those states are disabled in raft mode. raft_topology_cmd_handler() is called by the topology coordinator and this is where the streaming happens. The kind of streaming depends on the state the node is in. The function is "re-entrable". It can be called more than once and will either start a new operation if it is the first invocation or the previous one failed, or it will wait for the previous operation to complete. The new code is hidden behind "experimental raft" and should not change how the system works if disabled. Some indentation here is intentionally left wrong and will be fixed by the next patch.	2023-03-23 16:29:56 +02:00
Gleb Natapov	8865d5cf13	service: raft: make group0_guard move assignable	2023-03-23 16:29:56 +02:00
Gleb Natapov	344b483425	service: raft: wire up apply() and snapshot transfer for topology in group0 state machine	2023-03-23 16:29:56 +02:00
Gleb Natapov	aca21d3318	storage_service: raft topology: introduce a function that applies topology cmd to local state machine The function applies the command to persistent storage and calls the stub function topology_change_state_load() that will load the new state into memory in later patches.	2023-03-23 16:29:56 +02:00
Gleb Natapov	284afd9255	storage_service: raft topology: introduce a raft monitor and topology coordinator fibers The raft monitor fiber monitors the local server's raft state and starts the topology coordinator fiber when it becomes a leader. It stops the coordinator fiber when the server is no longer a leader. The coordinator fiber waits for topology state changes, but there will be none yet.	2023-03-23 16:29:56 +02:00
Gleb Natapov	d69a887366	storage_service: raft topology: introduce snapshot transfer code for the topology table	2023-03-23 16:29:56 +02:00
Gleb Natapov	6a4d773b7e	raft topology: add RAFT_TOPOLOGY_CMD verb that will be used by topology coordinator to communicated with nodes Empty for now. Will be used later by the topology coordinator to communicate with other nodes to instruct them to start streaming, or start to fence read/writes.	2023-03-23 16:29:56 +02:00
Nadav Har'El	4fdcee8415	test/alternator: increase CQL connection timeout This patch increases the connection timeout in the get_cql_cluster() function in test/cql-pytest/run.py. This function is used to test that Scylla came up, and also test/alternator/run uses it to set up the authentication - which can only be done through CQL. The Python driver has 2-second and 5-second default timeouts that should have been more than enough for everybody (TM), but in #13239 we saw that in one case it apparently wasn't enough. So to be extra safe, let's increase the default connection-related timeouts to 60 seconds. Note this change only affects the Scylla boot in the test/*/run scripts, and it does not affect the actual tests - those have different code to connect to Scylla (see cql_session() in test/cql-pytest/util.py), and we already increased the timeouts there in #11289. Fixes #13239 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13291	2023-03-23 16:03:20 +02:00
Avi Kivity	afe6b0d8c9	Merge 'reader_concurrency_semaphore: add trace points for important events' from Botond Dénes Currently we have no visibility into what happens to a read in the reader concurrency semaphore as far as tracing is concerned. This series fixes that, storing a trace state pointer on the reader permit and using it to add trace messages to important semaphore related events: * admission decision * execution (execution stage functionality) * eviction This allows for seeing if the read suffered any delay in the semaphore. Example tracing (2 pages): ``` Tracing session: 8cc80d50-c72d-11ed-8427-14e21cc3ed56 activity \| timestamp \| source \| source_elapsed \| client -------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2023-03-20 10:43:16.773000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2023-03-20 10:43:16.773754 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2023-03-20 10:43:16.773837 \| 127.0.0.1 \| 83 \| 127.0.0.1 Creating read executor for token -4911109968640856406 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] \| 2023-03-20 10:43:16.773874 \| 127.0.0.1 \| 121 \| 127.0.0.1 read_data: querying locally [shard 0] \| 2023-03-20 10:43:16.773877 \| 127.0.0.1 \| 123 \| 127.0.0.1 Start querying singular range {{-4911109968640856406, pk{000d73797374656d5f736368656d61}}} [shard 0] \| 2023-03-20 10:43:16.773881 \| 127.0.0.1 \| 128 \| 127.0.0.1 [reader concurrency semaphore] admitted immediately [shard 0] \| 2023-03-20 10:43:16.773884 \| 127.0.0.1 \| 130 \| 127.0.0.1 [reader concurrency semaphore] executing read [shard 0] \| 2023-03-20 10:43:16.773890 \| 127.0.0.1 \| 137 \| 127.0.0.1 Querying cache for range {{-4911109968640856406, pk{000d73797374656d5f736368656d61}}} and slice {(-inf, +inf)} [shard 0] \| 2023-03-20 
10:43:16.773903 \| 127.0.0.1 \| 149 \| 127.0.0.1 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 100 clustering row(s) (100 live, 0 dead) and 0 range tombstone(s) [shard 0] \| 2023-03-20 10:43:16.774674 \| 127.0.0.1 \| 920 \| 127.0.0.1 Caching querier with key 5eff94d2-e47a-43b2-8e3a-2d80a9cc3b3e [shard 0] \| 2023-03-20 10:43:16.774685 \| 127.0.0.1 \| 931 \| 127.0.0.1 Querying is done [shard 0] \| 2023-03-20 10:43:16.774688 \| 127.0.0.1 \| 934 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2023-03-20 10:43:16.774706 \| 127.0.0.1 \| 953 \| 127.0.0.1 Request complete \| 2023-03-20 10:43:16.774225 \| 127.0.0.1 \| 1225 \| 127.0.0.1 Tracing session: 8d26f630-c72d-11ed-8427-14e21cc3ed56 activity \| timestamp \| source \| source_elapsed \| client ---------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+----------- Execute CQL3 query \| 2023-03-20 10:43:17.395000 \| 127.0.0.1 \| 0 \| 127.0.0.1 Parsing a statement [shard 0] \| 2023-03-20 10:43:17.395498 \| 127.0.0.1 \| -- \| 127.0.0.1 Processing a statement [shard 0] \| 2023-03-20 10:43:17.395558 \| 127.0.0.1 \| 60 \| 127.0.0.1 Creating read executor for token -4911109968640856406 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] \| 2023-03-20 10:43:17.395597 \| 127.0.0.1 \| 99 \| 127.0.0.1 read_data: querying locally [shard 0] \| 2023-03-20 10:43:17.395600 \| 127.0.0.1 \| 102 \| 127.0.0.1 Start querying singular range {{-4911109968640856406, pk{000d73797374656d5f736368656d61}}} [shard 0] \| 2023-03-20 10:43:17.395604 \| 127.0.0.1 \| 106 \| 127.0.0.1 Found cached querier for key 5eff94d2-e47a-43b2-8e3a-2d80a9cc3b3e and range(s) {{{-4911109968640856406, pk{000d73797374656d5f736368656d61}}}} [shard 0] \| 2023-03-20 10:43:17.395610 \| 127.0.0.1 \| 112 \| 127.0.0.1 Reusing querier [shard 0] \| 2023-03-20 10:43:17.395614 
\| 127.0.0.1 \| 116 \| 127.0.0.1 [reader concurrency semaphore] executing read [shard 0] \| 2023-03-20 10:43:17.395622 \| 127.0.0.1 \| 125 \| 127.0.0.1 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 11 clustering row(s) (11 live, 0 dead) and 0 range tombstone(s) [shard 0] \| 2023-03-20 10:43:17.395711 \| 127.0.0.1 \| 213 \| 127.0.0.1 Querying is done [shard 0] \| 2023-03-20 10:43:17.395718 \| 127.0.0.1 \| 221 \| 127.0.0.1 Done processing - preparing a result [shard 0] \| 2023-03-20 10:43:17.395734 \| 127.0.0.1 \| 236 \| 127.0.0.1 Request complete \| 2023-03-20 10:43:17.395276 \| 127.0.0.1 \| 276 \| 127.0.0.1 ``` Fixes: https://github.com/scylladb/scylladb/issues/12781 Closes #13255 * github.com:scylladb/scylladb: reader_concurrency_semaphore: add trace points for important events reader_permit: refresh trace_state on new pages reader_permit: keep trace_state pointer on permit test/perf/perf_collection: give more unique names to key comparators	2023-03-23 15:37:33 +02:00
Botond Dénes	7699904c54	Revert "repair: Reduce repair reader eviction with diff shard count" This reverts commit `c6087cf3a0`. Said commit can cause a deadlock when 2 or more repairs compete for locks on 2 or more nodes. Consider the following scenario: Node n1 and n2 in the cluster, 1 shard per node, rf = 2, each shard has 1 available unit for the reader lock n1 starts repair r1 r1-n1 (instance of r1 on node1) takes the reader lock on node1 n2 starts repair r2 r2-n2 (instance of r2 on node2) takes the reader lock on node2 r1-n2 will fail to take the reader lock on node2 r2-n1 will fail to take the reader lock on node1 As a result, r1 and r2 cannot make progress and a deadlock happens. The complexity comes from the fact that a repair job needs a lock on more than one node. It is not guaranteed that all the participant nodes can take the lock in one shot. There is no simple solution to this so we have to revert this locking mechanism and look for another way to prevent reader thrashing when repairing nodes with mismatching shard count. Fixes: #12693 Closes #13266	2023-03-23 15:35:32 +02:00
Nadav Har'El	b5e61e1b83	test/cql-pytest, lwt: test for detection of contradicting batches Cassandra detects when a batch has both an IF EXISTS and IF NOT EXISTS on the same row, and complains this is not a useful request (after all, it can never succeed, because the batch can only succeed if both conditions are true, and that can't be if one checks IF EXISTS and the other IF NOT EXISTS). This patch adds a test, test_lwt_with_batch_conflict_1, which checks that this case results in an error. It passes on Cassandra, but xfails on Scylla which doesn't report an error in this case. A second test, test_lwt_with_batch_conflict_2, shows that the detection of the EXISTS / NOT EXISTS conflict is special, and other conflicts such as having both "r=1" and "r=2" for the same row, are NOT detected by Cassandra. Refs #13011. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13270	2023-03-23 13:35:21 +02:00
Pavel Emelyanov	b13ff5248c	sstables: Mark continuous_data_consumer::reader_position() const Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13285	2023-03-23 13:27:33 +02:00
Pavel Emelyanov	bee5593ba1	storage_service: Move node_ops_meta_data to .cc file It's declared in header, but is not used outside of .cc. Forward declaration in header would be enough. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13289	2023-03-23 13:22:39 +02:00
Tzach Livyatan	ea66c16818	Fix Enable Authorization doc page references a wrong CL used by a 'cassandra' user Fix https://github.com/scylladb/scylladb/issues/11633 Closes #11637	2023-03-23 13:20:36 +02:00
Kefu Chai	0421a82821	sstables: add type constraits right in parameter list for better readability. also, add `#include <concepts>`, as we should include what we use instead of relying on other headers to do this on our behalf. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13277	2023-03-23 13:57:22 +03:00
Anna Stuchlik	b54868c639	doc: disable the outdated banner This commit disables the banner that advertises the ScyllaDB University Live event, which already took place. Closes #13284	2023-03-23 08:57:45 +02:00
Kefu Chai	1197664f09	test: network_topology_strategy_test: silence warning clang warns when the implicit conversion changes the precision of the converted number. in this case, before being multiplied, `std::numeric_limits<unsigned long>::max() >> 1` is implicitly promoted to double so it can obtain the common type of double and unsigned long. and the compiler warns: ``` /home/kefu/dev/scylladb/test/boost/network_topology_strategy_test.cc:129:84: error: implicit conversion from 'unsigned long' to 'double' changes value from 9223372036854775807 to 9223372036854775808 [-Werror,-Wimplicit-const-int-float-conversion] return static_cast<unsigned long>(d(std::numeric_limits<unsigned long>::max() >> 1)) << 1; ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~ ``` but 1. we don't really care about the precision here, we just want to map a double to a token represented by an int64_t 2. the maximum possible number being converted is less than 9223372036854775807, which is the maximum number of int64_t, which is in general an alias of `long long`, not to mention that LONG_MAX is always 2147483647, after shifting right, the result would be 1073741823 so this is a false alarm. in order to silence it, we explicitly cast the RHS of `` operator to double. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13221	2023-03-23 08:55:29 +02:00
Botond Dénes	aee5dfaa84	Merge 'docs: Add card logos' from David Garcia Related issue https://github.com/scylladb/scylladb/issues/13119 Adds product logos to cards Preview: ![Welcome-to-ScyllaDB-Documentation-ScyllaDB-Docs (1)](https://user-images.githubusercontent.com/9107969/224996621-6c93676d-1427-4a28-a529-fd3cd2bc2d61.png) Closes #13167 * github.com:scylladb/scylladb: docs: Update custom styles docs: Update styles docs: Add card logos	2023-03-23 08:53:58 +02:00
Botond Dénes	0f5e845399	Merge 'docs: scylladb better php driver' from Daniel Reis Hey y'all! @malusev998 and I are maintaining an updated version of the [PHP Driver](https://github.com/he4rt/scylladb-php-driver) together with the @he4rt community, and it has had a bunch of improvements over the last months. Before, it was working only on PHP 7.1 (DataStax branch), and on our branch we have it working on PHP 8.1 and 8.2. We are also using the ScyllaDB C++ Driver in this project, and I think it is a good idea to point new users to this project since it's the most up-to-date PHP Driver maintained now. What do y'all think about that? Closes #13218 * github.com:scylladb/scylladb: fix: links to php driver fix: adding php versions into driver's description docs: scylladb better php driver	2023-03-23 08:53:30 +02:00
Tzach Livyatan	2d40952737	DOCS: remove invalid example from DML reference, WHERE clause section Closes #12596	2023-03-22 18:37:20 +02:00
Nadav Har'El	d1e6d9103a	Merge 'api: reference httpd::* symbols like 'httpd::'' from Kefu Chai this change is a leftover of `063b3be8a7`, which failed to include the changes in the header files. it turns out we have `using namespace httpd;` in seastar's `request_parser.rl`, and we should not rely on this statement to expose the symbols in `seastar::httpd` to the `seastar` namespace. in this change, also, since `get_name()`, previously a non-static member function of `seastar_test`, is now a static member function, we need to update the tests which capture `this` for calling this function, so they don't capture `this` anymore. Closes #13202 github.com:scylladb/scylladb: test: drop unused captured variables Update seastar submodule	2023-03-22 18:16:15 +02:00
Kefu Chai	596ea6d439	test: drop unused captured variables this should silence the warning like: ``` test/boost/multishard_mutation_query_test.cc:493:29: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] do_with_cql_env_thread([this] (cql_test_env& env) -> future<> { ^~~~ test/boost/multishard_mutation_query_test.cc:577:29: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] do_with_cql_env_thread([this] (cql_test_env& env) -> future<> { ^~~~ 2 errors generated. ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-22 21:21:04 +08:00
Avi Kivity	4a18ee87eb	Update seastar submodule * seastar 9cbc1fe889...1204efbc5e (14): > http: Add lost pragma once into client.hh > prometheus, http: do not expose httpd::* in seastar > build: add haswell support > ci: fix configuration to build checkheaders target. > core: map_reduce: Fix use-after-free in variant with futurized reducer > Merge 'tests: support boost::test decorators and tolerate failures in test_spawn_input' from Kefu Chai > memory: support reallocing foreign (non-Seastar) memory on a reactor thread > test: futures: disable -Wself-move for GCC>=13 > map_reduce: do not move a temporary object > doc/building-dpdk.md: drop extraneous '$' > http: url_decode: translate plus back into char > Merge 'seastar-json2code: cleanups' from Kefu Chai > Fix markdown formatting > Merge 'Minor abort on OOM changes' from Travis Downs	2023-03-22 21:21:04 +08:00
Benny Halevy	c09d0f6694	everywhere: use sstables::generation_type Use generation_type rather than generation_type::int_t where possible and remove the deprecated functions accepting the int_t. Ref #10459 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:59:47 +02:00
Benny Halevy	b597f41b8c	test: sstable_test_env: use make_new_generation Also, add a bunch of make_sstable variants that get a generation_type param for this. With that, the entry points for generation_type::int_t are deprecated and their users will be converted in following patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:58:59 +02:00
Benny Halevy	a0e43af576	sstable_directory::components_lister::process: fixup indentation	2023-03-22 13:58:43 +02:00
Benny Halevy	a8dc2fda29	sstables: make highest_generation_seen return optional generation It is possible to find no generation in an empty table directory, and in the future, with uuid generations it'd be possible to find no numeric generations in the directory. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:55:23 +02:00
Benny Halevy	ba680a7b96	replica: table: add make_new_generation function make_new_generation generates a new generation from an optional one. If disengaged, it just generates a new generation based on the shard_id. Otherwise, it generates the next generation in sequence by adding smp::count to the previous value, like we do today. In the future, with uuid-based generations, the function could be used to generate a new random uuid based on the optional parameter. It will be up to the caller, e.g. replica::table or sstables manager to decide which kind of generation to create. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:52:22 +02:00
Benny Halevy	b28eacce6f	replica: table: move sstable generation related functions out of line updating the highest generation happens only during startup and creating sstables is done rarely enough that there is no reason to inline either function. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:49:18 +02:00
Benny Halevy	d4d480a374	test: sstables: use generation_type::int_t Convert all users to use sstables::generation_type::int_t. Further patches will continue to convert most to using sstables::generation_type instead so we can abstract the value type. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:48:50 +02:00
Benny Halevy	30cc0beb47	sstables: generation_type: define int_t So it can be used everywhere to prepare for uuid sstable generation support. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-22 13:36:52 +02:00
Vlad Zolotarov	f94bbc5b34	transport: add per-scheduling-group CQL opcode-specific metrics This patch extends a previous patch that added these metrics globally: - cql_requests_count - cql_request_bytes - cql_response_bytes This patch adds a "scheduling_group_name" label to these metrics and changes corresponding counters to be accounted on a per-scheduling-group level. As a bonus this patch also marks all 3 metrics as 'skip_when_empty'. Ref #13061 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20230321201412.3004845-1-vladz@scylladb.com>	2023-03-22 13:27:48 +02:00
Botond Dénes	ff87f95a26	reader_concurrency_semaphore: add trace points for important events Notably, to admission, execution and eviction. Registering/unregistering the permit as inactive is not traced, as this happens on every buffer-fill for range scans. Semaphore trace messages have a "[reader_concurrency_semaphore]" prefix to allow them to be clearly associated with the semaphore.	2023-03-22 04:58:18 -04:00
Botond Dénes	1f51f752cc	reader_permit: refresh trace_state on new pages To make sure all tracing done on a certain page will make its way into the appropriate trace session. This is a continuation of the previous patch (which added the trace pointer to the permit).	2023-03-22 04:58:10 -04:00
Botond Dénes	156e5d346d	reader_permit: keep trace_state pointer on permit And propagate it down to where it is created. This will be used to add trace points for semaphore related events, but this will come in the next patches.	2023-03-22 04:58:01 -04:00
Botond Dénes	27a4c24522	test/perf/perf_collection: give more unique names to key comparators perf.cc has two key comparators: key_compare and key_tri_compare. These are very generic names; in fact key_compare directly clashes with a comparator with the same name in types.hh. Avoid the clash by renaming both of these to a more unique name.	2023-03-22 04:58:01 -04:00
Nadav Har'El	2038388268	cql-pytest: translate Cassandra's tests for multi-column relations This is a translation of Cassandra's CQL unit test source file validation/operations/SelectMultiColumnRelationTest.java into our cql-pytest framework. The tests reproduce four already-known Scylla bugs and three new bugs. All tests pass on Cassandra. Because of these bugs 9 of the 22 tests are marked xfail, and one is marked skip (it crashes Scylla). Already known issues: Refs #64: CQL Multi column restrictions are allowed only on a clustering key prefix Refs #4178: Not covered corner case for key prefix optimization in filtering Refs #4244: Add support for mixing token, multi- and single-column restrictions Refs #8627: Cleanly reject updates with indexed values where value > 64k New issues discovered by these tests: Refs #13217: Internal server error when null is used in multi-column relation Refs #13241: Multi-column IN restriction with tuples of different lengths crashes Scylla Refs #13250: One-element multi-column restriction should be handled like a single-column restriction Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13265	2023-03-22 09:54:32 +02:00
Tzach Livyatan	083408723f	doc: Add Mumur term to the glossery Point to the difference between the official MurmurHash3 and Scylla / Cassandra implementation Update docs/glossary.rst Co-authored-by: Anna Stuchlik <37244380+annastuchlik@users.noreply.github.com> Closes #11369	2023-03-21 22:45:47 +02:00
Alejo Sanchez	da00052ad8	gms, service: replicate live endpoints on shard 0 Call replicate_live_endpoints on shard 0 to copy from 0 to the rest of the shards. And get the list of live members from shard 0. Move lock to the callers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #13240	2023-03-21 15:46:12 +01:00
Gleb Natapov	fd6d45e178	bootstrapper: Add get_random_bootstrap_tokens function Does the same as get_bootstrap_tokens() but does not consult initial token config option. Will be used later.	2023-03-21 16:06:43 +02:00
Gleb Natapov	fc84c69b7e	service: raft: add support for topology_change command into raft_group0_client Extend raft_group0_client::prepare_command with support of topology_change type of command.	2023-03-21 16:06:43 +02:00
Gleb Natapov	16d61e791f	service: raft: introduce topology_change group0 command Also extend group0_command to be able to send new command type. The command consists of a mutation array.	2023-03-21 16:06:43 +02:00
Gleb Natapov	5e232ebee5	system_keyspace: add a table to persist topology change state machine's state Add local table to store topology change state machine's state there. Also add a function that loads the state to memory.	2023-03-21 16:06:43 +02:00
Gleb Natapov	a2b7d2c1a1	service: Introduce topology state machine data structures The topology state machine will track all the nodes in a cluster, their state, properties (topology, tokens, etc) and requested actions. Node state can be one of those: none - the node is not yet in the cluster bootstrapping - the node is currently bootstrapping decommissioning - the node is being decommissioned removing - the node is being removed replacing - the node is replacing another node normal - the node is working normally rebuild - the node is being rebuilt left - the node has left the cluster Nodes in state left are never removed from the state. Tokens also can be in one of the states: write_both_read_old - writes are going to new and old replicas, but reads are still from old replicas write_both_read_new - writes are still going to old and new replicas but reads are from new replicas owner - tokens are owned by the node and reads and writes go to the new replica set only Tokens that need to be moved start in the 'write_both_read_old' state. After the entire cluster learns about it, streaming starts. After the streaming, tokens move to the 'write_both_read_new' state and again the whole cluster needs to learn about it and make sure no reads started before that point exist in the system. After that tokens may move to the 'owner' state. topology_request is the field through which a topology operation request can be issued to a node. A request is one of the topology operations currently supported: join, leave, replace or remove.	2023-03-21 16:06:43 +02:00
Gleb Natapov	dd1e27736e	storage_proxy: not consult topology on local table write Writes to tables with local replication strategies do not need to consult the topology. This is not only an optimization but it allows writing into the local tables before topology is known.	2023-03-21 16:06:43 +02:00
Anna Stuchlik	922f6ba3dd	doc: fix the service name in upgrade guides Fixes https://github.com/scylladb/scylladb/issues/13207 This commit fixes the service and package names in the upgrade guides 5.0-to-2022.1 and 5.1-to-2022.2. Service name: scylla-server Package name: scylla-enterprise Previous PRs to fix the same issue in other upgrade guides: https://github.com/scylladb/scylladb/pull/12679 https://github.com/scylladb/scylladb/pull/12698 This commit must be backported to branch-5.1 and branch-5.2. Closes #13225	2023-03-21 15:56:28 +02:00
Kefu Chai	124410c059	api: reference httpd::* symbols like 'httpd::' this change is a leftover of `063b3be`, which failed to include the changes in the header files. it turns out we have `using namespace httpd;` in seastar's `request_parser.rl`, and we should not rely on this statement to expose the symbols in `seastar::httpd` to `seastar` namespace. in this change, api/.hh: all httpd symbols are referenced by `httpd::` instead of being referenced as if they are in `seastar`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-21 15:49:10 +02:00
Avi Kivity	19810cfc5e	transport: correctly format unknown opcode gcc allows an enum to contain values outside its members. For extra safety, as this can be user visible, format the unknown opcode and return it.	2023-03-21 15:43:00 +02:00
Avi Kivity	e75009cd49	treewide: catch by reference gcc rightly warns about capturing by value, so capture by reference.	2023-03-21 15:43:00 +02:00
Avi Kivity	eaad38c682	test: raft: avoid confusing string compare gcc doesn't like comparing a C string to an sstring -- apparently it has different promotion rules than clang. Fix by doing an explicit conversion.	2023-03-21 15:43:00 +02:00
Avi Kivity	bdfc0aa748	utils, types, test: extract lexicographical compare utilities UUID_test uses lexicographical_compare from the types module. This is a layering violation, since UUIDs are at a much lower level than the database type system. In practical terms, this causes link failures with gcc due to some thread-local-storage variables defined in types.hh but not provided by any object, since we don't link with types.o in this test. Fix by extracting the relevant functions into a new header.	2023-03-21 15:42:53 +02:00
Avi Kivity	32a724fada	test: raft: fsm_test: disambiguate raft::configuration construction gcc thinks the constructor call is ambiguous since "{}" can match the default constructor. Fix by making the parameter type explicit. Use "{}" for the constructor call to avoid the most-vexing-parse problem.	2023-03-21 13:45:57 +02:00
Avi Kivity	83e149c341	test: reader_concurrency_semaphore_test: handle all enum values gcc considers values outside the enum class enumeration lists to be valid, so handle them. In this case, we don't think they can happen, so abort.	2023-03-21 13:45:57 +02:00
Avi Kivity	bc0bba10b4	repair: fix signed/unsigned compare Fix the loop induction variable to have the same type as the termination value.	2023-03-21 13:45:49 +02:00
Avi Kivity	94a10ed6ab	repair: fix incorrect signed/unsigned compare A signed/unsigned compare can overflow. Fix by using the safer std::cmp_greater(). The problem is minor as the user is unlikely to send a negative id.	2023-03-21 13:45:34 +02:00
Avi Kivity	a806024e1d	treewide: avoid unused variables in if statements gcc warns about unused variables declared in if statements. Just drop them.	2023-03-21 13:42:49 +02:00
Avi Kivity	9ced89a41c	keys: disambiguate construction from initializer_list<bytes> Some tests initialize via an initializer_list, but gcc finds other valid constructors via vector<managed_bytes>. Disambiguate by adding a constructor that accepts the initializer_list, and forward to the wanted constructor.	2023-03-21 13:42:49 +02:00
Avi Kivity	41a2856f78	cql3: expr: fix serialize_listlike() reference-to-temporary with gcc serialize_listlike() is called with a range of either managed_bytes or managed_bytes_opt. If the former, then iterating and assigning to a loop induction variable of type managed_bytes_opt& will bind the reference to a temporary managed_bytes_opt, which gcc dislikes. Fix by performing the binding in a separate statement, which allows for lifetime extension.	2023-03-21 13:42:49 +02:00
Avi Kivity	32cc975b2f	compaction: error on invalid scrub type gcc allows an enum to contain a value outside its enum set, so we need to handle it. Since it shouldn't happen, signal an internal error.	2023-03-21 13:42:49 +02:00
Avi Kivity	7bb717d2f9	treewide: prevent redefining names gcc dislikes a member name that matches a type name, as it changes the type name retroactively. Fix by fully-qualifying the type name, so it is not changed by the newly-introduced member.	2023-03-21 13:42:49 +02:00
Avi Kivity	7ab65379b9	api: task_manager: fix signed/unsigned compare Trivial fix by changing the type of the induction variable.	2023-03-21 13:42:42 +02:00
Avi Kivity	429650e508	alternator: streams: fix signed/unsigned comparison We compare a signed variable to an unsigned one, which can yield surprising results. In this case, it is harmless since we already validated the signed input is positive, but use std::cmp_less() to quench any doubts (and warnings).	2023-03-21 13:41:53 +02:00
Nadav Har'El	77bf90bf7d	Merge 'Sanitize {format_types\|version_types} to/from string converters' from Pavel Emelyanov There's a need to convert both -- version and format -- to string and back. Currently, there's a dispersed set of helpers in sstables/ code doing that and this PR brings some order to it - adds fmt::formatter<> specialization for both types - leaves one set of {format\|version}_from_string() helpers converting any string-ish object into value refs: #12523 Closes #13214 * github.com:scylladb/scylladb: sstables: Expell sstable_version_types from_string() helper sstables: Generalize ..._from_string helpers sstables: Implement fmt::formatter<sstable_format_types> sstables: Implement fmt::formatter<sstable_version_types> sstables: Move format maps to namespace scope	2023-03-21 13:39:24 +02:00
Avi Kivity	0770b328c7	test: fix some mismatched signed/unsigned comparisons gcc likes to complain about signed/unsigned compares as they can yield surprising results. The fixes are trivial, so apply them.	2023-03-21 13:15:12 +02:00
Pavel Emelyanov	970fc80ea6	feature_service: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:59:37 +03:00
Pavel Emelyanov	8600cb2db0	feature_service: Move async context into enable() Callers don't need to know that enabling features has this requirement Indentation is deliberately left broken (until next patch) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:59:34 +03:00
Pavel Emelyanov	ae6e29a919	system_keyspace: Refactor local features load/save helpers Introduce load_local_enabled_features() and save_local_enabled_features() that get and put std::set<sstring> with feature names (and perform set to string and back conversions on their own). They look natural next to existing sys.ks. methods to get/set local-supported features and peer features. Using the new API, the more generic functions to preserve individual features and load them on startup can become much shorter and cleaner. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:54:02 +03:00
Wojciech Mitros	406ea34aba	build: add wasm compilation target for rust In the future, when testing WASM UDFs, we will only store the Rust source codes of them, and compile them to WASM. To be able to do that, we need rust standard library for the wasm32-wasi target, which is available as an RPM called rust-std-static-wasm32-wasi. Closes #12896 [avi: regenerate toolchain] Closes #13258	2023-03-21 10:30:08 +02:00
Pavel Emelyanov	6a5ab87441	feature_service: Mark supported_feature_set() const It's indeed such Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:12:29 +03:00
Pavel Emelyanov	985fbf703a	feature_service: Remove single feature enabling method No longer used Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:12:28 +03:00
Pavel Emelyanov	b27d2c9399	boot: Enable features in batch On boot main calls enable_features_on_startup() which at the end scans through the list of features and enables them. Same as in previous patch -- it makes sense to use batch enabling here. Note, that despite the loop that collects features is not as trivial as in previous patch (gossiper case), it still operates with local copies of feature sets so delaying the feature's enabling doesn't affect other features' need to be enabled too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:12:25 +03:00
Pavel Emelyanov	256dd9d7e3	gossiper: Enable features in batch Gossiper code walks the list of feature names and enables them one-by-one. However, in the feature_service code there's a method that enables features in batch. Using it now doesn't make any difference, but next patches will make some use of it. Also, this will let shortening feature_service's API and will make it simpler to remove qctx thing from there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 11:12:16 +03:00
Pavel Emelyanov	fe7609865d	Merge 'reader_concurrency_semaphore: improve diagnostics printout' from Botond Dénes Remove redundant "Total: ..." line. Include the entire `reader_concurrency_semaphore::stats` in the printout. This includes a lot of metrics not exported to monitoring. These metrics are very valuable when debugging timeouts but are otherwise uninteresting. To avoid bloating our monitoring with such niche metrics, we dump them when they are interesting: when timeouts happen. To be really helpful, we do need historic values too, but this shouldn't be a problem: timeouts come in bursts, we usually get at least a handful of diagnostics dumps at a time. New stats are also added to record the reason why reads are queued on the semaphore. Printout before: ``` INFO 2023-03-14 12:43:54,496 [shard 0] reader_concurrency_semaphore - Semaphore test_reader_concurrency_semaphore_memory_limit_no_leaks with 4/4 count and 7168/4096 memory resources: kill limit triggered, dumping permit diagnostics: permits count memory table/description/state 4 4 7K ./reader/active/unused 2 0 0B ./reader/waiting_for_admission 6 4 7K total Total: 6 permits with 4 count and 7K memory resources ``` Printout after: ``` INFO 2023-03-16 04:23:41,791 [shard 0] reader_concurrency_semaphore - Semaphore test_reader_concurrency_semaphore_memory_limit_no_leaks with 3/4 count and 7168/4096 memory resources: kill limit triggered, dumping permit diagnostics: permits count memory table/description/state 2 2 6K ./reader/active/unused 1 1 1K ./reader/waiting_for_memory 2 0 0B ./reader/waiting_for_admission 5 3 7K total Stats: permit_based_evictions: 0 time_based_evictions: 0 inactive_reads: 0 total_successful_reads: 0 total_failed_reads: 0 total_reads_shed_due_to_overload: 0 total_reads_killed_due_to_kill_limit: 1 reads_admitted: 4 reads_enqueued_for_admission: 4 reads_enqueued_for_memory: 5 reads_admitted_immediately: 2 reads_queued_because_ready_list: 0 reads_queued_because_used_permits: 0 
reads_queued_because_memory_resources: 0 reads_queued_because_count_resources: 4 reads_queued_with_eviction: 0 total_permits: 6 current_permits: 5 used_permits: 0 blocked_permits: 0 disk_reads: 0 sstables_read: 0 ``` Closes #13173 * github.com:scylladb/scylladb: test/boost/reader_concurrency_semaphore_test: remove redundant stats printouts reader_concurrency_semaphore: do_dump_reader_permit_diagnostics(): print the stats reader_concurrency_semaphore: add stats to record reason for queueing permits reader_concurrency_semaphore: can_admit_read(): also return reason for rejection	2023-03-21 10:41:11 +03:00
Pavel Emelyanov	eecb9244dd	sstables: Expell sstable_version_types from_string() helper Its name is too generic despite its narrow specialization. Also, there's a version_from_string() method that does the same in a more convenient way. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 09:56:18 +03:00
Pavel Emelyanov	4e99637777	sstables: Generalize ..._from_string helpers There are two string->{version\|format} converters living on class sstable. It's better to have both in namespace scope. Surprisingly, there's only one caller of it. Also this patch makes both accept std::string_view not to limit the helpers in converting only sstring&-s. This change calls for a reverse_map template update with "heterogeneous lookup". Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 09:56:18 +03:00
Pavel Emelyanov	bb59dc2ec1	sstables: Implement fmt::formatter<sstable_format_types> Same as in previous patch for another enum-class type. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 09:56:18 +03:00
Pavel Emelyanov	6b04eb74d6	sstables: Implement fmt::formatter<sstable_version_types> This way the version type can be fed as-is into fmt:: code, respectively the conversion to string is as simple as fmt::to_string(v). So also drop the explicit existing to_string() helper updating all callers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 09:56:18 +03:00
Pavel Emelyanov	ea1c6fbf98	sstables: Move format maps to namespace scope They will be used by fmt::formatter specification for version and format types in next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-21 09:56:18 +03:00
Nadav Har'El	511308bccf	test/cql-pytest: tests for single-element multi-column restrictions It turns out that Cassandra handles a restriction like `(c2) = (1)` just like `c2 = 1`, and is not limited like multi-column restrictions. In particular, this query works despite missing "c1", and may also use an index if c2 is indexed. But currently in Scylla, `(c2) = (1)` is handled like a multi-column restriction, so complains if c2 is not the first clustering key column, and cannot use an index. This patch adds several tests demonstrating this difference between Scylla and Cassandra (#13250). The xfailing tests pass on Cassandra but fail on Scylla. Refs #13250 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13252	2023-03-21 07:56:24 +02:00
Anna Stuchlik	26bb36cdf5	doc: related https://github.com/scylladb/scylladb/issues/12754 ; add the missing information about reporting latencies to the upgrade guide 5.1 to 5.2 Closes #12935	2023-03-21 07:17:07 +02:00
Kefu Chai	faa47e9624	mutation: drop operator<<(ostream, const range_tombstone{_change,} &) as all of its callers have been removed, let's drop these two operators. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-21 11:37:07 +08:00
Kefu Chai	d146535ec6	mutation: use fmtlib to print range_tombstone{_change,} prepare for removing `operator<<(std::ostream&, const range_tombstone&)` and `operator<<(std::ostream& out, const range_tombstone_change&)`. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-21 11:37:07 +08:00
Kefu Chai	755aea8e7f	mutation: mutation_fragment_v2: specialize fmt::formatter<range_tombstone_change> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print range_tombstone_change without using ostream<<. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-21 11:37:07 +08:00
Kefu Chai	4af0a0ed19	mutation: range_tombstone: specialize fmt::formatter<range_tombstone> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print range_tombstone. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-21 11:37:07 +08:00
Daniel Reis	3d1c78bdcc	fix: links to php driver	2023-03-20 15:28:00 -03:00
Daniel Reis	f83f844319	fix: adding php versions into driver's description	2023-03-20 15:25:52 -03:00
Kefu Chai	b11fd28a46	dist/redhat: split Requires section into multiple lines for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-20 22:25:24 +08:00
Kefu Chai	7165551fd7	dist/redhat: enforce dependency on %{release} also s/%{version}/%{version}-%{release}/ in `Requires:` sections. this enforces the runtime dependencies of exactly the same releases between scylla packages. Fixes #13222 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-20 22:25:24 +08:00
Avi Kivity	0f97d464d3	Merge 'cql: check if the function is builtin when granting permissions' from Wojciech Mitros Currently, when granting a permission on a function resource, we only check if the function exists, regardless of whether it's a user or a builtin function. We should not support altering permissions on builtin functions, so this patch adds a check for confirming that the found function is not builtin. Additionally, adjust an error exception thrown when trying to alter a permission that does not apply on a given resource Closes #13184 * github.com:scylladb/scylladb: cql: change exception type when granting incorrect permissions cql: check if the function is builtin when granting permissions	2023-03-20 16:17:02 +02:00
Kefu Chai	476bd84dd0	config: add a space before parameter for better consistency in the code formatting. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13248	2023-03-20 16:03:00 +02:00
Botond Dénes	bf8b746bca	Merge 'utils: UUID: specialize fmt::formatter for UUID and tagged_uuid<>' from Kefu Chai this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print UUID without using ostream<<. also, this change re-implements some formatting helpers using fmtlib for better performance and less dependencies on operator<<(), but we cannot drop it at this moment, as quite a few caller sites are still using operator<<(ostream&, const UUID&) and operator<<(ostream&, tagged_uuid<T>&). we will address them separately. * add `fmt::formatter<UUID>` * add `fmt::formatter<tagged_uuid<T>>` * implement `UUID::to_string()` using `fmt::to_string()` * implement `operator<<(std::ostream&, const UUID&)` with `fmt::print()`, this should help to improve the performance when printing uuid, as `fmt::print()` does not materialize a string when printing the uuid. * treewide: use fmtlib when printing UUID Refs #13245 Closes #13246 * github.com:scylladb/scylladb: treewide: use fmtlib when printing UUID utils: UUID: specialize fmt::formatter for UUID and tagged_uuid<>	2023-03-20 14:26:11 +02:00
Gleb Natapov	34d41177fe	storage_service: pass storage_proxy and system_distributed_keyspace objects to messaging initialization Will be needed there later. Message-Id: <20230316112801.1004602-14-gleb@scylladb.com>	2023-03-20 11:58:50 +01:00
Gleb Natapov	d8edd2055f	service: raft: add several accessors to group0 class They will be used by later patches. Message-Id: <20230316112801.1004602-13-gleb@scylladb.com>	2023-03-20 11:57:18 +01:00
Gleb Natapov	7d535a84bb	servers: raft: make remove_from_raft_config public Will be used by later patches. Message-Id: <20230316112801.1004602-11-gleb@scylladb.com>	2023-03-20 11:47:55 +01:00
Gleb Natapov	f017aa1ad3	service: raft: pass storage service to group0_state_machine To apply topology_change commands group0_state_machine needs to have an access to the storage service to support topology changes over raft. Message-Id: <20230316112801.1004602-10-gleb@scylladb.com>	2023-03-20 11:45:57 +01:00
Gleb Natapov	a690070722	raft_sys_table_storage: give initial snapshot a non zero value We create a snapshot (config only, but still), but do not assign it any id. Because of that it is not loaded on start. We do want it to be loaded though since the state of group0 will not be re-created from the log on restart because the entries will have outdated id and will be skipped. As a result in memory state machine state will not be restored. This is not a problem now since schema state is restored outside of raft code. Message-Id: <20230316112801.1004602-5-gleb@scylladb.com>	2023-03-20 11:45:38 +01:00
Gleb Natapov	2fc8e13dd8	raft: add server::wait_for_state_change() function Add a function that allows waiting for a state change of a raft server. It is useful for a user that wants to know when a node becomes/stops being a leader. Message-Id: <20230316112801.1004602-4-gleb@scylladb.com>	2023-03-20 11:31:55 +01:00
Gleb Natapov	59f7aeb79b	raft: move some functions out of ad-hoc section Make tick() and is_leader() part of the API. First is used externally already and another will be used in following patches. Message-Id: <20230316112801.1004602-3-gleb@scylladb.com>	2023-03-20 11:25:19 +01:00
Nadav Har'El	c550e681d7	test/rest_api: fix flaky test for toppartitions The REST test test_storage_service.py::test_toppartitions_pk_needs_escaping was flaky. It tests the toppartition request, which unfortunately needs to choose a sampling duration in advance, and we chose 1 second which we considered more than enough - and indeed typically even 1ms is enough! But very rarely (we know of only one occurrence, in issue #13223) one second is not enough. Instead of increasing this 1 second and making this test even slower, this patch takes a retry approach: The test starts with a 0.01 second duration, and is then retried with increasing durations until it succeeds or a 5-second duration is reached. This retry approach has two benefits: 1. It de-flakes the test (allowing a very slow test to take 5 seconds instead of 1 second which wasn't enough), and 2. At the same time it makes a successful test much faster (it used to always take a full second, now it takes 0.07 seconds on a dev build on my laptop). A failed test may, in some cases, take 10 seconds after this patch (although in some other cases, an error will be caught immediately), but I consider this acceptable - this test should pass, after all, and a failure indicates a regression and taking 10 seconds will be the least of our worries in that case. Fixes #13223. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13238	2023-03-20 11:32:53 +02:00
Kefu Chai	0ba6627d5c	wasm: block all signals in alien thread as in main(), we use `stop_signal` to handle SIGINT and SIGTERM, so when scylla receives a SIGTERM, the corresponding signal handler could get called on any thread created by this program. so there is a chance that the alien_runner thread could be chosen to run the signal handler set up by `main()`, but that signal handler assumes the availability of Seastar reactor. unfortunately, we don't have a Seastar reactor in alien thread. the same applies to Seastar's `thread_pool` which handles the slow and blocking POSIX calls typically used for interacting with files. so, in this change, we use the same approach as Seastar's `thread_pool::work()` -- just block all signals, so the alien threads used by wasm for compiling UDF won't handle the signals using the handlers planted by `main()`. Fixes #13228 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13233	2023-03-20 11:20:19 +02:00
Avi Kivity	bab29a2f27	Merge 'Unit tests cleanup for sstable generation changes' from Benny Halevy This series cleans up unit test in preparation for PR #12994. Helpers are added (or reused) to not rely on specific sstable generation numbers where possible (other than loading reference sstables that are committed to the repo with given generation numbers), and to generate the sstables for tests easily, taking advantage of generation management in `sstable_test_env`, `table_for_tests`, or `replica::table` itself. Closes #13242 * github.com:scylladb/scylladb: test: add verify_mutation helpers. test: add make_sstable_containing memtable test: table_for_tests: add make_sstable function test: sstable_test_env: add make_sst_factory methods test: sstable_compaction_test: do not rely on specific generations tests: use make_sstable defaults as much as possible test: sstable_test_env: add make_table_for_tests test: sstable_datafile_test: do not rely on sepecific sstable generations test: sstable_test_env: add reusable_sst(shared_sstable) sstable: expose get_storage function test: mutation_reader_test: create_sstable: do not rely on specific generations test: mutation_reader_test: do_test_clustering_order_merger_sstable_set: rely on test_envsstable generation test: mutation_reader_test: combined_mutation_reader_test: define a local sst_factory function test: mutation_reader_test: do not use tmpdir test: use big format by default test: sstable_compaction_test: use highest sstable version by default test: test_env: make_db_config: set cfg host_id test: sstable_datafile_test: fixup indentation test: sstable_datafile_test: various tests: do_with_async test: sstable_3_x_test: validate_read, sstable_assertions: get shared_sstable test: sstable_3_x_test: compare_sstables: get shared_sstable test: sstable_3_x_test: write_sstables: return shared_sstable test: sstable_3_x_test: write, compare, validate_sstables: use env.tempdir test: sstable_3_x_test: compacted_sstable_reader: do not 
reopen compacted_sst test: lib: test_services: delete now unused stop_and_keep_alive test: sstable_compaction_test: use deferred_stop to stop table_for_tests test: sstable_compaction_test: compound_sstable_set_incremental_selector_test: do_with_async test: sstable_compaction_test: sstable_needs_cleanup_test: do_with_async test: sstable_compaction_test: leveled_05: fixup indentation test: sstable_compaction_test: leveled_05: do_with_async test: sstable_compaction_test: compact_02: do_with_async test: sstable_compaction_test: compact_sstables: simplify variable allocation test: sstable_compaction_test: compact_sstables: reindent test: sstable_compaction_test: compact_sstables: use thread test: sstable_compaction_test: sstable_rewrite: simplify variable allocation test: sstable_compaction_test: sstable_rewrite: fixup indentation test: sstable_compaction_test: sstable_rewrite: do_with_async test: sstable_compaction_test: compact: fixup indentation test: sstable_compaction_test: compact: complete conversion to async thread test: sstable_compaction_test: compaction_manager_basic_test: rename generations to idx	2023-03-20 11:16:46 +02:00
Nadav Har'El	8b0822be77	test/cql-pytest: reproducer for bug crashing Scylla on mismatched tuple This patch adds a reproducing test for issue #13241, where attempting a SELECT restriction (b,c,d) IN ((1,2)) - where the tuple is shorter than needed - crashes Scylla (on segmentation fault) instead of generating a clean error as it should (and as done on Cassandra). The test also demonstrates that if the tuple is longer than needed (instead of shorter), the behavior is correct, and it is also correct if "=" is used instead of IN. Only the combination of IN and too-short tuple seems to be broken - but broken in a bad way (can be used to crash Scylla). Because the test crashes Scylla when it fails, it is marked "skip". Refs #13241 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13244	2023-03-20 11:13:02 +02:00
Anna Stuchlik	fc927b1774	doc: add the Enterprise vs. OSS Matrix Fixes https://github.com/scylladb/scylladb/issues/12758 This commit adds a new page with a matrix that shows on which ScyllaDB Open Source versions we based given ScyllaDB Enterprise versions. The new file is added to the newly created Reference section. Closes #13230	2023-03-20 10:18:10 +02:00
Kefu Chai	94c6df0a08	treewide: use fmtlib when printing UUID this change tries to reduce the number of callers using operator<<() for printing UUID. they are found by compiling the tree after commenting out `operator<<(std::ostream& out, const UUID& uuid)`. but this change alone is not enough to drop all callers, as some callers are using `operator<<(ostream&, const unordered_map&)` and other overloads to print ranges whose elements contain UUID. so in order to limit the scope of the change, we are not changing them here. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-20 15:38:45 +08:00
Kefu Chai	c14c70b89d	utils: UUID: specialize fmt::formatter for UUID and tagged_uuid<> this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print UUID without using ostream<<. also, this change reimplements some formatting helpers using fmtlib for better performance and less dependencies on operator<<(), but we cannot drop it at this moment, as quite a few caller sites are still using operator<<(ostream&, const UUID&) and operator<<(ostream&, tagged_uuid<T>&). we will address them separately. * add fmt::formatter<UUID> * add fmt::formatter<tagged_uuid<T>> * implement UUID::to_string() using fmt::to_string() * implement operator<<(std::ostream&, const UUID&) with fmt::print(), this should help to improve the performance when printing uuid, as fmt::print() does not materialize a string when printing the uuid. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-20 14:25:45 +08:00
Botond Dénes	583e49dd09	Merge 'cmake: sync with `configure.py` (14/n)' from Kefu Chai this is the 14th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals: - to enable developers to use CMake for building scylla. so they can use tools (CLion for instance) with CMake integration for better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. this changeset includes the following changes: - build: cmake: promote add_scylla_test() to test/ - build: cmake: add all tests Closes #13220 * github.com:scylladb/scylladb: build: cmake: add all tests build: cmake: promote add_scylla_test() to test/	2023-03-20 08:13:07 +02:00
Pavel Emelyanov	c88e47a624	memory_data_sink: Add move ctor To make it possible to move the class member away, resetting it to be empty at the same time. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13208	2023-03-20 07:55:20 +02:00
Pavel Emelyanov	b631081df8	test: Fixie for test sstable chdir Some unit tests want to change the sstable::_dir on the fly. However, the sstable::_dir is going away, so it needs yet another virtual call on storage driver. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13213	2023-03-20 07:28:22 +02:00
Benny Halevy	d62df5cac6	test: add verify_mutation helpers. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:48:22 +02:00
Benny Halevy	cf4eaa1fbc	test: add make_sstable_containing memtable Helper for make_sstable + write_memtable_to_sstable_for_test + reusable_sst / load. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:48:22 +02:00
Benny Halevy	0ce6afb5f9	test: table_for_tests: add make_sstable function table_for_tests uses a sstables manager to generate sstables and gets the new generation from table.calculate_generation_for_new_table(). The version to use is either the highest supported or an ad-hoc version passed to make_sstable. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:48:22 +02:00
Benny Halevy	88d085ea66	test: sstable_test_env: add make_sst_factory methods The tests extensively use a `std::function<shared_sstable()>` to generate new tables. Rather than handcrafting them all over the place, let sstable_test_env return such factory given a schema (and another entry point that also gets a version) and that uses the embedded generation_factory in the test_env to generate new sstables with unique generations. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:48:22 +02:00
Benny Halevy	c308ba635b	test: sstable_compaction_test: do not rely on specific generations No need to maintain static generation numbers in the test. Let the sstable_test_env dispatch sstable generations automatically and use the generated sstables themselves for reference rather than their generation numbers. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:47:46 +02:00
Benny Halevy	51b2c38472	tests: use make_sstable defaults as much as possible Add a few goodies to sstable_test_env to extend entry points with default params for make_sstable and reusable_sst. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:47:14 +02:00
Benny Halevy	084f4e4fde	test: sstable_test_env: add make_table_for_tests Wrap table_for_tests ctor to pass the env sstables_manager as well as the temporary directory path, as this is the most common use case, and in preparation for adding a make_sstable method in table_for_tests. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:47:14 +02:00
Benny Halevy	e9af4e4cd8	test: sstable_datafile_test: do not rely on specific sstable generations There is no need to use specific generations in the test, just rely on the ones sstable_test_env generates. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:46:47 +02:00
Benny Halevy	94192f0ded	test: sstable_test_env: add reusable_sst(shared_sstable) Allow generating a sstable object from an existing sstable to get the directory, generation, and version from it, rather than passing them to reusable_sst from other sources - since the intention is to get a new sstable object based on an existing sstable that was generated by the test. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:20:07 +02:00
Benny Halevy	b11e2c81ae	sstable: expose get_storage function To be used by sstable_test_env to reopen existing sstables. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 17:19:12 +02:00
Benny Halevy	e9c3f0e478	test: mutation_reader_test: create_sstable: do not rely on specific generations No need to maintain static generation numbers in the test. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	648ab706df	test: mutation_reader_test: do_test_clustering_order_merger_sstable_set: rely on test_env sstable generation Rather than maintaining a running generation number, use the default env.make_sstable(s) in sst_factory and collect the expected generations from the resulting shared sstable. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	11595b3024	test: mutation_reader_test: combined_mutation_reader_test: define a local sst_factory function For generating shared_sstables with increasing generations (using the test_env make_sstable generations) and a given level. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	506dc1260f	test: mutation_reader_test: do not use tmpdir Rely on the test_env temporary directory instead. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	ceb5d4fb47	test: use big format by default No need to pass the big format explicitly as it's set by default by make_sstable and it is never overridden. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	f24b69a6ae	test: sstable_compaction_test: use highest sstable version by default Tests should just generate the highest sstable version available. There is no need to continue testing old versions, in particular partially supported ones like "la". Use also the default values for sstable::format_types, buffer_size, etc. if there's no particular need to override them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	df5347fca8	test: test_env: make_db_config: set cfg host_id So we can safely use `me` sstables in sstable_directory_test that validates the sstable host owner. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	8b168869be	test: sstable_datafile_test: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	1fce7c76a5	test: sstable_datafile_test: various tests: do_with_async To simplify further cleanups. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	2954feb734	test: sstable_3_x_test: validate_read, sstable_assertions: get shared_sstable Pass the test-generated shared_sstable to validate_read and then to sstable_assertions so it can be used for make_sstable version and generation params. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	969ec8611e	test: sstable_3_x_test: compare_sstables: get shared_sstable Use the sstable generated by the test to generate the result_filename we want for compare. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	3ba0d1659c	test: sstable_3_x_test: write_sstables: return shared_sstable To be passed to compare_sstable in the next patch, so it can generate the result filename out of it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	4c842fb0e8	test: sstable_3_x_test: write, compare, validate_sstables: use env.tempdir Do not create a tmpdir every time, just use the one that the sstable test env provides. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	71c0c713ee	test: sstable_3_x_test: compacted_sstable_reader: do not reopen compacted_sst Just use the one we created during compaction for verification so we won't have to rely on a particular generation/version. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	e385575407	test: lib: test_services: delete now unused stop_and_keep_alive Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	0bf60d42aa	test: sstable_compaction_test: use deferred_stop to stop table_for_tests Rather than calling cf.stop_and_keep_alive() before the test exits, since it must be stopped also on failure. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	208726d987	test: sstable_compaction_test: compound_sstable_set_incremental_selector_test: do_with_async Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	9d83a94c28	test: sstable_compaction_test: sstable_needs_cleanup_test: do_with_async Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	d8a354a35e	test: sstable_compaction_test: leveled_05: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	8b8c1c5813	test: sstable_compaction_test: leveled_05: do_with_async Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	d1879a5932	test: sstable_compaction_test: compact_02: do_with_async Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	76799d08d6	test: sstable_compaction_test: compact_sstables: simplify variable allocation No need to use lw_shared all over the place now that the function uses a seastar thread. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	af106684ae	test: sstable_compaction_test: compact_sstables: reindent Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	8de808ff15	test: sstable_compaction_test: compact_sstables: use thread Prepare for using make_sstable_containing in a follow up patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	f4989f2ba5	test: sstable_compaction_test: sstable_rewrite: simplify variable allocation No need to use lw_shared all over the place now that the function uses a seastar thread. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	fb379709cf	test: sstable_compaction_test: sstable_rewrite: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	b27910cff2	test: sstable_compaction_test: sstable_rewrite: do_with_async simplify flow using seastar thread. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	d1a112a156	test: sstable_compaction_test: compact: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	d503eb75f1	test: sstable_compaction_test: compact: complete conversion to async thread We already use test_env::do_with_async in this function but we didn't take full advantage of it to simplify the implementation. Do that before further changes are made. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:53:56 +02:00
Benny Halevy	237c844901	test: sstable_compaction_test: compaction_manager_basic_test: rename generations to idx The function used `calculate_generation_for_new_table` for the sstables generation. The so-called `generations` are just used to generate key indices. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-19 16:52:21 +02:00
Botond Dénes	9859bae54f	Merge 'Ignore no such column family in repair' from Aleksandra Martyniuk While repair requested by user is performed, some tables may be dropped. When the repair proceeds to these tables, it should skip them and continue with others. When no_such_column_family is thrown during user requested repair, it is logged and swallowed. Then the repair continues with the remaining tables. Fixes: #13045 Closes #13068 * github.com:scylladb/scylladb: repair: fix indentation repair: continue user requested repair if no_such_column_family is thrown repair: add find_column_family_if_exists function	2023-03-19 15:16:02 +02:00
Botond Dénes	b1c7538e92	Merge 'Give table a reference to storage_options' from Pavel Emelyanov The `storage_options` describes where sstables should be located. Currently the object resides on keyspace_metadata, and is thus not available at the place it's needed the most -- the `table::make_sstable()` call. This set converts keyspace_metadata::storage_opts to be lw-shared-ptr and shares the ptr with class table. refs: #12523 (detached small change from large PR) Closes #13212 * github.com:scylladb/scylladb: table: Keep storage options lw-shared-ptr keyspace_metadata: Make storage options lw-shared-ptr	2023-03-19 15:16:02 +02:00
Avi Kivity	a7099132cc	scripts/pull_github_pr.sh: optionally authenticate This helps overcome rate limits for unauthenticated requests, preventing maintainers from getting much-needed rest. Closes #13210	2023-03-19 15:16:02 +02:00
Kefu Chai	c5b6c91412	db: data_listener: mark data_listener's dtor virtual Clang-17 warns when we try to delete a pointer to a class with virtual function(s) but without marking its dtor virtual. in this change, we mark the dtor of the base class of `table_listener` virtual to address the warning. we have another solution though -- to mark `table_listener` `final`. as we don't destruct `table_listener` with a pointer to its base classes. but it'd be much simpler to just mark the dtor virtual of its base class with virtual method(s). it's much more idiomatic this way, and less error-prone. this change should silence the warning like: ``` In file included from /home/kefu/dev/scylladb/test/boost/data_listeners_test.cc:9: In file included from /usr/include/boost/test/unit_test.hpp:18: In file included from /usr/include/boost/test/test_tools.hpp:46: In file included from /usr/include/boost/test/tools/old/impl.hpp:20: In file included from /usr/include/boost/test/tools/assertion_result.hpp:21: In file included from /usr/include/boost/shared_ptr.hpp:17: In file included from /usr/include/boost/smart_ptr/shared_ptr.hpp:17: In file included from /usr/include/boost/smart_ptr/detail/shared_count.hpp:27: In file included from /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:35: In file included from /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/memory:78: /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on non-final 'table_listener' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<table_listener>::operator()' requested here get_deleter()(std::move(__ptr)); ^ 
/home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/stl_construct.h:88:15: note: in instantiation of member function 'std::unique_ptr<table_listener>::~unique_ptr' requested here __location->~_Tp(); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13198	2023-03-19 15:16:02 +02:00
Kefu Chai	a01eb593ec	test: sstables: do not compare a mutation with an optional<mutation> this change should address the FTBFS with Clang-17. turns out we are comparing a mutation with an optimized_optional<mutation>. and Clang-17 does not want to convert the LHS, which is a mutation to optimized_optional<mutation> for performing the comparison using operator==(const optimized_optional<mutation>&), despite that optimized_optional(const T& obj) is not marked explicit. this is understandable. so, in this change, instead of relying on the implicit conversion, we just * check if the optional actually holds a value * and compare the value by dereferencing the optional. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13196	2023-03-19 15:16:02 +02:00
Pavel Emelyanov	be548a4da3	install-dependencies: Add rapid XML dev package It will be needed by S3 driver to parse multipart-upload messages from server refs: #12523 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13158 [avi: regenerate toolchain] Closes #13192	2023-03-19 15:16:02 +02:00
Avi Kivity	c3a2ec9d3c	Merge 'use fmt::join() for printing ranges' from Kefu Chai this series intends to deprecate `::join()`, as it always materializes a range into a concrete string. but what we always want is to print the elements in the given range to stream, or to a seastar logger, which is backed by fmtlib. also, because fmtlib offers exactly the same set of features implemented by to_string.hh, this change would allow us to use fmtlib to replace to_string.hh for better maintainability, and potentially better performance. as fmtlib is lazy evaluated, and claims to be performant under most circumstances. Closes #13163 * github.com:scylladb/scylladb: utils: to_string: move join to namespace utils treewide: use fmt::join() when appropriate row_cache: pass "const cache_entry" to operator<<	2023-03-19 15:16:02 +02:00
Wojciech Mitros	3cdaf72065	docs: fix minor issues found in the wasm documentation Even after last fixups, the documentation still had some issues with compilation instructions in particular. I also ran a spelling and grammar check on the text, and fixed issues found by it. Closes #13206	2023-03-19 15:16:02 +02:00
Botond Dénes	6a8fbbebf2	test/boost/reader_concurrency_semaphore_test: remove redundant stats printouts The semaphore stats are now included in the standard semaphore diagnostics printout, no need to dump separately.	2023-03-17 03:15:41 -04:00
Botond Dénes	d6583cad0a	reader_concurrency_semaphore: do_dump_reader_permit_diagnostics(): print the stats Print the semaphore stats below the permit listing and remove the currently redundant "Total: " line. Some of the stats printed here are already exported as metrics, but instead of trying to cherry-pick and risk some metrics falling through the cracks, just print everything, there aren't that many anyway.	2023-03-17 03:15:41 -04:00
Botond Dénes	7b701ac52e	reader_concurrency_semaphore: add stats to record reason for queueing permits When diagnosing problems, knowing why permits were queued is very valuable. Record the reason in a new stats, one for each reason a permit can be queued.	2023-03-17 03:15:41 -04:00
Botond Dénes	bb00405818	reader_concurrency_semaphore: can_admit_read(): also return reason for rejection So caller can bump the appropriate counters or log the reason why the request cannot be admitted.	2023-03-17 03:15:40 -04:00
Kefu Chai	f113dac5bf	build: cmake: add all tests * add a new test KIND "UNIT", which provides its own main() * add all tests which were not included yet Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-17 12:56:09 +08:00
Kefu Chai	b440417527	build: cmake: promote add_scylla_test() to test/ as it will be used by test/manual/CMakeLists.txt also. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-17 12:56:09 +08:00
Daniel Reis	86a4b8a57d	docs: scylladb better php driver	2023-03-16 17:00:22 -03:00
Wojciech Mitros	53af79442d	cql: change exception type when granting incorrect permissions For compatibility with Cassandra, this patch changes the exception type thrown when trying to alter a permission that is not applicable on the given resource from an Invalid query to a Syntax exception.	2023-03-16 16:43:37 +01:00
Wojciech Mitros	9c36c0313a	cql: check if the function is builtin when granting permissions Currently, when granting a permission on a function resource, we only check if the function exists, regardless of whether it's a user or a builtin function. We should not support altering permissions on builtin functions, so this patch adds a check for confirming that the found function is not builtin.	2023-03-16 16:43:32 +01:00
Pavel Emelyanov	e882269d93	table: Keep storage options lw-shared-ptr Tables need to know which storage their sstables need to be located at, so class table needs to have its reference to the storage options. The thing can be inherited from the keyspace metadata. Tests sometimes create table without keyspace at hand. For those use default-initialized storage options (which is local storage). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-16 17:30:45 +03:00
Pavel Emelyanov	c619a53c61	keyspace_metadata: Make storage options lw-shared-ptr Today the storage options are embedded into metadata object. In the future the storage options will need to be somehow referenced by the class table too. Using a plain reference doesn't look safe, turn the storage options into lw-shared-ptr instead. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-16 17:30:45 +03:00
Kefu Chai	93fa70069c	utils: to_string: move join to namespace utils `join` can easily be confused with boost::algorithm::join so make it more visible that we're using scylla's utils implementation. Also, move `struct print_with_comma` to utils::internal. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-16 20:34:18 +08:00
Kefu Chai	c37f4e5252	treewide: use fmt::join() when appropriate now that fmtlib provides fmt::join(). see https://fmt.dev/latest/api.html#_CPPv4I0EN3fmt4joinE9join_viewIN6detail10iterator_tI5RangeEEN6detail10sentinel_tI5RangeEEERR5Range11string_view there is no need to reinvent the wheel. so in this change, the homebrew join() is replaced with fmt::join(). as fmt::join() returns a join_view, this could improve the performance under certain circumstances where the fully materialized string is not needed. please note, the goal of this change is to use fmt::join(), and this change does not intend to improve the performance of existing implementation based on "operator<<" unless the new implementation is much more complicated. we will address the unnecessarily materialized strings in a follow-up commit. some noteworthy things related to this change: * unlike the existing `join()`, `fmt::join()` returns a view. so we have to materialize the view if what we expect is a `sstring` * `fmt::format()` does not accept a view, so we cannot pass the return value of `fmt::join()` to `fmt::format()` * fmtlib does not format a typed pointer, i.e., it does not format, for instance, a `const std::string*`. but operator<<() always print a typed pointer. so if we want to format a typed pointer, we either need to cast the pointer to `void*` or use `fmt::ptr()`. * fmtlib is not able to pick up the overload of `operator<<(std::ostream& os, const column_definition* cd)`, so we have to use a wrapper class of `maybe_column_definition` for printing a pointer to `column_definition`. since the overload is only used by the two overloads of `statement_restrictions::add_single_column_parition_key_restriction()`, the operator<< for `const column_definition*` is dropped. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 20:34:18 +08:00
Wojciech Mitros	aad2afd417	rust: update dependencies Cranelift-codegen 0.92.0 and wasmtime 5.0.0 have security issues potentially allowing malicious UDFs to read some memory outside the wasm sandbox. This patch updates them to versions 0.92.1 and 5.0.1 respectively, where the issues are fixed. Fixes #13157 Closes #13171	2023-03-16 13:45:53 +02:00
Takuya ASADA	a79604b0d6	create-relocatable-package.py: exclude tools/cqlsh We should exclude tools/cqlsh for relocatable package. fixes #13181 Closes #13183	2023-03-16 13:37:16 +02:00
Anna Stuchlik	d00926a517	doc: Add version 5.2 to the version selector This commit adds branch-5.2 to the list of branches for which we want to build the docs. As a result, version 5.2 will be added to the version selector. NOTE: Version 5.2 will be marked as unstable and an appropriate message will be shown to the user. After 5.2 is released, branch-5.2 needs to be moved from UNSTABLE_VERSIONS to LATEST_VERSION (where it should replace branch-5.1) Closes #13200	2023-03-16 10:46:30 +02:00
Kamil Braun	b919373cce	Merge 'api: gossiper: get alive nodes after reaching current shard 0 version' from Alecco Add an API call to wait for all shards to reach the current shard 0 gossiper version. Throws when timeout is reached. Closes #12540 * github.com:scylladb/scylladb: api: gossiper: fix alive nodes gms, service: lock live endpoint copy gms, service: live endpoint copy method	2023-03-16 09:46:02 +01:00
Botond Dénes	b31a55af7e	Merge 'cmake: sync with `configure.py` (13/n)' from Kefu Chai this is the 13th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals: - to enable developers to use CMake for building scylla. so they can use tools (CLion for instance) with CMake integration for better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. this changeset includes following changes: - build: cmake: increase per link job mem to 4GiB - build: cmake: add missing sources to test-lib - build: cmake: add more tests - build: cmake: remote quotes in "include()" commands - build: cmake: drop unnecessary linkages Closes #13199 * github.com:scylladb/scylladb: build: cmake: drop unnecessary linkages build: cmake: remote quotes in "include()" commands build: cmake: add more tests build: cmake: add missing sources to test-lib build: cmake: increase per link job mem to 4GiB	2023-03-16 10:40:18 +02:00
Nadav Har'El	c5195e0acd	cql-pytest: add reproducers for GROUP BY bugs The translated Cassandra unit tests in cassandra_tests/validation/operations/ reproduced three bugs in GROUP BY's interaction with LIMIT and PER PARTITION LIMIT - issue #5361, #5362 and #5363. Unfortunately, those test functions are very long, and each test fails on all of these issues and a few more, making it difficult to use these tests to verify when those tests have been fixed. In other words, ideally a patch for issue 5361 should un-xfail some reproducing test for this issue - but all the existing tests will continue to fail after fixing 5361, because of other remaining bugs. So in this patch, I created a new test file test_group_by.py with my own tests for the GROUP BY feature. I tried to explore the different capabilities of the GROUP BY feature, its different success and error paths, and how GROUP BY interacts with LIMIT and PER PARTITION LIMIT. As usual, I created many small test functions and not one huge test function, and as a result we now have 5 xfailing tests which each reproduces one bug and when the bug is fixed, it will start to pass. All tests added here pass on Cassandra. Refs #5361 Refs #5362 Refs #5363 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13136	2023-03-16 10:39:05 +02:00
Botond Dénes	f4b5679804	Merge 'doc: Updates the recommended OS to be Ubuntu 22.04' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/13138 Fixes https://github.com/scylladb/scylladb/issues/13153 This PR: - Fixes outdated information about the recommended OS. Since version 5.2, the recommended OS should be Ubuntu 22.04 because that OS is used for building the ScyllaDB image. - Adds the OS support information for version 5.2. This PR (both commits) needs to be backported to branch-5.2. Closes #13188 * github.com:scylladb/scylladb: doc: Add OS support for version 5.2 doc: Updates the recommended OS to be Ubuntu 22.04	2023-03-16 08:05:19 +02:00
Kefu Chai	0069b43fd4	build: cmake: drop unnecessary linkages most of the linked libraries should be pulled in by the targets defined by subsystems. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 12:14:21 +08:00
Kefu Chai	681dfac496	build: cmake: remote quotes in "include()" commands more consistent this way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 12:14:21 +08:00
Kefu Chai	03f5f788a3	build: cmake: add more tests all tests under test/boost are now buildable. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 12:14:21 +08:00
Kefu Chai	649a31a722	build: cmake: add missing sources to test-lib Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 12:14:21 +08:00
Kefu Chai	8963fe4e41	build: cmake: increase per link job mem to 4GiB lld is multi-threaded in some phases, based on observation, it could spawn up to 16 threads for each link job. and each job could take up to more than 3 GiB memory in total. without the change, we can run into OOM with a machine without abundant memory, so increase the per-link-job mem accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 12:14:21 +08:00
Kefu Chai	9eb2626fec	row_cache: pass "const cache_entry" to operator<< operator<<(..) does not mutate the cache_entry parameter passed to it. also, without this change fmtlib is not able to format given cache_entry parameter, as the caller formatter has "const" specifier. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-16 07:46:11 +08:00
Avi Kivity	7a5e609d8d	cql3: functions: add helpers for automating marshalling for scalar functions Add a helper that, given a C++ function, deduces its argument types and wraps the function in marshalling/unmarshalling code. The native function expects non-null inputs, so an additional helper is called to decide what to do if nulls are encountered. One such helper is return_accumulator_on_null (since that's the default behavior of aggregates), and the other is return_any_nonnull(), useful for reductions.	2023-03-15 22:28:41 +02:00
Avi Kivity	35dd3edb9e	types: fix big_decimal constructor from literal 0 Currently, big_decimal(0) will select the big_decimal(string_view) constructor (via 0 -> const char* -> string_view conversions). 0 is important for initializing aggregates, so fix it ahead of using it.	2023-03-15 22:24:12 +02:00
Avi Kivity	6c8d942fa1	cql3: functions: add helper class for internal scalar functions We'll need many scalar functions to implement aggregates in terms of scalars, so we add an internal_scalar_function class to reduce boilerplate. The new class proxies the scalar function into a native noncopyable_function provided by the constructor.	2023-03-15 22:22:02 +02:00
Avi Kivity	26e8ec663b	db: functions: add stateless aggregate functions Currently, aggregate functions are implemented in a stateful manner. The accumulator is stored internally in an aggregate_function::aggregate, requiring each query to instantiate new instances (see aggregate_function_selector's constructor, and note how it's called from selector::new_instance()). This makes aggregates hard to use in expressions, since expressions are stateless (with state only provided to evaluate()). To facilitate migration towards stateless expressions, we define a stateless_aggregate_function (modelled after user-defined aggregates, which are already stateless). This new struct defines the aggregate in terms of three scalar functions: one to aggregate a new input into an accumulator (provided in the first parameter), one to finalize an accumulator into a result, and one to reduce two accumulators for parallelized aggregation. An adapter of the new struct to the aggregate_function interface is also provided, to allow for incremental migration in the following patches.	2023-03-15 22:10:23 +02:00
Avi Kivity	82c4341e0e	db, cql3: move scalar_function from cql3/functions to db/functions Previously, we moved cql3::functions::function to the db::functions namespace, since functions are a part of the data dictionary, which is independent of cql3. We do the same now for scalar_function, since we wish to make use of it in a new db::functions::stateless_aggregate_function. A stub remains in cql3/functions to avoid churn.	2023-03-15 20:37:25 +02:00
Avi Kivity	29a2788b2e	Merge 'reader_concurrency_semaphore: handle read blocked on memory being registered as inactive' from Botond Dénes A read that requested memory and has to wait for it can be registered as inactive. This can happen for example if the memory request originated from a background I/O operation (a read-ahead maybe). Handling this case is currently very difficult. What we want to do is evict such a read on-the-spot: the fact that there is a read waiting on memory means memory is in demand and so inactive reads should be evicted. To evict this reader, we'd first have to remove it from the memory wait list, which is almost impossible currently, because `expiring_fifo<>`, the type used for the wait list, doesn't allow for that. So in this PR we set out to make this possible first, by transforming all current queues to be intrusive lists of permits. Permits are already linked into an intrusive list, to allow for enumerating all existing permits. We use these existing hooks to link the permits into the appropriate queue, and back to `_permit_list` when they are not in any special queue. To make this possible we first have to make all lists store naked permits, moving all auxiliary data fields currently stored in wrappers like `entry` into the permit itself. With this, all queues and lists in the semaphore are intrusive lists, storing permits directly, which has the following implications: * queues no longer take extra memory, as all of them are intrusive * permits are completely self-sufficient w.r.t. queuing: code can queue or dequeue permits just with a reference to a permit at hand, no other wrapper, iterator, pointer, etc. is necessary. * queues don't keep permits alive anymore; destroying a permit will automatically unlink it from the respective queue, although this might lead to use-after-free. Not a problem in practice, only one code-path (`reader_concurrency_semaphore::with_permit()`) had to be adjusted. 
After all these extensive preparations, we can now handle the case of evicting a reader which is queued on memory. Fixes: #12700 Closes #12777 * github.com:scylladb/scylladb: reader_concurrency_semaphore: handle reader blocked on memory becoming inactive reader_concurrency_semaphore: move _permit_list next to the other lists reader_permit: evict inactive read on timeout reader_concurrency_semaphore: move inactive_read to .cc reader_concurrency_semaphore: store permits in _inactive_reads reader_concurrency_semaphore: inactive_read: de-inline more methods reader_concurrency_semaphore: make _ready_list intrusive reader_permit: add wait_for_execution state reader_concurrency_semaphore: make wait lists intrusive reader_concurrency_semaphore: move most wait_queue methods out-of-line reader_concurrency_semaphore: store permits directly in queues reader_permit: introduce (private) operator * and -> reader_concurrency_semaphore: remove redundant waiters() member reader_concurrency_semaphore: add waiters counter reader_permit: use check_abort() for timeout reader_concurrency_semaphore: maybe_dump_permit_diagnostics(): remove permit list param reader_concurrency_semaphroe: make foreach_permit() const reader_permit: add get_schema() and get_op_name() accessors reader_concurrency_semaphore: mark maybe_dump_permit_diagnostics as noexcept	2023-03-15 20:10:19 +02:00
Wojciech Mitros	b776cb4b41	docs: fix typos in wasm documentation This patch fixes 2 small issues with the Wasm UDF documentation that recently got uploaded: 1. a link was unnecessarily wrapped in angle brackets 2. a link did not redirect to the correct page due to a missing ":doc:" tag Closes #13193	2023-03-15 18:48:48 +02:00
Anna Stuchlik	3ad3259396	doc: Add OS support for version 5.2 Fixes https://github.com/scylladb/scylladb/issues/13153 This commit adds a row for version 5.2 to the table of supported platforms.	2023-03-15 16:12:41 +01:00
Kamil Braun	5705df77a1	Merge 'Refactor schema, introduce schema_static_props and move several properties into it' from Gusev Petr Our end goal (#12642) is to mark raft tables to use schema commitlog. There are two similar cases in code right now - `with_null_sharder` and `set_wait_for_sync_to_commitlog` `schema_builder` methods. The problem is that if we need to mark some new schema with one of these methods we need to do this twice - first in a method describing the schema (e.g. `system_keyspace::raft()`) and second in the function `create_table_from_mutations`, which is not obvious and easy to forget. `create_table_from_mutations` is called when schema object is reconstructed from mutations, `with_null_sharder` and `set_wait_for_sync_to_commitlog` must be called from it since the schema properties they describe are not included in the mutation representation of the schema. This series proposes to distinguish between the schema properties that get into mutations and those that do not. The former are described with `schema_builder`, while for the latter we introduce `schema_static_props` struct and the `schema_builder::register_static_configurator` method. This way we can formulate a rule once in the code about which schemas should have a null sharder/be synced, and it will be enforced in all cases. Closes #13170 * github.com:scylladb/scylladb: schema.hh: choose schema_commitlog based on schema_static_props flag schema.hh: use schema_static_props for wait_for_sync_to_commitlog schema.hh: introduce schema_static_props, use it for null_sharder database.cc: drop ensure_populated and mark_as_populated	2023-03-15 15:43:49 +01:00
Kefu Chai	e21926f602	flat_mutation_reader_v2: use maybe_yield() when appropriate just came across this part of code, as `maybe_yield()` is a wrapper around "if should_yield(): yield()", so better off using it for more concise code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13107	2023-03-15 15:58:55 +02:00
Anna Stuchlik	1bb11126d7	doc: Updates the recommended OS to be Ubuntu 22.04 Fixes https://github.com/scylladb/scylladb/issues/13138 This PR fixes the outdated information about the recommended OS. Since version 5.2, the recommended OS should be Ubuntu 22.04 because that OS is used for building the ScyllaDB image. This commit needs to be backported to branch-5.2.	2023-03-15 13:42:37 +01:00
Pavel Emelyanov	47cdd31f27	main: Forget the --max-io-requests option On start scylla checks if the option is set. It's nowadays useless, as it had been removed from seastar (see `9e34779c` update) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13148	2023-03-15 12:42:06 +02:00
Botond Dénes	e5f3f4b0d1	Merge 'cmake: sync with `configure.py` (12/n)' from Kefu Chai this is the 12th changeset of a series which tries to give an overhaul to the CMake build system. this series has two goals: - to enable developers to use CMake for building scylla. so they can use tools (CLion for instance) with CMake integration for better developer experience - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules. this changeset includes the following changes: - build: cmake: remove Seastar from the option name - build: cmake: add missing sources in test-lib and utils - build: cmake: do not include main.cc in scylla-main - build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests - build: cmake: add more tests Closes #13180 * github.com:scylladb/scylladb: build: cmake: add more tests build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests build: cmake: do not include main.cc in scylla-main build: cmake: add missing sources in test-lib and utils build: cmake: remove Seastar from the option name	2023-03-15 12:40:51 +02:00
Nadav Har'El	543d4ed726	cql-pytest: translate Cassandra's tests for GROUP BY This is a translation of Cassandra's CQL unit test source file validation/operations/SelectGroupByTest.java into our cql-pytest framework. This test file contains only 8 separate test functions, but each of them is very long checking hundreds of different combinations of GROUP BY with other things like LIMIT, ORDER BY, etc., so 6 out of the 7 tests fail on Scylla on one of the bugs listed below - most of the tests actually fail in multiple places due to multiple bugs. All tests pass on Cassandra. The tests reproduce six already-known Scylla issues and one new issue: Already known issues: Refs #2060: Allow mixing token and partition key restrictions Refs #5361: LIMIT doesn't work when using GROUP BY Refs #5362: LIMIT is not doing it right when using GROUP BY Refs #5363: PER PARTITION LIMIT doesn't work right when using GROUP BY Refs #12477: Combination of COUNT with GROUP BY is different from Cassandra in case of no matches Refs #12479: SELECT DISTINCT should refuse GROUP BY with clustering column A new issue discovered by these tests: Refs #13109: Incorrect sort order when combining IN, GROUP BY and ORDER BY Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13126	2023-03-15 12:40:24 +02:00
Pavel Emelyanov	bfc0533a8d	test: Update boost.suite.run_first list In debug mode the timings are: view_schema_test: 90 sec cql_query_test: 170 sec memtable_test: 2090 sec cql_functions_test: 2591 sec other tests that are in/out of this list are not that obvious, but the former two apparently deserve being replaced with the latter two. Timings for dev/release modes are not that horrible, but the "first pair is notably smaller than the latter" relation also exists. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13142	2023-03-15 12:10:50 +02:00
Botond Dénes	878ee27d74	Merge 'Load SSTable at the shard that actually own it' from Raphael "Raph" Carvalho Today, the SSTable generation provides a hint on which shard owns a particular SSTable. That hint determines which shard will load the SSTable into memory. With upcoming UUID generation, we will no longer have this hint embedded into the SSTable generation, meaning that SSTables will be loaded at random shards. This is not good because shards will have to reference memory from other shards to access the SSTable metadata that was allocated elsewhere. This patch changes sstable_directory to: 1) Use the generation value only to determine which shard will calculate the owner shards for SSTables. Essentially works like a round-robin distribution. 2) The shard assigned to compute the owners for an SSTable will do so by reading the minimum from disk; usually only the Scylla file is needed. 3) Once that shard has finished computing the owners, it will forward the SSTable to the shard that owns it. 4) Shards will later locally load the SSTables that were forwarded to them. Closes #13114 * github.com:scylladb/scylladb: sstables: sstable_directory: Load SSTable at the shard that actually own it sstables: sstable_directory: Give sstable_info_vector a more descriptive name sstables: Allow owner shards to be computed for a partially loaded SSTable sstables: Move SSTable loading to sstable_directory::sort_sstable() sstables: Move sstable_directory::sort_sstable() to private interface sstables: Restore indentation in sstable_directory::sort_sstable() sstables: Coroutinize sstable_directory::sort_sstable() sstables: sstable_directory: Extract sstable loading from process_descriptor() sstables: sstable_directory: Separate private fields from methods sstables: Coroutinize sstable_directory::process_descriptor	2023-03-15 10:43:22 +02:00
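The four loading steps described in the entry above can be sketched as follows. This is a minimal Python illustration of the idea, not the sstable_directory code: the modulo rule, the `compute_owner` callback and both function names are assumptions made for the example.

```python
from collections import defaultdict

def assign_computing_shard(generation, shard_count):
    # Assumption: a plain modulo gives the round-robin-like spread that the
    # commit message describes; the real rule may differ.
    return generation % shard_count

def plan_sstable_loading(generations, shard_count, compute_owner):
    """Return {owner_shard: [(generation, computing_shard), ...]}.

    compute_owner is a hypothetical callback standing in for reading the
    minimal metadata from disk and deriving the owning shard.
    """
    forwarded = defaultdict(list)
    for gen in generations:
        computing_shard = assign_computing_shard(gen, shard_count)
        # In the real system this step would run on computing_shard.
        owner = compute_owner(gen)
        # Forward the SSTable to its owner, which loads it locally later.
        forwarded[owner].append((gen, computing_shard))
    return dict(forwarded)
```

The point of the split is that the shard doing the cheap ownership computation need not be the shard that ends up holding the SSTable metadata in memory.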
Kefu Chai	4505b0a9ca	build: cmake: add more tests * test/boost: add more tests: all tests listed in test/boost/CMakeLists.txt should build now. * rust: add inc library, which is used for testing. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-15 15:38:47 +08:00
Kefu Chai	cac6ba529d	build: cmake: define SEASTAR_TESTING_MAIN for SEASTAR tests we need the `main()` defined by seastar/testing/seastar_test.hh for driving the tests. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-15 15:38:47 +08:00
Kefu Chai	d9e3ffebf2	build: cmake: do not include main.cc in scylla-main main.cc should only be included by scylla. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-15 15:38:46 +08:00
Kefu Chai	1cd3764b08	build: cmake: add missing sources in test-lib and utils Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-15 15:38:46 +08:00
Kefu Chai	269cce4c2c	build: cmake: remove Seastar from the option name change the option name to "LINK_MEM_PER_JOB" as this is not a Seastar option, but a top-level project option. Signed-off-by: Kefu Chai <tchaikov@gmail.com>	2023-03-15 15:38:46 +08:00
Michał Chojnowski	866672a9fa	storage_proxy: rename metrics after service level rename Under some circumstances, service_level_controller renames service levels for internal purposes. However, the per-service-level metrics registered by storage_proxy keep the name seen at first registration time. This sometimes leads to mislabeled metrics. Fix that by re-registering the metrics after scheduling groups are renamed. Fixes scylladb/scylla-enterprise#2755 Closes #13174	2023-03-15 09:15:54 +02:00
Botond Dénes	6373452b31	Merge 'Do not mask node operation errors' from Benny Halevy This series handles errors when aborting node operations and prints them rather than letting them leak and be exposed to the user. Also, clean up the node_ops logging formats when aborting different node ops and add more error logging around errors in the "worker" nodes. Closes #12799 * github.com:scylladb/scylladb: storage_service: node_ops_signal_abort: print a warning when signaling abort storage_service: s/node_ops_singal_abort/node_ops_signal_abort/ storage_service: node_ops_abort: add log messages storage_service: wire node_ops_ctl for node operations storage_service: add node_ops_ctl class to formalize all node_ops flow repair: node_ops_cmd_request: add print function repair: do_decommission_removenode_with_repair: log ignore_nodes repair: replace_with_repair: get ignore_nodes as unordered_set gossiper: get_generation_for_nodes: get nodes as unordered_set storage_service: don't let node_ops abort failures mask the real error	2023-03-15 09:11:31 +02:00
Petr Gusev	afe1d39bdb	schema.hh: choose schema_commitlog based on schema_static_props flag This patch finishes the refactoring. We introduce the use_schema_commitlog flag in schema_static_props and use it to choose the commitlog in database::add_column_family. The only configurator added declares what was originally in database::add_column_family - all tables from schema_tables keyspace should use schema_commitlog.	2023-03-14 19:43:51 +04:00
Petr Gusev	3ef201d67a	schema.hh: use schema_static_props for wait_for_sync_to_commitlog This patch continues the refactoring, now we move wait_for_sync_to_commitlog property from schema_builder to schema_static_props. The patch replaces schema_builder::set_wait_for_sync_to_commitlog and is_extra_durable with two register_static_configurator, one in system_keyspace and another in system_distributed_keyspace. They correspond to the two parts of the original disjunction in schema_tables::is_extra_durable.	2023-03-14 19:26:05 +04:00
Calle Wilund	4681c4b572	configurables: Add optional service lookup to init callback Simplified, more direct version of "dependency injection". I.e. caller/initiator (main/cql_test_env) provides a set of services it will eventually start. Configurable can remember these. And use, at least after "start" notification. Closes #13037	2023-03-14 17:13:52 +02:00
Petr Gusev	349bc1a9b6	schema.hh: introduce schema_static_props, use it for null_sharder Our goal (#12642) is to mark raft tables to use schema commitlog. There are two similar cases in code right now - with_null_sharder and set_wait_for_sync_to_commitlog schema_builder methods. The problem is that if we need to mark some new schema with one of these methods we need to do this twice - first in a method describing the schema (e.g. system_keyspace::raft()) and second in the function create_table_from_mutations, which is not obvious and easy to forget. create_table_from_mutations is called when schema object is reconstructed from mutations, with_null_sharder and set_wait_for_sync_to_commitlog must be called from it since the schema properties they describe are not included in the mutation representation of the schema. This patch proposes to distinguish between the schema properties that get into mutations and those that do not. The former are described with schema_builder, while for the latter we introduce schema_static_props struct and the schema_builder::register_static_configurator method. This way we can formulate a rule once in the code about which schemas should have a null sharder, and it will be enforced in all cases.	2023-03-14 18:29:34 +04:00
Wojciech Mitros	52eb70aef0	docs: make wasm documentation visible for users Until now, the instructions on generating wasm files and using them for Scylla UDFs were stored in docs/dev, so they were not visible on the docs website. Now that the Rust helper library for UDFs is ready, and we're inviting users to try it out, we should also make the rest of the Wasm UDF documentation readily available for the users. Closes #13139	2023-03-14 16:21:23 +02:00
David Garcia	63ad5607ee	docs: Update custom styles	2023-03-14 12:06:20 +00:00
David Garcia	bad914a34d	docs: Update styles	2023-03-14 12:01:33 +00:00
David Garcia	8c4659a379	docs: Add card logos	2023-03-14 10:37:23 +00:00
Botond Dénes	1d9b7f3a92	Merge 'cmake: sync with `configure.py` (11/n)' from Kefu Chai - build: cmake: remove test which does not exist yet - build: cmake: document add_scylla_test() - build: cmake: extract index, repair and data_dictionary out - build: cmake: extract scylla-main out - build: cmake: find Snappy before using it - build: cmake: add missing linkages - build: cmake: add missing sources to test-lib - build: cmake: link sstables against libdeflate - build: cmake: link Boost::regex against ICU::uc Closes #13110 * github.com:scylladb/scylladb: build: cmake: link Boost::regex against ICU::uc build: cmake: link sstables against libdeflate build: cmake: add missing sources to test-lib build: cmake: add missing linkages build: cmake: find Snappy before using it build: cmake: extract scylla-main out build: cmake: extract index, repair and data_dictionary out build: cmake: document add_scylla_test() build: cmake: remove test which does not exist yet	2023-03-14 11:45:48 +02:00
Petr Gusev	00fc73d966	database.cc: drop ensure_populated and mark_as_populated There was some logic to call mark_as_populated at the appropriate places, but the _populated field and the ensure_populated function were not used by anyone.	2023-03-14 13:32:25 +04:00
Botond Dénes	e22b27a107	Merge 'Improve database shutdown verbosity' from Pavel Emelyanov The `database::stop` method is sometimes hanging and it's always hard to spot where exactly it sleeps. Few more logging messages would make this much simpler. refs: #13100 refs: #10941 Closes #13141 * github.com:scylladb/scylladb: database: Increase verbosity of database::stop() method large_data_handler: Increase verbosity on shutdown large_data_handler: Coroutinize .stop() method	2023-03-14 10:55:31 +02:00
Kefu Chai	5842804591	install-dependencies: extract go_arch() out for defining the mapping from the output of `arch` to the corresponding GO_ARCH. see `b94dc384ca/src/go/build/syslist.go (L55)` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13151	2023-03-14 10:05:09 +03:00
Raphael S. Carvalho	0c77f77659	sstables: sstable_directory: Load SSTable at the shard that actually own it Today, the SSTable generation provides a hint on which shard owns a particular SSTable. That hint determines which shard will load the SSTable into memory. With upcoming UUID generation, we will no longer have this hint embedded into the SSTable generation, meaning that SSTables will be loaded at random shards. This is not good because shards will have to reference memory from other shards to access the SSTable metadata that was allocated elsewhere. This patch changes sstable_directory to: 1) Use the generation value only to determine which shard will calculate the owner shards for SSTables. Essentially works like a round-robin distribution. 2) The shard assigned to compute the owners for an SSTable will do so by reading the minimum from disk; usually only the Scylla file is needed. 3) Once that shard has finished computing the owners, it will forward the SSTable to the shard that owns it. 4) Shards will later locally load the SSTables that were forwarded to them. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	2c4e141314	sstables: sstable_directory: Give sstable_info_vector a more descriptive name Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	a83328c358	sstables: Allow owner shards to be computed for a partially loaded SSTable Today, owner shards can only be computed for a fully loaded SSTable. For upcoming changes in the SSTable loader, we want to load the minimum from disk to be able to compute the set of shards owning the SSTable. If sharding metadata is available, it means we only need to read TOC and Scylla components. Otherwise, Summary must be read to provide first and last keys for compute_shards_for_this_sstable() to operate on them instead. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	b49ae56e70	sstables: Move SSTable loading to sstable_directory::sort_sstable() The reason for this change is that we'll want to fully load the SSTable only at the destination shard. Later, sort_sstable() will calculate set of owner shards for a SSTable by only loading scylla metadata file. If it turns out that the SSTable belongs to current shard, then we'll fully load the SSTable using the new and fresh sstable_directory::load_sstable(). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	229d89dbde	sstables: Move sstable_directory::sort_sstable() to private interface Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	36602d1025	sstables: Restore indentation in sstable_directory::sort_sstable() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	825f23b7f9	sstables: Coroutinize sstable_directory::sort_sstable() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	a19a9f5d99	sstables: sstable_directory: Extract sstable loading from process_descriptor() Will make it easier for process_descriptor to process the SSTable without having to fully load the SSTable. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	08e6df256e	sstables: sstable_directory: Separate private fields from methods Following the expected coding convention. It's also somewhat disturbing to see them mixed up. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Raphael S. Carvalho	7d751991c1	sstables: Coroutinize sstable_directory::process_descriptor Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-13 15:40:43 -03:00
Anna Stuchlik	8ceb8b0240	doc: add a Knowledge Base article about consistency, v2 of https://github.com/scylladb/scylladb/pull/12929 Closes #12957	2023-03-13 17:48:25 +02:00
Aleksandra Martyniuk	cb0e6d617a	test: extend test_compaction_task.py to test cleanup compaction	2023-03-13 16:36:20 +01:00
Aleksandra Martyniuk	27b999808f	compaction: create task manager's task for cleanup keyspace compaction on one shard Implementation of task_manager's task that covers cleanup keyspace compaction on one shard.	2023-03-13 16:35:39 +01:00
Aleksandra Martyniuk	7dd27205f6	compaction: create task manager's task for cleanup keyspace compaction Implementation of task_manager's task covering cleanup keyspace compaction that can be started through storage_service api.	2023-03-13 16:35:39 +01:00
Aleksandra Martyniuk	4a5752d0d0	api: add get_table_ids to get table ids from table infos	2023-03-13 16:35:39 +01:00
Aleksandra Martyniuk	8801f326c6	compaction: create cleanup_compaction_task_impl	2023-03-13 16:35:39 +01:00
Aleksandra Martyniuk	a976e2e05b	repair: fix indentation	2023-03-13 15:25:53 +01:00
Aleksandra Martyniuk	41abc87d28	repair: continue user requested repair if no_such_column_family is thrown When one of column families requested for repair does not exist, we should repair all other requested column families. no_such_column_family exception is caught and logged, and repair continues.	2023-03-13 15:25:52 +01:00
Aleksandra Martyniuk	2376a434b6	repair: add find_column_family_if_exists function	2023-03-13 15:25:15 +01:00
Botond Dénes	3f0b3489a2	reader_concurrency_semaphore: handle reader blocked on memory becoming inactive Kill said read's memory requests with std::bad_alloc and dequeue it from the memory wait list, then evict it on the spot. Now that `_inactive_reads` just store permits, we can do this easily.	2023-03-13 08:07:53 -04:00
Botond Dénes	4f5657422d	reader_concurrency_semaphore: move _permit_list next to the other lists A mostly cosmetic change. Also add a comment mentioning that this is the catch-all list.	2023-03-13 08:07:53 -04:00
Botond Dénes	d1bc5f9293	reader_permit: evict inactive read on timeout If the read is inactive when the timeout clock fires, evict it. Now that `_inactive_reads` just store permits, we can do this easily.	2023-03-13 08:07:53 -04:00
Botond Dénes	6181c08191	reader_concurrency_semaphore: move inactive_read to .cc It is not used in the header anymore and moving it to the .cc allows us to remove the dependency on flat_mutation_reader_v2.hh.	2023-03-13 08:07:53 -04:00
Botond Dénes	e56ec9373d	reader_concurrency_semaphore: store permits in _inactive_reads Add a member of type `inactive_read` to reader permit, and store permit instances in `_inactive_reads`. This list is now just another intrusive list the permit can be linked into, depending on its state. Inactive read handles now just store a reader permit pointer.	2023-03-13 08:07:53 -04:00
Botond Dénes	d11f9efbfe	reader_concurrency_semaphore: inactive_read: de-inline more methods They will soon need to access reader_permit::impl internals, only available in the .cc file.	2023-03-13 08:07:53 -04:00
Botond Dénes	8e296e8e05	reader_concurrency_semaphore: make _ready_list intrusive Following the same scheme we used to make the wait lists intrusive. Permits are added to the intrusive ready list while waiting to be executed and moved back to the _permit_list when de-queued from this list. We now use a condition variable for signaling when there are permits ready to be executed.	2023-03-13 08:07:53 -04:00
Nadav Har'El	c41b2d35ed	test/alternator: test concurrent TagResource / UntagResource This patch adds an Alternator test reproducing issue #6389 - that concurrent TagResource and/or UntagResource operations were broken and some of the concurrent modifications were lost. The test has two threads: one repeatedly adds and removes a tag A, the other adds and removes a tag B. After we add tag A, we expect tag A to be there - but due to issue #6389 this modification was sometimes lost when it raced with an operation on B. This test consistently failed before issue #6389 was fixed, and passes now after the issue was fixed by the previous patches. The bug reproduces by chance, so it requires a fairly long loop (a few seconds) to be sure it reproduces - so it is marked a "veryslow" test and will not run in CI, but can be used to manually reproduce this issue with: test/alternator/run --runveryslow test_tag.py::test_concurrent_tag Refs #6389. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-03-13 13:38:15 +02:00
Nadav Har'El	87f29d8fd2	db/tags: drop unsafe update_tags() utility function The previous patches introduced the function modify_tags() as a safe version of update_tags(), and switched all uses of update_tags() to use modify_tags(). So now that the unsafe update_tags() is no longer used, we can drop it. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-03-13 13:35:17 +02:00
Kamil Braun	228856f577	Merge 'Test changing IP address of 2 nodes in a cluster out of 3 & misc cleanups' from Konstantin Osipov Closes #13135 * github.com:scylladb/scylladb: test: improve logging in ScyllaCluster raft: (test) test ip address change	2023-03-13 11:47:00 +01:00
Calle Wilund	dba45f3dc8	init: Add life cycle notifications to configurables Allows a configurable to subscribe to life cycle notifications for scylla app. I.e. do stuff on start/stop. Also allow configurables in cql_test_env v2: * Fix camel casing * Make callbacks future<> (should have been. mismerge?) Closes #13035	2023-03-13 12:45:20 +02:00
Nadav Har'El	c196bd78de	alternator: isolate concurrent modification to tags Alternator modifies tags in three operations - TagResource, UntagResource and UpdateTimeToLive (the latter uses a tag to store the TTL configuration). All three operations were implemented by three separate steps: 1. Read the current tags. 2. Modify the tags according to the desired operation. 3. Write the modified tags back with update_tags(). This implementation was not safe for concurrent operations - some modifications may be lost. We fix this in this patch by using the new modify_tags() function introduced in the previous patch, which performs all three steps under one lock so the tag operations are serialized and correctly isolated. Fixes #6389 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-03-13 12:25:03 +02:00
Nadav Har'El	fbdf52acf6	db/tags: add safe modify_tags() utility functions The existing utility function update_tags() for modifying tags in a schema (used mainly by Alternator) is not safe for concurrent operations: The function first reads the old tags, then modifies them and writes them back. If two such calls happen concurrently, both calls may read the same old tags, make different modifications, and then both write the new tags, with one's write overwriting the other's. So in this patch, we introduce a new utility function, modify_tags(), to provide a concurrency-safe read-modify-write operation on tags. The new function takes a modification function and calls the read, modify and write steps together under a single lock. The new function also takes a table name instead of a schema object - because we need to read the schema under the lock, because it might have already been changed by some other concurrent operation. This patch only introduces the new function, it doesn't change any code to use it yet, and doesn't remove the unsafe update_tags() function. We'll do those things in the next patches. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-03-13 11:51:01 +02:00
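The read-modify-write pattern described in the entry above can be illustrated with a small Python sketch. It is only an analogy under stated assumptions: `TagStore` and its methods are hypothetical names, the tags live in an in-memory dict rather than in a schema, and a `threading.Lock` stands in for whatever lock the real modify_tags() takes.

```python
import threading

class TagStore:
    """Hypothetical in-memory stand-in for a table's tags (not Scylla's API)."""

    def __init__(self):
        self._tags = {}
        self._lock = threading.Lock()

    def modify_tags(self, modify):
        # Read, modify and write happen as one step under a single lock,
        # so two concurrent callers cannot both start from the same old tags.
        with self._lock:
            tags = dict(self._tags)  # read
            modify(tags)             # modify (caller-supplied function)
            self._tags = tags        # write

    def get_tags(self):
        with self._lock:
            return dict(self._tags)
```

A caller passes the modification as a function, for example `store.modify_tags(lambda tags: tags.pop("A", None))`, instead of reading the tags, changing them and writing them back in three separate calls.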
Nadav Har'El	e5e9b59518	migration_manager: expose access to storage_proxy A migration_manager holds a reference to a storage_proxy, and uses it internally a lot - e.g., to gain access to the data_dictionary. Users of migration_manager might also benefit from this storage_proxy - we will see such a case in the next patches. So let's provide a getter for the storage_proxy. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-03-13 11:43:53 +02:00
Israel Fruchter	ef229a5d23	Repackaging cqlsh cqlsh is moving into its own repository: https://github.com/scylladb/scylla-cqlsh * add cqlsh as submodule * update scylla-java-tools to have cqlsh removed * introduced new cqlsh artifact (rpm/deb/tar) Depends: https://github.com/scylladb/scylla-tools-java/pull/316 Ref: scylladb/scylladb#11569 Closes #11937 [avi: restore tools/java submodule location, adjust commit]	2023-03-12 20:22:33 +02:00
Pavel Emelyanov	0cd3a6993b	sstables: Don't rely on lexicographical prefix comparison When creating a deletion log for a bunch of sstables the code checks that all sstables share the same "storage" by lexicographically comparing their prefixes. That's not correct, as filesystem paths may refer to the same directory even if not being equal. So far that's been mostly OK, because path manipulations were done in simple forms without producing unequal paths. Patch `8a061bd8` (sstables, code: Introduce and use change_state() call) triggered a corner case. fs::path foo("/foo"); sstring sub(""); foo = foo / sub; produces a correct path of "/foo/", but the trailing slash breaks the aforementioned assumption about prefix comparison. As a result, when an sstable moves between, say, staging and normal locations it may gain a trailing slash breaking the deletion log creation code. The fix is to restrict the deletion log creation not to rely on path strings comparison completely and trim the trailing slash if it happens. A test is included. fixes: #13085 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13090	2023-03-12 20:06:47 +02:00
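The trailing-slash corner case in the entry above has a close analogue in Python, where joining a directory with an empty component also appends a separator. The sketch below assumes POSIX-style paths; `trim_trailing_slash` and `same_directory` are hypothetical helpers written for this example, not functions from the patch.

```python
import os.path

def trim_trailing_slash(path):
    # Drop trailing separators, but keep a lone "/" for the root directory.
    stripped = path.rstrip("/")
    return stripped or "/"

def same_directory(a, b):
    # Compare after trimming, so "/foo" and "/foo/" count as the same prefix.
    return trim_trailing_slash(a) == trim_trailing_slash(b)

# Joining with an empty component yields a trailing slash, much like the
# fs::path example in the commit message.
joined = os.path.join("/foo", "")
```

Here `joined` and "/foo" name the same directory but differ as strings, which is the kind of mismatch a plain string comparison of prefixes trips over.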
Avi Kivity	beaa5a9117	Merge 'wasm: move compilation to an alien thread' from Wojciech Mitros The compilation of wasm UDFs is performed by a call to a foreign function, which cannot be divided with yielding points and, as a result, causes long reactor stalls for big UDFs. We avoid them by submitting the compilation task to a non-seastar std::thread, and retrieving the result using seastar::alien. The thread is created at the start of the program. It executes tasks from a queue in an infinite loop. All seastar shards reference the thread through a std::shared_ptr to a `alien_thread_runner`. Considering that the compilation takes a long time anyway, the alien_thread_runner is implemented with focus on simplicity more than on performance. The tasks are stored in an std::queue, reading and writing to it is synchronized using an std::mutex for reading/ writing to the queue, and an std::condition_variable waiting until the queue has elements. When the destructor of the alien runner is called, an std::nullopt sentinel is pushed to the queue, and after all remaining tasks are finished and the sentinel is read, the thread finishes. Fixes #12904 Closes #13051 * github.com:scylladb/scylladb: wasm: move compilation to an alien thread wasm: convert compilation to a future	2023-03-12 19:29:11 +02:00
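The queue-plus-sentinel design described in the entry above can be sketched in Python. This is a hedged analogy, not the seastar::alien code: `AlienThreadRunner`, `submit` and `close` are hypothetical names, `queue.Queue` stands in for the std::queue guarded by a mutex and condition variable, `None` plays the role of the std::nullopt sentinel, and `concurrent.futures.Future` stands in for handing the result back to the submitting shard.

```python
import queue
import threading
from concurrent.futures import Future

class AlienThreadRunner:
    """One worker thread that runs long tasks pulled from a FIFO queue."""

    def __init__(self):
        self._tasks = queue.Queue()  # internally a lock plus condition variable
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        while True:
            item = self._tasks.get()  # blocks until the queue has an element
            if item is None:          # sentinel: no more work, exit the thread
                return
            func, fut = item
            try:
                fut.set_result(func())
            except Exception as exc:
                fut.set_exception(exc)

    def submit(self, func):
        fut = Future()
        self._tasks.put((func, fut))
        return fut

    def close(self):
        # The queue is FIFO, so tasks submitted before the sentinel still run.
        self._tasks.put(None)
        self._thread.join()
```

Keeping a single worker and a plain queue trades throughput for simplicity, which matches the reasoning in the commit message that compilation is slow anyway.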
Avi Kivity	24719ea639	Merge 'sstables: sstable_directory: avoid unnecessarily constructing tuple<> from pair<>' from Kefu Chai - sstables: sstable_directory: avoid unnecessarily constructing tuple<> from pair<> - sstables: sstable_directory: add type constraints Closes #13144 * github.com:scylladb/scylladb: sstables: sstable_directory: add type constraints sstables: sstable_directory: avoid unnecessarily constructing tuple<> from pair<>	2023-03-12 19:10:02 +02:00
Pavel Emelyanov	24e943f79b	install-dependencies: Add minio server and client These two are static binaries, so there is no need to yum/apt-install them with dependencies. Just download with curl and put them into /usr/local/bin with X-bit set. This is needed for future object-storage work in order to run unit tests against minio. refs: #12523 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> [avi: regenerate frozen toolchain] Closes #13064 Closes #13099	2023-03-12 19:07:10 +02:00
Marcin Maliszkiewicz	74cc90a583	main: remove unused bpo::store	2023-03-12 16:59:27 +02:00
Nadav Har'El	e72b85e82c	Merge 'cql-pytest/lwt_test: test LWT UPDATE when partition/clustering ranges are empty' from Jan Ciołek Adds two test cases which test what happens when we perform an LWT UPDATE, but the partition/clustering key has 0 possible values. This can happen e.g when a column is supposed to be equal to two different values (`c = 0 AND c = 1`). Empty partition ranges work properly, empty clustering range currently causes a crash (#13129). I added tests for both of these cases. Closes #13130 * github.com:scylladb/scylladb: cql-pytest/test_lwt: test LWT update with empty clustering range cql-pytest/test_lwt: test LWT update with empty partition range	2023-03-12 15:11:33 +02:00
Nadav Har'El	53c8c43d8a	Merge 'cql3: improve support for C-style parenthesis casts' from Jan Ciołek CQL supports type casting using C-style casts. For example it's possible to do: `blob_column = (blob)funcReturningInt()` This functionality is pretty limited, we only allow such casts between types that have a compatible binary representation. Compatible means that the bytes will stay unchanged after the conversion. This means that it's legal to cast an int to blob (int is just a 4 byte blob), but it's illegal to cast a bigint to int (change 8 bytes -> 4 bytes). This simplifies things, to cast we can just reinterpret the value as the other type. Another use of C-style casts are type hints. Sometimes it's impossible to infer the exact type of an expression from the context. In such cases the type can be specified by casting the expression to this type. For example: `overloadedFunction((int)?)` Without the cast it would be impossible to guess what should be the bind marker's type. The function is overloaded, so there are many possible argument types. The type hint specifies that the bind marker has type int. An interesting thing is that such casts don't have to be explicit. CQL allows putting an int value in a place where a blob value is expected and it will be automatically converted without any explicit casting. --- I started looking at our implementation of casts because of #12900. In there the author expressed the need to specify a type hint for bind marker used to pass the WASM code. It could be either `(text)?` for text WASM, or `(blob)?` for binary WASM. This specific use of type hints wasn't supported because there was no `receiver` and the implementation of `prepare_expression` didn't handle that. Preparing casts without a receiver should be easy to implement - we can infer the type of the expression by looking at the type to which the expression is cast. But while reading `prepare_expression` for `expr::cast` I noticed that the code there is a bit strange. 
The implementation prepared the expression to cast using the original `receiver` instead of a receiver with the cast type. This caused some issues because of which casting didn't work as expected. For example it was possible to do: ```cql blob_column = (blob)funcReturningInt() ``` But this didn't work at all: ```cql blob_column = (blob)(int)12323 ``` It tried to prepare `untyped_constant(12323)` with a `blob` receiver, which fails. This makes `expr::cast` useless for casting. Casting when the representation is compatible is already implicit. I couldn't find a single case where adding a cast would change the behavior in any way. There was some use for it as a type hint to choose a specific overload of a function, but it was worthless for casting. Cassandra has the same issue, I created a `cql-pytest` test and it showed that we behave in the same way as Cassandra does. I decided to improve this. By preparing the expression using a receiver with the cast type, `expr::cast` becomes actually useful for casting values. Things like `(blob)(int)12323` now work without any issues. This diverges from the behavior in Cassandra, but it's an extension, not a breaking incompatibility. --- This PR improves `prepare_expression` for `expr::cast` in the following ways: 1) Support for more complex casts by preparing the expression using a different receiver. This makes casts like `(blob)(int)123` possible 2) Support preparing `expr::cast` without a receiver. Type inference chooses the cast type as the type of the expression. 3) Add pytest tests for C-style casts `2)` is needed for #12900, the other changes are just something I decided to do since I was already working on this piece of code. 
Closes #13053 * github.com:scylladb/scylladb: expr_test: more tests for preparing bind variables with type hints prepare_expr: implement preparing expr::cast with no receiver prepare_expr: use :user formatting in cast_prepare_expression prepare_expr: remove std::get<> in cast_prepare_expression prepare_expr: improve cast_prepare_expression prepare_expr: improve readability in cast_prepare_expression cql-pytest: test expr::cast in test_cast.py	2023-03-12 15:07:54 +02:00
Nadav Har'El	843a5dfc15	Merge 'Allow setting permissions for user-defined functions' from Wojciech Mitros This series aims to allow users to set permissions on user-defined functions. The implementation is based on Cassandra's documentation and should be fully compatible: https://cassandra.apache.org/doc/latest/cassandra/cql/security.html#cql-permissions Fixes: #5572 Fixes: #10633 Closes #12869 * github.com:scylladb/scylladb: cql3: allow UDTs in permissions on UDFs cql3: add type_parser::parse() method taking user_types_metadata schema_change_test: stop using non-existent keyspace cql3: fix parameter names in function resource constructors cql3: handle complex types as when decoding function permissions cql3: enforce permissions for ALTER FUNCTION cql-pytest: add a (failing) test case for UDT in UDF cql-pytest: add a test case for user-defined aggregate permissions cql-pytest: add tests for function permissions cql3: enforce permissions on function calls selection: add a getter for used functions abstract_function_selector: expose underlying function cql3: enforce permissions on DROP FUNCTION cql3: enforce permissions for CREATE FUNCTION client_state: add functions for checking function permissions cql-pytest: add a case for serializing function permissions cql3: allow specifying function permissions in CQL auth: add functions_resource to resources	2023-03-12 14:04:34 +02:00
Avi Kivity	7f9c822346	Merge 'Coroutinize distributed_loader's reshape() function' from Pavel Emelyanov It was suggested as candidate from one of previous reviews, so here it is. Closes #13140 * github.com:scylladb/scylladb: distributed_loader: Indentation fix after previous patch distributed_loader: Coroutinize reshape() helper	2023-03-12 12:21:33 +02:00
Nadav Har'El	1379d8330f	Merge 'Teach sstables tests not to use tempdir explicitly' from Pavel Emelyanov Many sstable test cases create tempdir on their own to create sstables with. Sometimes it's justified when the test needs to check files on disk by hand for some validation, but often all checks are fs-agnostic. The latter case(s) can be patched to work on top of any storage, in particular -- on top of object storage. To make it work tests should stop creating sstables explicitly in tempdir and this PR does exactly that. All relevant occurrences of tempdir are removed from test cases, instead the sstable::test_env's tempdir is used. Next, the test_env::{create_sstable\|reusable_sst} are patched not to accept the `fs::path dir` argument and pick the env's tempdir. Finally, the `make_sstable_easy` helper is patched to use path-less env methods too. refs: #13015 Closes #13116 * github.com:scylladb/scylladb: test,sstables: Remove path from make_sstable_easy() test,lib: Remove wrapper over reusable_sst and move the comment test: Make "compact" test case use env dir test,compaction: Use env tempdir in some more cases test,compaction: Make check_compacted_sstables() use env's dir test: Relax making sstable with sequential generation test/sstable::test_env: Keep track of auto-incrementing generation test/lib: Add sstable maker helper without factory test: Remove last occurrence of test_env::do_with(rval, ...) test,sstables: Dont mess with tempdir where possible test/sstable::test_env: Add dir-less sstables making helpers test,sstables: Use sstables::test_env's tempdir with sweeper test,sstables: Use sstables::test_env's tempdir test/lib: Add tempdir sweeper test/lib: Open-code make_sstabl_easy into make_sstable test: Remove vector of mutation interposer from test_key_count_estimation	2023-03-12 10:14:26 +02:00
Kefu Chai	97e411bc96	sstables: sstable_directory: add type constraints add type constraints for `sstable_directory::parallel_for_each_restricted()`, to enforce that the function is invocable with an argument of the specified type. this helps to prevent the problem of passing a function which accepts `pair<key, value>` or `tuple<key, value>`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-11 17:47:19 +08:00
Kefu Chai	0a29d62f4f	sstables: sstable_directory: avoid unnecessarily constructing tuple<> from pair<> `parallel_for_each_restricted()` maps the elements in the given container with the specified function. in this case, the elements are of type `unordered_map::value_type`, which is a `pair<const Key, Value>`. to convert it to a `tuple<Key, Value>`, the constructor of the tuple is called. but all we intend to do here is access the second element in the `pair<>`. in this change, the function's signature is changed to match `scan_descriptors_map::value_type` to avoid the unnecessary overhead of the constructor of `tuple<>`. also, the underlying `max_concurrent_for_each()` does not pass an xvalue to the given func; instead, it just passes `*s.begin` to the function, where `s.begin` is an `Iterator` returned by `std::begin(container)`. so let's just use a plain reference as the parameter type for the function. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-11 17:47:19 +08:00
Konstantin Osipov	7309a1bd6b	test: improve logging in ScyllaCluster Print IP addresses and cluster identifiers in more log messages, it helps debugging.	2023-03-10 19:53:19 +03:00
Konstantin Osipov	4ace19928d	raft: (test) test ip address change	2023-03-10 19:52:40 +03:00
Pavel Emelyanov	f84f0a9414	database: Increase verbosity of database::stop() method Add logging messages when stopping (this way or another) various sub-services and helper objects Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-10 19:45:23 +03:00
Pavel Emelyanov	2f316880ae	large_data_handler: Increase verbosity on shutdown It may hang waiting for background handlers, so it's good to know if they exist at all Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-10 19:45:18 +03:00
Alejo Sanchez	e35762241a	api: gossiper: fix alive nodes Fix API call to wait for all shards to reach the current shard 0 gossiper version. Throws when timeout is reached. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-03-10 17:29:11 +01:00
Alejo Sanchez	6c04476561	gms, service: lock live endpoint copy To allow concurrent execution, protect copy of live endpoints with a semaphore.	2023-03-10 17:16:21 +01:00
Pavel Emelyanov	2000494881	large_data_handler: Coroutinize .stop() method Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-10 19:06:14 +03:00
Pavel Emelyanov	e7250e5a3f	Merge 'sstables: add more constness' from Kefu Chai - sstables: mark param of sstable::_from_sstring() const - sstables: mark param of reverse_map() const - sstables: mark static lookup table const Closes #13115 github.com:scylladb/scylladb: sstables: mark static lookup table const sstables: mark param of reverse_map() const sstables: mark param of sstable::*_from_sstring() const	2023-03-10 17:14:56 +03:00
Kamil Braun	51a76e6359	Revert "Merge 'sstables: remove unused function add more constness' from Kefu Chai" This reverts commit `49e0d0402d`, reversing changes made to `25cf325674`. An old version of PR #13115 was accidentally merged into `master` (it was dequeued concurrently while a running next promotion job included it). Revert the merge. We'll merge the new version as a follow-up.	2023-03-10 15:02:28 +01:00
Aleksandra Martyniuk	4808220729	test: extend test_compaction_task.py test/rest_api/test_compaction_task.py is extended so that it checks validity of major compaction run from column family api.	2023-03-10 15:01:22 +01:00
Aleksandra Martyniuk	0918529fdf	api: unify major compaction Major compaction can be started from both storage_service and column_family api. The first allows to compact a subset of tables in given keyspace, while the latter - given table in given keyspace. As major compaction started from storage_service has a wider scope, we use its mechanisms for column_family's one. That makes it more consistent and reduces number of classes that would be needed to cover the major compaction with task manager's tasks.	2023-03-10 15:01:22 +01:00
Pavel Emelyanov	537510f7d2	scylla-gdb: Parse and eval _all_threads without quotes I've no idea why the quotes are there at all, it works even without them. However, with quotes gdb-13 fails to find the _all_threads static thread-local variable _unless_ it's printed with gdb "p" command beforehand. fixes: #13125 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13132	2023-03-10 15:01:22 +01:00
Pavel Emelyanov	b07570406e	distributed_loader: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-10 16:01:09 +03:00
Pavel Emelyanov	f90ea6efc2	distributed_loader: Coroutinize reshape() helper Drop do_with(), keep the needed variable on stack. Replace repeat() with plain loop + yield. Keep track of run_custom_job()'s exception. Indentation is deliberately left broken. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-10 15:37:57 +03:00
Wojciech Mitros	6b8c1823a3	cql3: allow UDTs in permissions on UDFs Currently, when preparing an authorization statement on a specific function, we're trying to "prepare" all cql types that appear in the function signature while parsing the statement. We cannot do that for UDTs, because we don't know the UDTs that are present in the database at parsing time. As a result, such authorization statements fail. To work around this problem, we postpone the "preparation" of cql types until the actual statement validation and execution time. Until then, we store all type strings in the resource object. The "preparation" happens in the `maybe_correct_resource` method, which is called before every `execute` during a `check_access` call. At that point, we have access to the `query_processor`, and as a result, to `user_types_metadata` which allows us to prepare the argument types even for UDTs.	2023-03-10 11:02:33 +01:00
Wojciech Mitros	4f0b3539c5	cql3: add type_parser::parse() method taking user_types_metadata In a future patch, we don't have access to a `user_types_storage` while we want to parse a type, but we do have access to a `user_types_metadata`, which is enough to parse the type. We add a variant of the `type_parser::parse()` that takes a `user_types_metadata` instead of a `user_types_storage` to be able to parse a type also in the described context.	2023-03-10 11:02:33 +01:00
Wojciech Mitros	4182a221d6	schema_change_test: stop using non-existent keyspace The current implementation of CQL type parsing worked even when given a string representing a non-existent keyspace, as long as the parsed type was one of the "native" types. This implementation is going to change, so that we won't parse types given an incorrect keyspace name. When using `do_with_cql_env`, a "ks" keyspace is created by default, and "tests" keyspace is not. The tests for reverse schemas in `schema_change_test` were using the "tests" keyspace, so in order to make the tests work after the future changes, they now use the existing "ks" keyspace.	2023-03-10 11:02:32 +01:00
Wojciech Mitros	b93c7b94eb	cql3: fix parameter names in function resource constructors In some places, the parameter name used when constructing a resource object was 'function_name', while the actual argument was the signature of a function, which is particularly confusing, because function names also appear frequently in these contexts. This patch changes the identifiers to more accurately reflect what they represent.	2023-03-10 11:02:32 +01:00
Wojciech Mitros	9a303fd99c	cql3: handle complex types as when decoding function permissions Currently, we're parsing types that appear in a function resource using abstract_type::parse_type, which only works with simple types. This patch changes it to db::marshal::type_parser::parse, which can also handle collections. We also adjust the test_grant_revoke_udf_permissions test so that it uses both simple and complex types as parameters of the function that we're granting/revoking permissions on.	2023-03-10 11:02:32 +01:00
Wojciech Mitros	438c7fdfa7	cql3: enforce permissions for ALTER FUNCTION Currently, the ALTER permission is only enforced on ALL FUNCTIONS or on ALL FUNCTIONS IN KEYSPACE. This patch enforces the permission also on a specific function.	2023-03-10 11:02:32 +01:00
Piotr Sarna	c4e6925bb6	cql-pytest: add a (failing) test case for UDT in UDF Our permissions system is currently incapable of figuring out user-defined type definitions when preparing functions permissions. This test case creates such a function, and it passes on Cassandra.	2023-03-10 11:02:32 +01:00
Piotr Sarna	63e67c9749	cql-pytest: add a test case for user-defined aggregate permissions This test case is similar to the one for user-defined functions, but checks if aggregate permissions are enforced.	2023-03-10 11:02:32 +01:00
Piotr Sarna	6deebab786	cql-pytest: add tests for function permissions The test case checks that function permissions are enforced for non-superuser users.	2023-03-10 11:01:48 +01:00
Kefu Chai	77643717db	sstables: mark static lookup table const these tables are mappings from symbolic names to their string representation. we don't mutate them. so mark them const. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-10 16:18:29 +08:00
Kefu Chai	0889643243	sstables: mark param of reverse_map() const it does not mutate the map in which the value is looked up, so let's mark map const. also, take this opportunity to use structured binding for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-10 16:18:29 +08:00
Kefu Chai	9eae97c525	sstables: mark param of sstable::*_from_sstring() const neither of the changed function mutates the parameter. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-10 16:18:28 +08:00
Pavel Emelyanov	e3dc60286c	sstable: Remove unused friendship The components_writer class from this list doesn't even exist Also drop the forward declaration of mx::partition_reversing_data_source_impl Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13097	2023-03-10 07:13:18 +02:00
Jan Ciolek	c11f7a9e35	expr_test: more tests for preparing bind variables with type hints Add tests for preparing expr::cast which contains a bind variable, with a known receiver. expr::cast serves as a type hint for the bind variable. It specifies what should be the type of the bind variable, we must check that this type is compatible with the receiver and fail in case it isn't The following cases are tested: Valid: `text_col = (text)?` `int_col = (int)?` Invalid: `text_col = (int)?` `int_col = (text)?` Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 18:31:45 +01:00
Jan Ciolek	a08eb5cb76	prepare_expr: implement preparing expr::cast with no receiver Type inference in cast_prepare_expression was very limited. Without a receiver it just gave up and said that it can't infer the type. It's possible to infer the type - an expression that casts something to type bigint also has type bigint. This can be implemented by creating a fake receiver when the caller didn't specify one. Type of this fake receiver will be c.type and c.arg will be prepared using this receiver. Note that the previous change (changing receiver to cast_type_receiver in prepare_expression) is required to keep the behaviour consistent. Without it we would sometimes prepare c.arg using the original receiver, and sometimes using a receiver with type c.type. Currently it's impossible to test this change on live code. Every place that uses expr::cast specifies a receiver. A unit test is all that can be done at the moment to ensure correctness. In the future this functionality will be used in UDFs. In https://github.com/scylladb/scylladb/pull/12900 it was requested to be able to use a type hint to specify whether WASM code of the function will be sent in binary or text form. The user can convey this by typing either `(blob)?` or `(text)?`. In this case there will be no receiver and type inference would fail. After this change it will work - it's now possible to prepare either of those and get an expression with a known type. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 18:31:45 +01:00
Jan Ciolek	9f8340d211	prepare_expr: use :user formatting in cast_prepare_expression By default expressions are printed using the {:debug} formatting, which is intended for internal use. Error messages should use the {:user} formatting instead. cast_prepare_expression uses the default formatting in a few places that are user facing, so let's change it to use {:user} formatting. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 18:31:45 +01:00
Jan Ciolek	12560b5745	prepare_expr: remove std::get<> in cast_prepare_expression A few times throughout cast_prepare_expression there's a line which uses std::get<> to get the raw type of the cast. `std::get<shared_ptr<cql3_type::raw>>(c.type)` This is a dangerous thing to do. It might turn out that the variant holds a different alternative and then it'll start throwing bad_variant_access. In this case this would happen if someone called cast_prepare_expression on an expression that is already prepared. It's possible to modify the code in a way that avoids doing the std::get altogether. It makes the code more resilient and gives me peace of mind. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 18:31:45 +01:00
Jan Ciolek	7c384de476	prepare_expr: improve cast_prepare_expression Preparing expr::cast had some artificial limitations. Things like this worked: `blob_col = (blob)funcReturnsInt()` But this didn't: `blob_col = (blob)(int)1234` This is caused by the line: `prepare_expression(c.arg, db, keyspace, schema_opt, receiver)` Here the code prepares the expression to be cast using the original receiver which was passed to cast_prepare_expression. In the example above this meant that it tried to prepare untyped_constant(1234) using a receiver with type blob. This failed because an integer literal is invalid for a blob column. To me it looks like a mistake. What it should do instead is prepare the int literal using the type (int) and then see if int can be cast to blob, by checking if these types have compatible binary representation. This can be achieved by using `cast_type_receiver` instead of `receiver`. Making this small change makes it possible to use the cast in many situations where it was previously impossible. The tests have to be updated to reflect the change, some of them now deviate from Cassandra, so they have to be marked scylla_only. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 18:31:41 +01:00
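The receiver change described in the commit above can be illustrated with a toy model. This is a minimal Python sketch, not Scylla's C++ `prepare_expression`: the classes `IntLiteral`, `FunctionCall`, `Cast`, the `COMPATIBLE` table and the `use_cast_type_receiver` flag are all hypothetical names invented for the illustration, and the type rules are a deliberately tiny subset.

```python
from dataclasses import dataclass

# Hypothetical, tiny compatibility table: (source, target) pairs whose bytes
# can be reinterpreted unchanged. Not Scylla's real rules.
COMPATIBLE = {("int", "blob"), ("bigint", "blob"), ("text", "blob")}


def is_compatible(src, dst):
    return src == dst or (src, dst) in COMPATIBLE


class InvalidRequest(Exception):
    pass


@dataclass
class IntLiteral:        # stands in for an untyped integer constant
    value: int


@dataclass
class FunctionCall:      # stands in for a call with a known return type
    return_type: str


@dataclass
class Cast:              # stands in for a C-style cast: (type)arg
    type: str
    arg: object


def prepare(expr, receiver, use_cast_type_receiver=True):
    """Return the type of expr, or raise if it does not fit the receiver."""
    if isinstance(expr, IntLiteral):
        if receiver not in ("int", "bigint"):
            raise InvalidRequest(f"integer literal is invalid for {receiver}")
        return receiver
    if isinstance(expr, FunctionCall):
        if not is_compatible(expr.return_type, receiver):
            raise InvalidRequest(f"{expr.return_type} does not fit {receiver}")
        return expr.return_type
    if isinstance(expr, Cast):
        # Old behaviour: prepare the argument against the outer receiver.
        # New behaviour: prepare it against a receiver of the cast type.
        arg_receiver = expr.type if use_cast_type_receiver else receiver
        arg_type = prepare(expr.arg, arg_receiver, use_cast_type_receiver)
        if not is_compatible(arg_type, expr.type):
            raise InvalidRequest(f"cannot cast {arg_type} to {expr.type}")
        if not is_compatible(expr.type, receiver):
            raise InvalidRequest(f"{expr.type} does not fit {receiver}")
        return expr.type
    raise TypeError(f"unknown expression {expr!r}")


# blob_col = (blob)(int)1234
nested = Cast("blob", Cast("int", IntLiteral(1234)))
print(prepare(nested, "blob"))  # blob
```

With `use_cast_type_receiver=False` the same nested expression raises `InvalidRequest`, because the literal is prepared against a `blob` receiver, which mirrors the failure the commit message describes.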
Piotr Sarna	62458b8e4f	cql3: enforce permissions on function calls Only users with EXECUTE permission are able to use the function in SELECT statements.	2023-03-09 17:51:17 +01:00
Piotr Sarna	4624934032	selection: add a getter for used functions The function allows extracting used function definitions from given selection. Thanks to that, it will be possible to verify if the callee has proper permissions to execute given functions.	2023-03-09 17:51:17 +01:00
Piotr Sarna	d95912c369	abstract_function_selector: expose underlying function It will be needed later in order to check this function's permissions.	2023-03-09 17:51:17 +01:00
Piotr Sarna	488934e528	cql3: enforce permissions on DROP FUNCTION Only users with DROP permission are allowed to drop user-defined functions.	2023-03-09 17:51:15 +01:00
Piotr Sarna	e8afcf7796	cql3: enforce permissions for CREATE FUNCTION Only users with CREATE permissions are allowed to create user-defined functions.	2023-03-09 17:50:56 +01:00
Piotr Sarna	d10799a834	client_state: add functions for checking function permissions The helper functions will be later used to enforce permissions for user-defined functions.	2023-03-09 17:50:56 +01:00
Piotr Sarna	8de1017691	cql-pytest: add a case for serializing function permissions This test case checks that granting function permissions results in correct serialization of the permissions - so that reading system_auth.role_permissions and listing the permissions via CQL with `LIST permission OF role` works in a compatible way with both Scylla and Cassandra.	2023-03-09 17:50:56 +01:00
Piotr Sarna	aa4c15a44a	cql3: allow specifying function permissions in CQL This commit allows users to specify the following resources: - ALL FUNCTIONS - ALL FUNCTIONS IN KEYSPACE ks - FUNCTION f(int, double) The permissions set for these resources are not enforced yet.	2023-03-09 17:50:56 +01:00
Piotr Sarna	5b662dd447	auth: add functions_resource to resources This commit adds "functions" resource to our authorization resources. The implementation strives to be compatible with Cassandra both from CQL level and serialization, i.e. so that entries in system_auth.role_permissions table will be identical if CassandraAuthorizer is used. This commit adds a way of representing these resources in-memory, but they are not enforced as permissions yet. The following permissions are supported: ``` CREATE ALL FUNCTIONS CREATE ALL FUNCTIONS IN KEYSPACE <ks> ALTER ALL FUNCTIONS ALTER ALL FUNCTIONS IN KEYSPACE <ks> ALTER FUNCTION <f> DROP ALL FUNCTIONS DROP ALL FUNCTIONS IN KEYSPACE <ks> DROP FUNCTION <f> AUTHORIZE ALL FUNCTIONS AUTHORIZE ALL FUNCTIONS IN KEYSPACE <ks> AUTHORIZE FUNCTION <f> EXECUTE ALL FUNCTIONS EXECUTE ALL FUNCTIONS IN KEYSPACE <ks> EXECUTE FUNCTION <f> ``` as per https://cassandra.apache.org/doc/latest/cassandra/cql/security.html#cql-permissions	2023-03-09 17:50:19 +01:00
Jan Ciolek	e4a3e2ac14	cql-pytest/test_lwt: test LWT update with empty clustering range Add a test case which performs an LWT UPDATE, but the clustering key has 0 possible values, because it's supposed to be equal to two different values. This currently causes a crash, see https://github.com/scylladb/scylladb/issues/13129 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 15:44:10 +01:00
Jan Ciolek	5e5e4c5323	cql-pytest/test_lwt: test LWT update with empty partition range Add a test case which performs an LWT UPDATE, but the partition key has 0 possible values, because it's supposed to be equal to two different values. Such queries used to cause problems in the past. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-09 15:43:24 +01:00
Anna Stuchlik	6aff78ded2	doc: Remove Enterprise content from OSS docs Related: https://github.com/scylladb/scylladb/issues/13119 This commit removes the pages that describe Enterprise only features from the Open Source documentation: - Encryption at Rest - Workload Prioritization - LDAP Authorization - LDAP Authentication - Audit In addition, it removes most of the information about Incremental Compaction Strategy (ICS), which is replaced with links to the Enterprise documentation. The changes above required additional updates introduced with this commit: - The links to Enterprise-only features are replaced with the corresponding links in the Enterprise documentation. - The redirections are added for the removed pages to be redirected to the corresponding pages in the Enterprise documentation. This commit must be reverted in the scylla-enterprise repository to avoid deleting the Enterprise-only content from the Enterprise docs. Closes #13123	2023-03-09 15:40:43 +02:00
Botond Dénes	11dde4b80b	reader_permit: add wait_for_execution state Used while the permit is in the _ready_list, waiting for the execution loop to pick it up. This just acknowledges the existence of this wait-state. This state will now show up in permit diagnostics printouts and we can now determine whether a permit is waiting for execution, without checking which queue it is in.	2023-03-09 07:11:51 -05:00
Botond Dénes	6229f8b1a6	reader_concurrency_semaphore: make wait lists intrusive Instead of using expiring_fifo to store queued permits, use the same intrusive list mechanism we use to keep track of all permits. Permits are now moved between the _permit_list and the wait queues, depending on which state they are in. This means _permit_list is now not the definitive list containing all permits, instead it is the list containing all permits that are not in a more specialized queue at the moment. Code wishing to iterate over all permits should now use foreach_permits(). For outside code, this was already the only way and internal users are already patched. Making the wait lists intrusive allows us to dequeue a permit from any position, with nothing but a permit reference at hand. It also means the wait queues don't have any additional memory requirements, other than the memory for the permit itself. Timeout while being queued is now handled by the permit's on_timeout() callback.	2023-03-09 07:11:49 -05:00
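The property the commit above relies on, removing an element from whichever list it is in with nothing but a reference to the element, can be sketched as follows. This is an illustrative Python model only; `Hook`, `IntrusiveList` and `ToyPermit` are hypothetical names, and the real code uses C++ intrusive lists.

```python
class Hook:
    """Link fields embedded in the element itself (the 'intrusive' part)."""

    def __init__(self):
        self.prev = None
        self.next = None

    def is_linked(self):
        return self.next is not None

    def unlink(self):
        # O(1): no list reference and no search needed.
        if self.is_linked():
            self.prev.next = self.next
            self.next.prev = self.prev
            self.prev = self.next = None


class IntrusiveList:
    """Circular doubly linked list with a sentinel head."""

    def __init__(self):
        self._head = Hook()
        self._head.prev = self._head.next = self._head

    def push_back(self, node):
        node.unlink()  # an element lives in at most one list at a time
        last = self._head.prev
        node.prev, node.next = last, self._head
        last.next = node
        self._head.prev = node

    def __iter__(self):
        cur = self._head.next
        while cur is not self._head:
            nxt = cur.next  # read next first so the caller may unlink cur
            yield cur
            cur = nxt


class ToyPermit(Hook):
    def __init__(self, name):
        super().__init__()
        self.name = name


permit_list, wait_list = IntrusiveList(), IntrusiveList()
a, b, c = ToyPermit("a"), ToyPermit("b"), ToyPermit("c")
for p in (a, b, c):
    permit_list.push_back(p)

wait_list.push_back(b)  # moves b out of permit_list and into wait_list
b.unlink()              # dequeue with only the permit reference at hand
```

Because the links live inside the element, the lists themselves allocate nothing per entry, which matches the memory argument made in the commit message.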
Benny Halevy	0f07a24889	storage_service: node_ops_signal_abort: print a warning when signaling abort Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 14:10:10 +02:00
Benny Halevy	2a1015dced	storage_service: s/node_ops_singal_abort/node_ops_signal_abort/ Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 14:09:09 +02:00
Benny Halevy	6394e9acf7	storage_service: node_ops_abort: add log messages So we can correlate the respective messages on the node_ops coordinator side. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 14:04:56 +02:00
Benny Halevy	3652025062	storage_service: wire node_ops_ctl for node operations Use the node_ops_ctl methods for the basic flow of: start, start_heartbeat_updater, prepare, send_to_all, done\|abort As well for querying pending ops for decommission. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 14:02:31 +02:00
Botond Dénes	9ea9a48dbc	reader_concurrency_semaphore: move most wait_queue methods out-of-line They will soon depend on the definition of the reader_permit::impl, which is only available in the .cc file.	2023-03-09 06:53:11 -05:00
Botond Dénes	1d27dd8f0e	reader_concurrency_semaphore: store permits directly in queues Instead of the `entry` wrapper. In _wait_list and _ready_list, that is. Data stored in the `entry` wrapper is moved to a new `reader_permit::auxiliary_data` type. This makes the reader permit self-sufficient. This in turn prepares the ground for the ability to de-queue a permit from any queue, with nothing but a permit reference at hand: no need to have back pointer to wrappers and/or iterators.	2023-03-09 06:53:11 -05:00
Botond Dénes	bcfb8715f9	reader_permit: introduce (private) operator * and -> Currently the reader_permit has some private methods that only the semaphore's internals call. But this method of communication is not consistent, other times the semaphore accesses the permit impl directly, calling methods on that. This commit introduces operator * and -> for reader_permit. With this, the semaphore internals always call the reader_permit::impl methods directly, either via a direct reference, or via the above operators. This makes the permit interface a little narrower and reduces boilerplate code.	2023-03-09 06:53:11 -05:00
Botond Dénes	f5b80fdfd8	reader_concurrency_semaphore: remove redundant waiters() member There is now a field in stats with the same information, use that.	2023-03-09 06:53:11 -05:00
Botond Dénes	74a5981dbe	reader_concurrency_semaphore: add waiters counter Use it to keep track of all permits that are currently waiting on something: admission, memory or execution. Currently we keep track of size, by adding up the result of size() of the various queues. In future patches we are going to change the queues such that they will not have constant time size anymore, so move to an explicit counter in preparation for that. Another change this commit makes is to also include ready list entries in this counter. Permits in the ready list are also waiters, they wait to be executed. Soon we will have a separate wait state for this too.	2023-03-09 06:53:11 -05:00
Botond Dénes	2694aa1078	reader_permit: use check_abort() for timeout Instead of having callers use get_timeout(), then compare it against the current time, set up a timeout timer in the permit, which assigns a new `_ex` member (a `std::exception_ptr`) to the appropriate exception type when it fires. Callers can now just poll check_abort() which will throw when `_ex` is not null. This is more natural and allows for more general reasons for aborting reads in the future. This prepares the ground for timeouts being managed inside the permit, instead of by the semaphore. Including timing out while in a wait queue.	2023-03-09 06:53:09 -05:00
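The polling pattern described in the commit above can be sketched in a few lines. This is a hedged Python analogue, not the reader_permit implementation: `TimedPermit` and `ReadTimedOut` are hypothetical names, and `threading.Timer` stands in for the permit's timeout timer.

```python
import threading


class ReadTimedOut(Exception):
    pass


class TimedPermit:
    """Toy permit that records an exception when its timer fires."""

    def __init__(self, timeout_seconds):
        self._ex = None  # plays the role of the `_ex` member
        self._timer = threading.Timer(timeout_seconds, self._on_timeout)
        self._timer.daemon = True
        self._timer.start()

    def _on_timeout(self):
        # The timer only stores the exception; it never interrupts the reader.
        self._ex = ReadTimedOut("read timed out")

    def check_abort(self):
        # Callers poll this instead of comparing a deadline to the clock.
        if self._ex is not None:
            raise self._ex

    def cancel(self):
        self._timer.cancel()
```

A reader would call `check_abort()` at convenient points in its loop. Because the stored value is an arbitrary exception, the same hook could carry abort reasons other than timeouts, which is the generalization the commit message points to.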
Benny Halevy	d322bbf6ff	storage_service: add node_ops_ctl class to formalize all node_ops flow All node operations we currently support go through similar basic flow and may add some op-specific logic around it. 1. Select the nodes to sync with (this is op specific). 2. heartbeat updater 3. send prepare req 4. perform the body of the node operation 5. send done -- on any error: send abort node_ops_ctl formalizes all those steps and makes sure errors are handled in all steps, and the error causing abort is not masked by errors in the abort processing, and is propagated upstream. Some of the printouts repeat the node operation description to remain backward compatible so as not to break dtests that wait for them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 13:48:34 +02:00
Wojciech Mitros	2fd6d495fa	wasm: move compilation to an alien thread The compilation of wasm UDFs is performed by a call to a foreign function, which cannot be divided with yielding points and, as a result, causes long reactor stalls for big UDFs. We avoid them by submitting the compilation task to a non-seastar std::thread, and retrieving the result using seastar::alien. The thread is created at the start of the program. It executes tasks from a queue in an infinite loop. All seastar shards reference the thread through a std::shared_ptr to an `alien_thread_runner`. Considering that the compilation takes a long time anyway, the alien_thread_runner is implemented with focus on simplicity more than on performance. The tasks are stored in an std::queue; reading from and writing to it is synchronized using an std::mutex, and an std::condition_variable waits until the queue has elements. When the destructor of the alien runner is called, an std::nullopt sentinel is pushed to the queue, and after all remaining tasks are finished and the sentinel is read, the thread finishes.	2023-03-09 11:54:38 +01:00
Botond Dénes	23f4e250c2	reader_concurrency_semaphore: maybe_dump_permit_diagnostics(): remove permit list param This param is from a time when _permit_list was not accessible from the outside, so it was passed along the semaphore instance to avoid making the diagnostics methods friends. To allow the semaphore freedom in how permits are stored, the diagnostics code is instead made to use foreach_permit(), instead of accessing the underlying list directly. As the diagnostics code wants reader_permit::impl& directly, a new variant of foreach_permit() passing impl references is introduced.	2023-03-09 05:19:59 -05:00
Botond Dénes	59dc15682b	reader_concurrency_semaphroe: make foreach_permit() const It already is conceptually, as it passes const references to the permits it iterates over. The only reason it wasn't const before is a technical issue which is solved here with a const_cast.	2023-03-09 05:19:59 -05:00
Botond Dénes	c86136c853	reader_permit: add get_schema() and get_op_name() accessors	2023-03-09 05:19:59 -05:00
Botond Dénes	9dd2cd07ef	reader_concurrency_semaphore: mark maybe_dump_permit_diagnostics as noexcept It is in fact noexcept and so it is expected to be, so document this.	2023-03-09 05:19:59 -05:00
Benny Halevy	f3d6868738	repair: node_ops_cmd_request: add print function Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 11:42:03 +02:00
Benny Halevy	130d6faa06	repair: do_decommission_removenode_with_repair: log ignore_nodes Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 11:42:03 +02:00
Benny Halevy	ac13e1f432	repair: replace_with_repair: get ignore_nodes as unordered_set Prepare for following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 11:42:03 +02:00
Benny Halevy	78b0222842	gossiper: get_generation_for_nodes: get nodes as unordered_set Prepare for following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 11:42:03 +02:00
Benny Halevy	28eb11553b	storage_service: don't let node_ops abort failures mask the real error Currently failing to abort a node operation will throw and mask the original failure handled in the catch block. See #12333 for example. Fixes #12798 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-03-09 11:42:03 +02:00
Botond Dénes	49e0d0402d	Merge 'sstables: remove unused function add more constness' from Kefu Chai - sstables: remove unused function - sstables: mark param of sstable::*_from_sstring() const - sstables: mark param of reverse_map() const - sstables: mark static lookup table const Closes #13115 * github.com:scylladb/scylladb: sstables: mark static lookup table const sstables: mark param of reverse_map() const sstables: mark param of sstable::*_from_sstring() const sstables: remove unused function	2023-03-09 11:29:28 +02:00
Pavel Emelyanov	47df084363	test,sstables: Remove path from make_sstable_easy() The method in question is only called with env's tempdir, so there's no point in explicitly passing it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	8297ac0082	test,lib: Remove wrapper over reusable_sst and move the comment There's a wonderful comment describing what the reusable_sst is for near one of its wrappers. It's better to drop the wrapper and move the comment to where it belongs. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	27d45df35f	test: Make "compact" test case use env dir Same as most of the previous work -- remove the explicit capturing of env's tempdir over the test. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	fdff97a294	test,compaction: Use env tempdir in some more cases Both already do so, but get the tempdir explicitly. It's possible to make them much shorter by not carrying this variable over the code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	19ef07b059	test,compaction: Make check_compacted_sstables() use env's dir It's in fact using it already via argument. Next patch will do the same with another call, but having this change separately makes the next patch shorter and easier to review. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	ef8928f2cc	test: Relax making sstable with sequential generation Many test cases populate sstable with a factory that at the same time serves as a stable maintainer of a monotonic generation. Those can be greatly relaxed by re-using the recently introduced generation from the test_env. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	be7f4ff53a	test/sstable::test_env: Keep track of auto-incrementing generation Lots of test cases make sstables with monotonically incrementing generation values. In Scylla code this counter is maintained in class table, but sstable tests do not always have it. To mimic this behavior, the test_env can keep track of the generation, so that callers just don't mess with it (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	bc20879971	test/lib: Add sstable maker helper without factory There's a make_sstable_containing() helper that creates sstable and populates it with mutations (and makes some post validation). The helper accepts a factory function that should make sstable for it. This patch shuffles this helper a bit by introducing an overload that populates (and validates) the already existing sstable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	2bbc59dd58	test: Remove last occurrence of test_env::do_with(rval, ...) There's the lonely test case that uses the mentioned template to carry its own instance of tempdir over its lifetime. Patch the case to re-use the already existing env's tempdir and drop the template. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	4bd79dc900	test,sstables: Dont mess with tempdir where possible Beneficiary of the previous patch -- those cases that make sstables in env's tempdir can now enjoy not mentioning this explicitly and letting the env specify the sstable making path itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	dfcfe0a355	test/sstable::test_env: Add dir-less sstables making helpers Lots of (most of) test cases out there generate sstables inside env's temporary directory. This patch adds some sugar to env that will allow test cases to omit the explicit env.tempdir() call. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	d28589a2f7	test,sstables: Use sstables::test_env's tempdir with sweeper Continuation of the previous patch. Some test cases are sensitive to having the temp directory clean, so patch them similarly, but equip with the sweeper on entry instead of their own tempdir instance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:48 +03:00
Pavel Emelyanov	904853cd7b	test,sstables: Use sstables::test_env's tempdir The one is maintained by the env throughout its lifetime. For many test cases there's no point in generating tempdir on their own, so just switch to using env's one. The code gets longer lines, but this is going to change really soon. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:47 +03:00
Pavel Emelyanov	21e70e7edd	test/lib: Add tempdir sweeper This is a RAII-sh helper that cleans temp directory on destruction. To be used in cases when a test needs to do several checks over clean temporary directory (future patches). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:47 +03:00
Pavel Emelyanov	090e007e30	test/lib: Open-code make_sstabl_easy into make_sstable The former helper is going to get rid of the fs::path& dir argument, but the latter cannot yet live without it. The simplest solution is to open-code the helper until better times. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:47 +03:00
Pavel Emelyanov	8d727701a4	test: Remove vector of mutation interposer from test_key_count_estimation The test generates a vector of mutation to be later passed into make_sstable() helper which just applies them to memtable. The test case can generate memtable directly. This makes it possible to stop using the local tempdir in this test case by future patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-09 08:21:47 +03:00
Kefu Chai	87a6cb5925	sstables: mark static lookup table const these tables are mappings from symbolic names to their string representation. we don't mutate them. so mark them const. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-09 12:40:37 +08:00
Kefu Chai	c18709d4a1	sstables: mark param of reverse_map() const it does not mutate the map in which the value is looked up, so let's mark map const. also, take this opportunity to use structured binding for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-09 12:40:37 +08:00
Kefu Chai	4128ab2029	sstables: mark param of sstable::*_from_sstring() const neither of the changed function mutates the parameter. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-09 12:40:37 +08:00
Kefu Chai	c211b272f7	sstables: remove unused function `sstable::version_from_sstring()` is used nowhere, so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-09 12:40:37 +08:00
Avi Kivity	25cf325674	Merge 'api: s/request/http::request/' from Kefu Chai - api: reference httpd::* symbols like 'httpd::*' - alternator: using chrono_literals before using it - api: s/request/http::request/ the last two commits were inspired by Pavel's comment: > It looks like api/ code was caught by some using namespace seastar::httpd shortcut. they should be landed before we merge and include https://github.com/scylladb/seastar/pull/1536 in Scylla. Closes #13095 * github.com:scylladb/scylladb: api: reference httpd::* symbols like 'httpd::*' alternator: using chrono_literals before using it api: s/request/http::request/	2023-03-08 18:08:21 +02:00
Avi Kivity	a96fcdaac6	Merge 'distributed_loader: print log without using fmt::format() and fix of typo' from Kefu Chai - distributed_loader: print log without using fmt::format() - distributed_loader: correct a typo in comment Closes #13108 * github.com:scylladb/scylladb: distributed_loader: correct a typo in comment distributed_loader: print log without using fmt::format()	2023-03-08 17:55:25 +02:00
Kefu Chai	3488b68413	build: cmake: link Boost::regex against ICU::uc Boost::regex references icu_67::Locale::Locale, so let's fix this. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	51ff2907b8	build: cmake: link sstables against libdeflate sstables is the only place where libdeflate is used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	2a18d470cc	build: cmake: add missing sources to test-lib Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	0b3d25ab1b	build: cmake: add missing linkages these dependencies were found when trying to compile `user_function_test`. whenever a library libfoo references another one, say, libbar, the corresponding linkage from libfoo to libbar is added. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	21a7c439bb	build: cmake: find Snappy before using it Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	c8f762b6d0	build: cmake: extract scylla-main out so tests and other libraries can link against it. also, drop the unused abseil library linkages. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	d07adcbe74	build: cmake: extract index, repair and data_dictionary out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:53:42 +08:00
Kefu Chai	b1484a2a5f	build: cmake: document add_scylla_test() this change reuses part of Botond Dénes's work to add a full-blown CMakeLists.txt to build scylla. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:26:30 +08:00
Kefu Chai	b0433bf82b	build: cmake: remove test which does not exist yet it was an oversight in `11124ee972`, which added a test not yet included in master HEAD. so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 22:26:30 +08:00
Nadav Har'El	a4a318f394	cql: USING TTL 0 means unlimited, not default TTL Our documentation states that writing an item with "USING TTL 0" means it should never expire. This should be true even if the table has a default TTL. But Scylla mistakenly handled "USING TTL 0" exactly like having no USING TTL at all (i.e., it took the default TTL, instead of unlimited). We had two xfailing tests demonstrating that Scylla's behavior in this is different from Cassandra. Scylla's behavior in this case was also undocumented. By the way, Cassandra used to have the same bug (CASSANDRA-11207) but it was fixed already in 2016 (Cassandra 3.6). So in this patch we fix Scylla's "USING TTL 0" behavior to match the documentation and Cassandra's behavior since 2016. One xfailing test starts to pass and the second test passes this bug and fails on a different one. This patch also adds a third test for "USING TTL ?" with UNSET_VALUE - it behaves, on both Scylla and Cassandra, like a missing "USING TTL". The origin of this bug was that after parsing the statement, we saved the USING TTL in an integer, and used 0 for the case of no USING TTL given. This meant that we couldn't tell if we have USING TTL 0 or no USING TTL at all. This patch uses an std::optional so we can tell the case of a missing USING TTL from the case of USING TTL 0. Fixes #6447 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13079	2023-03-08 16:18:23 +02:00
Kefu Chai	43b6f7d8d3	distributed_loader: correct a typo in comment s/to many/too many/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 18:17:43 +08:00
Kefu Chai	b6991f5056	distributed_loader: print log without using fmt::format() logger.info() is able to format the given arguments with the format string, so let's just let it do its job. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-08 18:17:43 +08:00
Alejo Sanchez	f55e91d797	gms, service: live endpoint copy method Move replication logic for live endpoint across shards to a separate method This will be used by API get alive nodes. As this is now in a method and outside gossiper::run(), assert it's called from shard 0. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-03-08 10:45:35 +01:00
Nadav Har'El	beb9a8a9fd	docs/alternator: recommend to disable auto_snapshot In issue #5283 we noted that the auto_snapshot option is not useful in Alternator (as we don't offer any API to restore the snapshot...), and suggested that we should automatically disable this option for Alternator tables. However, this issue has been open for more than three years, and we never changed this default. So until we solve that issue - if we ever do - let's add a paragraph in docs/alternator/alternator.md recommending to the user to disable this option in the configuration themselves. The text explains why, and also provides a link to the issue. Refs #5283 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13103	2023-03-08 10:50:59 +02:00
Jan Ciolek	0417c48bdc	cql-pytest: test unset value in UPDATE and LWT UPDATE Add a test which performs an UPDATE and tries to pass an UNSET_VALUE as a value for the primary key. There is also an LWT variant of this test that tries to set an UNSET_VALUE in the IF condition. These two tests are analogous to test_insert_update_where and test_insert_update_where_lwt, but use an UPDATE instead of INSERT. It's useful to test UPDATE as well as INSERT. When I was developing a fix for #13001 I initially added the condition for unset value inside insert_statement, but this didn't handle update statements. These two tests allowed me to see that UPDATE still causes a crash. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #13058	2023-03-08 10:39:26 +02:00
Raphael S. Carvalho	3fae46203d	replica: Fix undefined behavior in table::generate_and_propagate_view_updates() Undefined behavior because the evaluation order is undefined. With GCC, where evaluation is right-to-left, schema will be moved once it's forwarded to make_flat_mutation_reader_from_mutations_v2(). The consequence is that memory tracking of mutation_fragment_v2 (for tracking only permit used by view update), which uses the schema, can be incorrect. However, it's more likely that Scylla will crash when estimating memory usage for row, which accesses schema column information using schema::column_at(), which in turn asserts that the requested column really does exist. Fixes #13093. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #13092	2023-03-08 07:38:55 +02:00
Nadav Har'El	ef50e4022c	test: drop our "pytest" wrapper script When Fedora 37 came out, we discovered that its "pytest" script started to run Python with the "-s" option, which caused problems for packages installed personally via pip. We fixed this by adding our own wrapper script test/pytest. But this bug (https://bugzilla.redhat.com/show_bug.cgi?id=2152171) was already fixed in Fedora 37, and the new version already reached our dbuild. So we no longer need this wrapper script. Let's remove it. Fixes #12412 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13083	2023-03-08 07:31:37 +02:00
Jan Ciolek	63a7235017	prepare_expr: improve readability in cast_prepare_expression cast_prepare_expression takes care of preparing expr::cast, which is responsible for CQL C-style casts. At first glance it can be hard to figure out what exactly it does, so I added some comments to make things clearer. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-08 03:24:17 +01:00
Jan Ciolek	03d37bdc14	cql-pytest: test expr::cast in test_cast.py CQL supports C-style casts with the destination type specified inside parentheses, e.g. `blob_column = (blob)funcThatReturnsInt()`. These casts can be used to convert values of types that have compatible binary representation, or as a type hint to specify the type where the situation is ambiguous. I didn't find any cql-pytest tests for this feature, so I added some. It looks like the feature works, but only partially. Doing things like this works: `blob_column = (blob)funcThatReturnsInt()` But trying to do something a bit more complex fails: `blob_column = (blob)(int)1234` This is the case in both Cassandra and Scylla; the tests introduced in this commit pass on both of them. In future commits I will extend this feature to support the more complex cases as well, then some tests will have to be marked scylla_only. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-03-08 03:24:13 +01:00
Nadav Har'El	cdedc79050	cql: add configurable restriction of minimum RF We have seen users unintentionally use RF=1 or RF=2 for a keyspace. We would like to have an option for a minimal RF that is allowed. Cassandra recently added, in Cassandra 4.1 (see apache/cassandra@5fdadb2 and https://issues.apache.org/jira/browse/CASSANDRA-14557), exactly such an option, called "minimum_keyspace_rf" - so we chose to use the same option name in Scylla too. This means that unlike the previous "safe mode" options, the name of this option doesn't start with "restrict_". The value of the minimum_keyspace_rf option is a number, and lower replication factors are rejected with an error like: cqlsh> CREATE KEYSPACE x WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor': 2 }; ConfigurationException: Replication factor replication_factor=2 is forbidden by the current configuration setting of minimum_keyspace_rf=3. Please increase replication factor, or lower minimum_keyspace_rf set in the configuration. This restriction applies to both CREATE KEYSPACE and ALTER KEYSPACE operations. It applies to both SimpleStrategy and NetworkTopologyStrategy, for all DCs or a specific DC. However, a replication factor of zero (0) is not forbidden - this is the way to explicitly request not to replicate (at all, or in a specific DC). For the time being, minimum_keyspace_rf=0 is still the default, which means that any replication factor is allowed, as before. We can easily change this default in a followup patch. Note that in the current implementation, trying to use RF below minimum_keyspace_rf is always an error - we don't have a syntax to make it just a warning. In any case the error message explains exactly which configuration option is responsible for this restriction. Fixes #8891. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #9830	2023-03-07 19:04:06 +02:00
Kamil Braun	2b44631ded	Merge 'storage_service: Make node operations safer by detecting asymmetric abort' from Tomasz Grabiec This patch fixes a problem which affects decommission and removenode which may lead to data consistency problems under conditions which lead one of the nodes to unilaterally decide to abort the node operation without the coordinator noticing. If this happens during streaming, the node operation coordinator would proceed to make a change in the gossiper, and only later detect that one of the nodes aborted during sending of decommission_done or removenode_done command. That's too late, because the operation will be finalized by all the nodes once gossip propagates. It's unsafe to finalize the operation while another node aborted. The other node reverted to the old topology, with which they were running for some time, without considering the pending replica when handling requests. As a result, we may end up with consistency issues. Writes made by those coordinators may not be replicated to CL replicas in the new topology. Streaming may have missed to replicate those writes depending on timing. It's possible that some node aborts but streaming succeeds if the abort is not due to network problems, or if the network problems are transient and/or localized and affect only heartbeats. There is no way to revert after we commit the node operation to the gossiper, so it's ok to close node_ops sessions before making the change to the gossiper, and thus detect aborts and prevent later aborts after the change in the gossiper is made. This is already done during bootstrap (RBNO enabled) and replacenode. This patch changes removenode to also take this approach by moving sending of remove_done earlier. We cannot take this approach with decommission easily, because decommission_done command includes a wait for the node to leave the ring, which won't happen before the change to the gossiper is made.
Separating this from decommission_done would require protocol changes. This patch adds a second-best solution, which is to check if sessions are still there right before making a change to the gossiper, leaving decommission_done where it was. The race can still happen, but the time window is now much smaller. The PR also lays down infrastructure which enables testing the scenarios. It makes node ops watchdog periods configurable, and adds error injections. Fixes #12989 Refs #12969 Closes #13028 * github.com:scylladb/scylladb: storage_service: node ops: Extract node_ops_insert() to reduce code duplication storage_service: Make node operations safer by detecting asymmetric abort storage_service: node ops: Add error injections service: node_ops: Make watchdog and heartbeat intervals configurable	2023-03-07 17:36:51 +01:00
Nadav Har'El	e69c9069d6	Merge 'build: enable more warnings' from Kefu Chai when comparing the disabled warnings specified by `configure.py` and the ones specified by `cmake/mode.common.cmake`, it turns out we are now able to enable more warning options. so let's enable them. the change was tested using Clang-17 and GCC-13. there are many errors from GCC-13, like: ``` /home/kefu/dev/scylladb/db/view/view.hh:114:17: error: declaration of ‘column_kind db::view::clustering_or_static_row::column_kind() const’ changes meaning of ‘column_kind’ [-fpermissive] 114 | column_kind column_kind() const { | ^~~~~~~~~~~ ``` so the build with GCC failed. and with this change, Clang-17 is able to build the tree without warnings. Closes #13096 * github.com:scylladb/scylladb: build: enable more warnings test: do not initialize plain number with {} test: do not initialize a time_t with braces	2023-03-07 17:37:54 +02:00
Wojciech Mitros	4609a45ce3	wasm: convert compilation to a future After we move the compilation to an alien thread, the completion of the compilation will be signaled by fulfilling a seastar promise. As a result, the `precompile` function will return a future, and because of that, other functions that use the `precompile` function will also return futures. We can do all the necessary adjustments beforehand, so that the actual patch that moves the compilation will contain fewer irrelevant changes.	2023-03-07 14:27:38 +01:00
Avi Kivity	6aa91c13c5	Merge 'Optimize topology::compare_endpoints' from Benny Halevy The code for compare_endpoints originates at the dawn of time (`bc034aeaec`) and is called on the fast path from storage_proxy via `sort_by_proximity`. This series considerably reduces the function's footprint by: 1. carefully coding the many comparisons in the function so as to reduce the number of conditional branches (apparently the compiler isn't doing a good enough job at optimizing it in this case) 2. avoiding the sstring copy in topology::get_{datacenter,rack} Closes #12761 * github.com:scylladb/scylladb: topology: optimize compare_endpoints to_string: add print operators for std::{weak,partial}_ordering utils: to_sstring: deinline std::strong_ordering print operator move to_string.hh to utils/ test: network_topology: add test_topology_compare_endpoints	2023-03-07 15:17:19 +02:00
Kamil Braun	fe14d14ce9	Merge 'Eliminate extraneous copies of dht::token_range_vector' from Benny Halevy In several places we copy token range vectors where we could move them and eliminate unnecessary memory copies. Ref #11005 Closes #12344 * github.com:scylladb/scylladb: dht/range_streamer: stream_async: move ranges_to_stream to do_streaming streaming: stream_session: maybe_yield streaming: stream_session: prepare: move token ranges to add_transfer_ranges streaming: stream_plan: transfer_ranges: move token ranges towards add_transfer_ranges dht/range_streamer: stream_async: do_streaming: move ranges downstream dht/range_streamer: add_ranges: clear_gently ranges_for_keyspace dht/range_streamer: get_range_fetch_map: reduce copies dht/range_streamer: add_ranges: move ranges down-stream dht/boot_strapper: move ranges to add_ranges dht/range_streamer: stream_async: incrementally update _nr_ranges_remaining dht/range_streamer: stream_async: erase from range_vec only after do_streaming success	2023-03-07 13:46:33 +01:00
Nadav Har'El	f05ea80fb5	test/cql-pytest: remove unused async marker One test in test/cql-pytest/test_batch.py accidentally had the asyncio marker, despite not using any async features. Remove it. The test still runs fine. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13002	2023-03-07 14:33:34 +02:00
Botond Dénes	3f0ace0114	Merge 'cmake: sync with `configure.py` (10/n)' from Kefu Chai - build: cmake: use different names for output of check_cxx_compiler_flag - build: cmake: only add supported warning flags to CMAKE_CXX_FLAGS - build: cmake: limit the number of link job Closes #13098 * github.com:scylladb/scylladb: build: cmake: limit the number of link job build: cmake: only add supported warning flags to CMAKE_CXX_FLAGS build: cmake: use different names for output of check_cxx_compiler_flag	2023-03-07 14:24:26 +02:00
Kefu Chai	063b3be8a7	api: reference httpd::* symbols like 'httpd::*' it turns out we have `using namespace httpd;` in seastar's `request_parser.rl`, and we should not rely on this statement to expose the symbols in `seastar::httpd` to `seastar` namespace. in this change, * api/*.hh: all httpd symbols are referenced by `httpd::*` instead of being referenced as if they are in `seastar`. * api/*.cc: add `using namespace seastar::httpd`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 18:21:03 +08:00
Kefu Chai	a37610f66a	alternator: using chrono_literals before using it we should not assume that some included header does this for us. we'd have the following compile failure if seastar's src/http/request_parser.rl does not `using namespace httpd;` anymore. ``` /home/kefu/dev/scylladb/alternator/streams.cc:433:55: error: no matching literal operator for call to 'operator""h' with argument of type 'unsigned long long' or 'const char *', and no matching literal operator template static constexpr auto dynamodb_streams_max_window = 24h; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 18:20:36 +08:00
Vlad Zolotarov	ae6724f155	transport: refactor CQL metrics This patch reorganizes and extends CQL related metrics. Before this patch we only had counters for specific CQL requests. However, many times we need to reason about the size of CQL queries: corresponding requests and response sizes. This patch adds corresponding metrics: - Arranges all 3 per-opcode statistics counters in a single struct. - Defines a vector of such structs for each CQL opcode. - Adjusts statistics updates accordingly - the code is much simpler now. - Removes old metrics that were accounting some CQL opcodes. - Adds new per-opcode metrics for requests number, request and response sizes: - New metrics are of a derived kind - rate() should be applied to them. - There are 3 new metrics names: - 'cql_requests_count' - 'cql_request_bytes' - 'cql_response_bytes' - New metrics have a per-opcode label - 'kind'. For example: A number of response bytes for an EXECUTE opcode on shard 0 looks as follows: scylla_transport_cql_response_bytes{kind="EXECUTE",shard="0"} Ref #13061 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20230302154816.299721-1-vladz@scylladb.com>	2023-03-07 12:02:34 +02:00
Kefu Chai	577b1c679c	build: enable more warnings when comparing the disabled warnings specified by `configure.py` and the ones specified by `cmake/mode.common.cmake`, it turns out we are now able to enable more warning options. so let's enable them. the change was tested using Clang-17 and GCC-13. there are many errors from GCC-13, like: ``` /home/kefu/dev/scylladb/db/view/view.hh:114:17: error: declaration of ‘column_kind db::view::clustering_or_static_row::column_kind() const’ changes meaning of ‘column_kind’ [-fpermissive] 114 \| column_kind column_kind() const { \| ^~~~~~~~~~~ ``` so the build with GCC failed. and with this change, Clang-17 is able to build the tree without warnings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 17:54:53 +08:00
Kefu Chai	f0659cb1bb	test: do not initialize plain number with {} this silences warnings like: ``` test/boost/secondary_index_test.cc:1578:5: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init] { -7509452495886106294 }, ^~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 17:54:53 +08:00
Kefu Chai	7331edbc7a	test: do not initialize a time_t with braces time_t is defined as an "Arithmetic type capable of representing times". so we can just initialize it with 0 without braces. this change should silence warnings like: ``` test/boost/aggregate_fcts_test.cc:238:45: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init] auto tp = db_clock::from_time_t({ 0 }) + std::chrono::milliseconds(1); ^~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 17:54:53 +08:00
Pavel Emelyanov	a0718d2097	test: Don't populate / with sstables The sstable_compaction_test::simple_backlog_controller_test makes sstables with empty dir argument. Eventually this means that sstables end up in the / directory [1], which is not nice. As a side effect this also makes sstable::storage::prefix() return an empty string which, in turn, confuses the code that tries to analyze the prefix contents (refs: #13090) [1] See, e.g. logs from https://jenkins.scylladb.com/job/releng/job/Scylla-CI/4757/consoleText ``` INFO 2023-03-06 21:23:04,536 [shard 0] compaction - [Compact ks.cf 51489760-bc54-11ed-a08c-7d3f1d77e2e4] Compacting [/la-1-big-Data.db:level=0:origin=] ``` Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13094	2023-03-07 11:44:33 +02:00
Kefu Chai	4da82b4117	data_dictionary: mark dtor of user_types_storage `virtual` we have another solution, to mark db_user_types_storage `final`. as we don't destruct `db_user_types_storage` with a pointer to any of its base classes. but it'd be much simpler to just mark the dtor virtual of the first base class which has virtual method(s). it's much more idiomatic this way, and less error-prone. this change should silence the following warning: ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/stl_construct.h:88:2: error: destructor called on non-final 'replica::db_user_types_storage' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] __location->~_Tp(); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/stl_construct.h:149:12: note: in instantiation of function template specialization 'std::destroy_at<replica::db_user_types_storage>' requested here std::destroy_at(__pointer); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/alloc_traits.h:674:9: note: in instantiation of function template specialization 'std::_Destroy<replica::db_user_types_storage>' requested here { std::_Destroy(__p); } ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/shared_ptr_base.h:613:28: note: in instantiation of function template specialization 'std::allocator_traits<std::allocator<void>>::destroy<replica::db_user_types_storage>' requested here allocator_traits<_Alloc>::destroy(_M_impl._M_alloc(), _M_ptr()); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/shared_ptr_base.h:599:2: note: in instantiation of member function 'std::_Sp_counted_ptr_inplace<replica::db_user_types_storage, std::allocator<void>, __gnu_cxx::_S_atomic>::_M_dispose' requested here _Sp_counted_ptr_inplace(_Alloc __a, _Args&&...
__args) ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/shared_ptr_base.h:972:6: note: in instantiation of function template specialization 'std::_Sp_counted_ptr_inplace<replica::db_user_types_storage, std::allocator<void>, __gnu_cxx::_S_atomic>::_Sp_counted_ptr_inplace<replica::database &>' requested here _Sp_cp_type(__a._M_a, std::forward<_Args>(__args)...); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/shared_ptr_base.h:1712:14: note: in instantiation of function template specialization 'std::__shared_count<>::__shared_count<replica::db_user_types_storage, std::allocator<void>, replica::database &>' requested here : _M_ptr(), _M_refcount(_M_ptr, __tag, std::forward<_Args>(__args)...) ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/shared_ptr.h:464:4: note: in instantiation of function template specialization 'std::__shared_ptr<replica::db_user_types_storage>::__shared_ptr<std::allocator<void>, replica::database &>' requested here : __shared_ptr<_Tp>(__tag, std::forward<_Args>(__args)...) ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/shared_ptr.h:1009:14: note: in instantiation of function template specialization 'std::shared_ptr<replica::db_user_types_storage>::shared_ptr<std::allocator<void>, replica::database &>' requested here return shared_ptr<_Tp>(_Sp_alloc_shared_tag<_Alloc>{__a}, ^ /home/kefu/dev/scylladb/replica/database.cc:313:24: note: in instantiation of function template specialization 'std::make_shared<replica::db_user_types_storage, replica::database &>' requested here , _user_types(std::make_shared<db_user_types_storage>(*this)) ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13062	2023-03-07 10:36:03 +02:00
Wojciech Mitros	d4851ccae7	treewide: rename the "xwasm" UDF language to "wasm" When the WASM UDFs were first introduced, the LANGUAGE required in the CQL statements to use them was "xwasm", because the ABI for the UDFs was still not specified and changes to it could be backwards incompatible. Now, the ABI is stabilized, but if backwards incompatible changes are made in the future, we will add a new ABI version for them, so the name "xwasm" is no longer needed and we can finally change it to "wasm". Closes #13089	2023-03-07 10:21:11 +02:00
Botond Dénes	d1619eb38a	Merge 'Remove qctx from helpers that retrieve truncation record' from Pavel Emelyanov There are two places that do it -- commitlog and batchlog replayers. Both can have local system-keyspace reference and use system-keyspace local query-processor for it. The peering save_truncation_record() is not that simple and is not patched by this PR Closes #13087 * github.com:scylladb/scylladb: system_keyspace: Unstatic get_truncation_record() system_keyspace: Unstatic get_truncated_at() batchlog_manager: Add system_keyspace dependency main: Swap batchlog manager and system keyspace starts system_keyspace: Unstatic get_truncated_position() system_keyspace: Remove unused method commitlog: Create commitlog_replayer with system keyspace test: Make cql_test_env::get_system_keyspace() return sharded commitlog: Line-up field definitions	2023-03-07 10:19:55 +02:00
Nadav Har'El	e7f9e57d64	docs/alternator: link to issue about too many stream shards docs/alternator/compatibility.md mentions a known problem that Alternator Streams are divided into too many "shards". This patch adds a link to a github issue to track our work on this issue - like we did for most other differences mentioned in compatibility.md. Refs #13080 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13081	2023-03-07 10:04:13 +02:00
Kefu Chai	b25a6d5a9c	build: cmake: limit the number of link jobs this mirrors the settings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 15:34:12 +08:00
Kefu Chai	5e38845057	build: cmake: only add supported warning flags to CMAKE_CXX_FLAGS Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 15:24:02 +08:00
Kefu Chai	2b23de31ca	build: cmake: use different names for output of check_cxx_compiler_flag * use the value of disabled_warnings, not the variable name for warning options, otherwise we'd be checking options like `-Wno-disabled_warnings`. * use different names for the output of check_cxx_compiler_flag() calls. as the output variable of check_cxx_compiler_flag(..) call is cached, we cannot reuse it for checking different warning options. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 15:24:02 +08:00
Kefu Chai	5522080f80	api: s/request/http::request/ seastar::httpd::request was deprecated in favor of `seastar::http::request` since bdd5d929891d2cb821eca25896e25ed4ff658b7a. so let's use the latter. this change also silences the warning of: ``` /home/kefu/dev/scylladb/api/authorization_cache.cc: In function ‘void api::set_authorization_cache(http_context&, seastar::httpd::routes&, seastar::sharded<auth::service>&)’: /home/kefu/dev/scylladb/api/authorization_cache.cc:19:104: error: ‘using seastar::httpd::request = struct seastar::http::request’ is deprecated: Use http::request instead [-Werror=deprecated-declarations] 19 \| httpd::authorization_cache_json::authorization_cache_reset.set(r, [&auth_service] (std::unique_ptr<request> req) -> future<json::json_return_type> { \| ^~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-07 14:03:42 +08:00
Botond Dénes	2f4a793457	reader_concurrency_semaphore::clear_inactive_reads(): defer evicting to evict() Instead of open-coding the same, in an incomplete way. clear_inactive_reads() does incomplete eviction in several ways: * it doesn't decrement _stats.inactive_reads * it doesn't set the permit to evicted state * it doesn't cancel the ttl timer (if any) * it doesn't call the eviction notifier on the permit (if there is one) The list goes on. We already have an evict() method that does all this correctly, use that instead of the current badly open-coded alternative. This patch also enhances the existing test for clear_inactive_reads() and adds a new one specifically for `stop()` being called while having inactive reads. Fixes: #13048 Closes #13049	2023-03-07 08:45:04 +03:00
Kefu Chai	cee597560a	build: enable `-Wdefaulted-function-deleted` warning in general, the more static analysis the merrier. with the updated Seastar, which includes the commit of "core/sstring: define <=> operator for sstring", all defaulted '<=> operator' which previously rely on sstring's operator<=> will not be deleted anymore, so we can enable `-Wdefaulted-function-deleted` now. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12861	2023-03-06 18:41:44 +02:00
Kefu Chai	020483aa59	Update seastar submodule and main this change also includes change to main, to make this commit compile. see below: * seastar 9b6e181e42...9cbc1fe889 (46): > Merge 'Make io-tester jobs share sched classes' from Pavel Emelyanov > io_tester.md: Update the `rps` configuration option description > io_tester: Add option to limit total number of requests sent > Merge 'Keep outgoing queue all cancellable while negotiating (again)' from Pavel Emelyanov > io_tester: Add option to share classes between jobs > rpc: Abort connection if send_entry() fails > Merge 'build: build dpdk with `-fPIC` if BUILD_SHARED_LIBS' from Kefu Chai > build: cooking.sh: use the same BUILD_SHARED_LIBS when building ingredients > build: cooking.sh: use the same generator when building ingredients > core/memory: handle `strerror_r` returning static string > Merge 'build, rpc: lz4 related cleanups' from Kefu Chai > build, rpc: do not support lz4 < 1.7.3 > build: set the correct version when finding lz4 > build: include CheckSymbolExists > rpc: do not include lz4.h in header > build: set CMP0135 for Cooking.cmake > docs: drop building-.md > Merge 'seastar-addr2line: cleanups' from Kefu Chai > seastar-addr2line: refactor tests using unittest > seastar-addr2line: extract do_test() and main() > seastar-addr2line: do not import unused modules > scheduling: add a `rename` callback to scheduling_group_key_config > reactor: syscall thread: wakeup up reactor with finer granularity > build: build dpdk with `-fPIC` if BUILD_SHARED_LIBS > build: extract dpdk_extra_cflags out > core/sstring: remove a temporary variable > Merge 'treewide: include what we use, and add a checkheaders target' from Kefu Chai > perftune.py: auto-select the same number of IRQ cores on each NUMA > prometheus: remove unused headers > core/sstring: define <=> operator for sstring > Merge 'core: s/reserve_additional_memory/reserve_additional_memory_per_shard/' from Kefu Chai > include: do not include <concepts> directly 
> coding_style: note on self-contained header requirement > circileci: build checkheaders in addition to default target > build: add checkheaders target > net/toeplitz: s/u_int/unsigned/ > net/tcp-stack: add forward declaration for seastar::socket > core, net, util: include used headers main: set reserved memory for wasm on per-shard basis this change is a follow-up of `f05d612da8` and `4a0134a097`. this change depends on the related change in Seastar to reserve additional memory on a per-shard basis. per Wojciech Mitros's comment: > it should have probably been 50MB per shard in other words, as we always execute the same set of udf on all shards. and since one cannot predict the number of shards, but she could have a rough estimation on the size of memory a regular (set of) udf could use. so a per-shard setting makes more sense. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-06 18:41:34 +02:00
Jan Ciolek	aa604bd935	cql3: preserve binary_operator.order in search_and_replace There was a bug in `expr::search_and_replace`. It doesn't preserve the `order` field of binary_operator. `order` field is used to mark relations created using the SCYLLA_CLUSTERING_BOUND. It is a CQL feature used for internal queries inside Scylla. It means that we should handle the restriction as a raw clustering bound, not as an expression in the CQL language. Losing the SCYLLA_CLUSTERING_BOUND marker could cause issues, the database could end up selecting the wrong clustering ranges. Fixes: #13055 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #13056	2023-03-06 16:28:06 +02:00
Kefu Chai	6b249dd301	utils: UUID: throw marshal_exception when fail to parse uuid * throw marshal_exception if not the whole string is parsed, we should error out if the parsed string contains garbage at the end. before this change, we silently accept uuid like "ce84997b-6ea2-4468-9f02-8a65abf4wxyz", and parse it as "ce84997b-6ea2-4468-9f02-8a65abf4". this is not correct. * throw marshal_exception if stoull() throws, `stoull()` throws if it fails to parse a string to an unsigned long long, we should translate the exception to `marshal_exception`, so we can handle these exceptions in a consistent manner. test is updated accordingly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13069	2023-03-06 12:59:41 +02:00
Pavel Emelyanov	1be9b0df50	system_keyspace: Unstatic get_truncation_record() Now when both callers of this method are non-static, it can be made non-static too. While at it make two more changes: 1. move the thing to private 2. remove explicit cql3::query_processor::cache_internal::yes argument, the system_keyspace::execute_cql() applies it on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:40 +03:00
Pavel Emelyanov	109e032f61	system_keyspace: Unstatic get_truncated_at() It's called from batchlog replayer which now has local system keyspace reference and can use it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:40 +03:00
Pavel Emelyanov	1907518034	batchlog_manager: Add system_keyspace dependency The manager will need system ks to get truncation record from, so add it explicitly. Start-stop sequence now allows that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:40 +03:00
Pavel Emelyanov	40b762b841	main: Swap batchlog manager and system keyspace starts The former needs the latter to get truncation records from and will thus need it as explicit dependency. In order to have it batchlog needs to start after system ks. This works as starting batchlog manager doesn't do anything that's required by system keyspace. This is indirectly proven by cql-test-env in which batchlog manager starts later than it does in main Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:40 +03:00
Pavel Emelyanov	dcbe3e467b	system_keyspace: Unstatic get_truncated_position() It's called from commitlog replayer which has system keyspace instance on board and can use it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:40 +03:00
Pavel Emelyanov	2501ba3887	system_keyspace: Remove unused method The get_truncated_position() overload that filters records by shard is nowadays unused. Drop one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:40 +03:00
Pavel Emelyanov	47b61389b5	commitlog: Create commitlog_replayer with system keyspace The replayer code needs system keyspace to fetch truncation records from, thus it needs this explicit dependency. By the time it runs system keyspace is fully initialized already Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:28:36 +03:00
Kefu Chai	ac575d0b0e	auth: use zero initialization instead of passing '0' in the initializer list to do aggregate initialization, just use zero initialization. simpler this way. also, this helps to silence a `-Wmissing-braces` warning, like ``` /home/kefu/dev/scylladb/auth/passwords.cc:21:43: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] static thread_local crypt_data tlcrypt = {0, }; ^ {} ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13060	2023-03-06 12:28:10 +02:00
Kefu Chai	36da27f2e0	sstables: generation_type: do not specialize to_sstring because `seastar::to_sstring()` defaults to `fmt::format_to()`. so any type which is supported by `fmt::formatter()` is also supported by `seastar::to_sstring()`. and the behavior of existing implementation is exactly the same as the defaulted one. so let's drop the specialization and let `fmt::formatter<sstables::generation_type>` do its job. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13070	2023-03-06 12:18:00 +02:00
Pavel Emelyanov	6f9924ff44	test: Make cql_test_env::get_system_keyspace() return sharded It now returns sys_ks.local(), but next patch would need the whole sharded reference Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:17:21 +03:00
Pavel Emelyanov	73ab1bd74b	commitlog: Line-up field definitions Just a cosmetic change, so that next patch adding a new member to the class looks nice Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-06 13:15:27 +03:00
Alejo Sanchez	eaed778f4a	test/cql-pytest: print driver version Print driver version for cql-pytest tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12840	2023-03-06 11:31:26 +02:00
Botond Dénes	4919b2f956	Merge 'cmake: sync with `configure.py` (9/n)' from Kefu Chai - build: cmake: find ANTLR3 before using it - build: cmake: define FMT_DEPRECATED_OSTREAM - build: cmake: add include directory for lua - build: cmake: link redis against db Closes #13071 * github.com:scylladb/scylladb: build: cmake: add more tests build: cmake: find and link against RapidJSON build: cmake: link couple libraries as whole archive build: cmake: find ANTLR3 before using it build: cmake: define FMT_DEPRECATED_OSTREAM build: cmake: add include directory for lua build: cmake: link redis against db	2023-03-06 08:52:13 +02:00
Avi Kivity	97f315cc29	Merge 'build: reenable disabled warnings' from Kefu Chai in general, the more static analysis the merrier. these warnings were previously added to silence warnings from Clang and/or GCC, but since we've addressed all of them, let's reenable them to detect potential issues early. Closes #13063 * github.com:scylladb/scylladb: build: reenable disabled warnings test: lib: do not return a local reference dht: incremental_owned_ranges_checker: use lower_bound() types: reimplement in terms of a variable template query_id: extract into new header test/cql-pytest: test for CLUSTERING ORDER BY verification in MV test/cql-pytest: allow "run-cassandra" without building Scylla build: reenable unused-{variable,lambda-capture} warnings test: reader_concurrency_semaphore_test: define target_memory in debug mode flat_mutation_reader_test: cleanup, seastar::async -> SEASTAR_THREAD_TEST_CASE make_nonforwardable: test through run_mutation_source_tests make_nonforwardable: next_partition and fast_forward_to when single_partition is true make_forwardable: fix next_partition flat_mutation_reader_v2: drop forward_buffer_to nonforwardable reader: fix indentation nonforwardable reader: refactor, extract reset_partition nonforwardable reader: add more tests nonforwardable reader: no partition_end after fast_forward_to() nonforwardable reader: no partition_end after next_partition() nonforwardable reader: no partition_end for empty reader api::failure_detector: mark set_phi_convict_threshold unimplemented test: memtable_test: mark dummy variable for loop [[maybe_unused]] idl-compiler: mark captured this used raft: reference this explicitly util/result_try: reference this explicitly sstables/sstables: mark dummy variable for loop [[maybe_unused]] treewide: do not define/capture unused variables service: storage_service: clear _node_ops in batch cql-pytest: add tests for sum() aggregate build: cmake: extract mutation,db,replica,streaming out build: cmake: link the whole 
auth build: cmake: extract thrift out build: cmake: expose scylla_gen_build_dir from "interface" build: cmake: find libxcrypt before using it build: cmake: find Thrift before using it build: cmake: support thrift < 0.11.0 test/cql-pytest: move aggregation tests to one file Revert "Revert "storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops"" storage_service: Wait for normal state handler to finish in replace storage_service: Wait for normal state handler to finish in bootstrap row_cache: pass partition_start though nonforwardable reader doc: fix the version in the comment on removing the note doc: specify the versions where Alternator TTL is no longer experimental	2023-03-05 17:37:33 +02:00
Kefu Chai	6742493a94	build: reenable disabled warnings in general, the more static analysis the merrier. these warnings were previously disabled to silence warnings from Clang and/or GCC, but since we've addressed all of them, let's reenable them to detect potential issues early. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-05 17:37:33 +02:00
Kefu Chai	fe80b5e0d0	test: lib: do not return a local reference the type of return value of `get_table_views()` is a reference, so we cannot return a reference to a temporary value. in this change, a member variable is added to hold the _table_schema, so it can outlive the function call. this should silence following warning from Clang: ``` test/lib/expr_test_utils.cc:543:16: error: returning reference to local temporary object [-Werror,-Wreturn-stack-address] return {view_ptr(_table_schema)}; ^~~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-05 17:37:33 +02:00
Kefu Chai	11124ee972	build: cmake: add more tests Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	eeb8553305	build: cmake: find and link against RapidJSON despite that RapidJSON is a header-only library, we still need to find it and "link" against it for adding the include directory. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	c5d1a69859	build: cmake: link couple libraries as whole archive turns out we are using static variables to register entries in global registries, and these variables are not directly referenced, so linker just drops them when linking the executables or shared libraries. to address this problem, we just link the whole archive. another option would be create a linker script or pass --undefined=<symbol> to linker. neither of them is straightforward. a helper function is introduced to do this, as we cannot use CMake 3.24 as yet. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	58f13dfa0a	build: cmake: find ANTLR3 before using it if ANTLR3's header files are not installed into the /usr/include, or other directories searched by compiler by default. there are chances, we cannot build the tree. so we have to find it first. as /opt/scylladb is the directory where `scylla-antlr35-c++-dev` is installed on debian derivatives, this directory is added so the find package module can find the header files. ``` In file included from /home/kefu/dev/scylla/db/legacy_schema_migrator.cc:38: In file included from /home/kefu/dev/scylla/cql3/util.hh:21: /home/kefu/dev/scylla/build/cmake/cql3/CqlParser.hpp:55:10: fatal error: 'antlr3.hpp' file not found ^~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	914ba1329d	build: cmake: define FMT_DEPRECATED_OSTREAM otherwise the tree would fail to compile with fmt v9. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	b6a927ce3f	build: cmake: add include directory for lua otherwise there are chances the compiler cannot find the lua header(s). Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Kefu Chai	e72321f873	build: cmake: link redis against db otherwise, we'd have ``` In file included from /home/kefu/dev/scylla/redis/keyspace_utils.cc:19: In file included from /home/kefu/dev/scylla/db/query_context.hh:14: In file included from /home/kefu/dev/scylla/cql3/query_processor.hh:24: In file included from /home/kefu/dev/scylla/lang/wasm_instance_cache.hh:19: /home/kefu/dev/scylla/lang/wasm.hh:14:10: fatal error: 'rust/wasmtime_bindings.hh' file not found ^~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-04 13:11:25 +08:00
Anna Stuchlik	4b71f87594	doc: Update the documentation landing page This commit makes the following changes to the docs landing page: - Adds the ScyllaDB enterprise docs as one of three tiles. - Modifies the three tiles to reflect the three flavors of ScyllaDB. - Moves the "New to ScyllaDB? Start here!" under the page title. - Renames "Our Products" to "Other Products" to list the products other than ScyllaDB itself. In addition, the boxes are enlarged to large-4 to look better. The major purpose of this commit is to expose the ScyllaDB documentation. docs: fix the link Closes #13065	2023-03-03 15:48:30 +02:00
Botond Dénes	fb898d214c	Merge 'Shard major compaction task' from Aleksandra Martyniuk Implementation of task_manager's task that covers major keyspace compaction on one shard. Closes #12662 * github.com:scylladb/scylladb: test: extend major keyspace compaction tasks test compaction: create task manager's task for major keyspace compaction on one shard	2023-03-02 15:06:31 +02:00
Botond Dénes	91d64372db	Merge 'cmake: sync with `configure.py` (8/n)' from Kefu Chai - build: cmake: extract more subsystem out into its own CMakeLists.txt - build: cmake: remove swagger_gen_files - build: cmake: remove stale TODO comments - build: cmake: expose scylla_gen_build_dir - build: cmake: link against cryptopp - build: cmake: add missing source to utils - build: cmake: move lib sources into test-lib - build: cmake: add test/perf Closes #13059 * github.com:scylladb/scylladb: build: cmake: add expr_test test build: cmake: allow test to specify the sources build: cmake: add test/perf build: cmake: move lib sources into test-lib build: cmake: add missing source to utils build: cmake: link against cryptopp build: cmake: expose scylla_gen_build_dir build: cmake: remove stale TODO comments build: cmake: remove swagger_gen_files build: cmake: extract more subsystem out into its own CMakeLists.txt	2023-03-02 14:22:35 +02:00
Botond Dénes	e70be47276	Merge 'commitlog: Fix updating of total_size_on_disk on segment alloc when o_dsync is off' from Calle Wilund Fixes #12810 We did not update total_size_on_disk in commitlog totals when use o_dsync was off. This means we essentially ran with no registered footprint, also causing broken comparisons in delete_segments. Closes #12950 * github.com:scylladb/scylladb: commitlog: Fix updating of total_size_on_disk on segment alloc when o_dsync is off commitlog: change type of stored size	2023-03-02 12:39:11 +02:00
Botond Dénes	1b5f8916d6	Merge 'Generalize sstable::move_to_new_dir() method' from Pavel Emelyanov This method requires callers to remember that the sstable is the collection of files on a filesystem and to know what exact directory they are all in. That's not going to work for object storage, instead, sstable should be moved between more abstract states. This PR replaces move_to_new_dir() call with the change_state() one that accepts target sub-directory string and moves files around. Currently supported state changes: * staging -> normal * upload -> normal \| staging * any -> quarantine All are pretty straightforward and move files between table basedir subdirectories with the exception that upload -> quarantine should move into upload/quarantine subdirectory. Another thing to keep in mind, that normal state doesn't have its subdir but maps directory to table's base directory. Closes #12648 * github.com:scylladb/scylladb: sstable: Remove explicit quarantization call test: Move move_to_new_dir() method from sstable class sstable, dist.-loader: Introduce and use pick_up_from_upload() method sstables, code: Introduce and use change_state() call distributed_loader: Let make_sstables_available choose target directory	2023-03-02 09:22:14 +02:00
Kefu Chai	1fe180ffbe	build: cmake: add expr_test test Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 14:26:55 +08:00
Kefu Chai	29dc4b0da5	build: cmake: allow test to specify the sources some tests are compiled from more source files, so add an extra parameter, so they can customize the sources. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 14:26:55 +08:00
Kefu Chai	78773c2ebd	build: cmake: add test/perf due to circular dependency: the .cc files under the root of project references the symbols defined by the source files under subdirectories, but the source files under subdirectories also reference the symbols defined by the .cc files under the root of project, the targets in test/perf do not compile. but the general structure is created. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	a51c928e69	build: cmake: move lib sources into test-lib less convoluted this way, so each target only includes the sources in its own directory. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	40fb6ff728	build: cmake: add missing source to utils Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	074281c450	build: cmake: link against cryptopp since we include cryptopp/ headers, we need to find it and link against it explicitly, instead of relying on seastar to do this. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	167d018ca7	build: cmake: expose scylla_gen_build_dir should have exposed the base directory of generated headers, not the one with "rust" component. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	47a06e76a2	build: cmake: remove stale TODO comments they have been addressed already. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	1e040e0e12	build: cmake: remove swagger_gen_files which has been moved into api/CMakeLists.txt Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Kefu Chai	563fbb2d11	build: cmake: extract more subsystem out into its own CMakeLists.txt namely, cdc, compaction, dht, gms, lang, locator, mutation_writer, raft, readers, replica, service, tools, tracing and transport. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-02 10:15:25 +08:00
Aleksandra Martyniuk	24edcd27d4	test: extend major keyspace compaction tasks test	2023-03-01 18:56:31 +01:00
Aleksandra Martyniuk	b188060535	compaction: create task manager's task for major keyspace compaction on one shard Implementation of task_manager's task that covers major keyspace compaction on one shard.	2023-03-01 18:56:26 +01:00
Tomasz Grabiec	2d935e255a	storage_service: node ops: Extract node_ops_insert() to reduce code duplication	2023-03-01 18:43:13 +01:00
Tomasz Grabiec	d5021d5a1b	storage_service: Make node operations safer by detecting asymmetric abort This patch fixes a problem which affects decommission and removenode which may lead to data consistency problems under conditions which lead one of the nodes to unilaterally decide to abort the node operation without the coordinator noticing. If this happens during streaming, the node operation coordinator would proceed to make a change in the gossiper, and only later detect that one of the nodes aborted during sending of decommission_done or removenode_done command. That's too late, because the operation will be finalized by all the nodes once gossip propagates. It's unsafe to finalize the operation while another node aborted. The other node reverted to the old topology, with which they were running for some time, without considering the pending replica when handling requests. As a result, we may end up with consistency issues. Writes made by those coordinators may not be replicated to CL replicas in the new topology. Streaming may have failed to replicate those writes depending on timing. It's possible that some node aborts but streaming succeeds if the abort is not due to network problems, or if the network problems are transient and/or localized and affect only heartbeats. There is no way to revert after we commit the node operation to the gossiper, so it's ok to close node_ops sessions before making the change to the gossiper, and thus detect aborts and prevent later aborts after the change in the gossiper is made. This is already done during bootstrap (RBNO enabled) and replacenode. This patch changes removenode to also take this approach by moving sending of remove_done earlier. We cannot take this approach with decommission easily, because decommission_done command includes a wait for the node to leave the ring, which won't happen before the change to the gossiper is made. Separating this from decommission_done would require protocol changes.
This patch adds a second-best solution, which is to check if sessions are still there right before making a change to the gossiper, leaving decommission_done where it was. The race can still happen, but the time window is now much smaller. Fixes #12989 Refs #12969	2023-03-01 18:43:13 +01:00
Kefu Chai	d85af3dca4	dht: incremental_owned_ranges_checker: use lower_bound() instead of using a while loop for finding the lower_bound, just use std::lower_bound() for finding if current node owns given token. this has several advantages: * better readability: as lower_bound is exactly what this loop calculates. * lower_bound uses binary search for searching the element, this algorithm should be faster than linear under most circumstances. * lower_bound uses std::advance() and prefix increment operator, this should be more performant than the postfix increment operator. as it does not create a temporary instance of iterator. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13008	2023-03-01 11:29:46 +02:00
Avi Kivity	3042deb930	types: reimplement in terms of a variable template data_type_for() is a function template that converts a C++ type to a database dynamic type (data_type object). Instead of implementing a function per type, implement a variable template instance. This is shorter and nicer. Since the original type variables (e.g. long_type) are defined separately, use a reference instead of copying to avoid initialization order problems. To catch misuses of data_type_for the general data_type_for_v variable template maps to some unused tag type which will cause a build error when instantiated. The original motivation for this was to allow for partial specialization of data_type_for() for tuple types, but this isn't really workable since the native type for tuples is std::vector<data_value>, not std::tuple, and I only checked this after getting the work done, so this isn't helping anything; it's just a little nicer. Closes #13043	2023-03-01 11:25:39 +02:00
Botond Dénes	d5dee43be7	Merge 'doc: specify the versions where Alternator TTL is no longer experimental' from Anna Stuchlik This PR adds a note to the Alternator TTL section to specify in which Open Source and Enterprise versions the feature was promoted from experimental to non-experimental. The challenge here is that OSS and Enterprise are (still) documented together, but they're not in sync in promoting the TTL feature: it's still experimental in 5.1 (released) but no longer experimental in 2022.2 (to be released soon). We can take one of the following approaches: a) Merge this PR with master and ask the 2022.2 users to refer to master. b) Merge this PR with master and then backport to branch-5.1. If we choose this approach, it is necessary to backport https://github.com/scylladb/scylladb/pull/11997 beforehand to avoid conflicts. I'd opt for a) because it makes more sense from the OSS perspective and helps us avoid mess and backporting. Closes #12295 * github.com:scylladb/scylladb: doc: fix the version in the comment on removing the note doc: specify the versions where Alternator TTL is no longer experimental	2023-03-01 11:24:52 +02:00
Botond Dénes	92fde47261	Merge 'test/cql-pytest - aggregation tests' from Nadav Har'El This small series reorganizes the existing functional tests for aggregation (min, max, count) and adds additional tests for sum reproducing the strange (but Cassandra-compatible) behavior described in issue #13027. Closes #13038 * github.com:scylladb/scylladb: cql-pytest: add tests for sum() aggregate test/cql-pytest: move aggregation tests to one file	2023-03-01 11:02:08 +02:00
Avi Kivity	6822e3b88a	query_id: extract into new header query_id currently lives in query-request.hh, a busy place with lots of dependencies. In turn it gets pulled by uuid.idl.hh, which is also very central. This makes test/raft/randomized_nemesis_test.cc which is nominally only dependent on Raft rebuild on random header file changes. Fix by extracting into a new header. Closes #13042	2023-03-01 10:25:25 +02:00
Botond Dénes	46efdfa1a1	Merge 'readers/nonforwarding: don't emit partition_end on next_partition,fast_forward_to' from Gusev Petr The series fixes the `make_nonforwardable` reader, it shouldn't emit `partition_end` for previous partition after `next_partition()` and `fast_forward_to()` Fixes: #12249 Closes #12978 * github.com:scylladb/scylladb: flat_mutation_reader_test: cleanup, seastar::async -> SEASTAR_THREAD_TEST_CASE make_nonforwardable: test through run_mutation_source_tests make_nonforwardable: next_partition and fast_forward_to when single_partition is true make_forwardable: fix next_partition flat_mutation_reader_v2: drop forward_buffer_to nonforwardable reader: fix indentation nonforwardable reader: refactor, extract reset_partition nonforwardable reader: add more tests nonforwardable reader: no partition_end after fast_forward_to() nonforwardable reader: no partition_end after next_partition() nonforwardable reader: no partition_end for empty reader row_cache: pass partition_start though nonforwardable reader	2023-03-01 09:58:14 +02:00
Botond Dénes	1c0b47ee9b	Merge 'treewide: remove unused variable and reference used one explicitly' from Kefu Chai - treewide: do not define/capture unused variables - sstables/sstables: mark dummy variable for loop [[maybe_unused]] - util/result_try: reference this explicitly - raft: reference this explicitly - idl-compiler: mark captured this used - build: reenable unused-{variable,lambda-capture} warnings Closes #12915 * github.com:scylladb/scylladb: build: reenable unused-{variable,lambda-capture} warnings test: reader_concurrency_semaphore_test: define target_memory in debug mode api::failure_detector: mark set_phi_convict_threshold unimplemented test: memtable_test: mark dummy variable for loop [[maybe_unused]] idl-compiler: mark captured this used raft: reference this explicitly util/result_try: reference this explicitly sstables/sstables: mark dummy variable for loop [[maybe_unused]] treewide: do not define/capture unused variables service: storage_service: clear _node_ops in batch	2023-03-01 09:44:37 +02:00
Nadav Har'El	363f326d49	test/cql-pytest: test for CLUSTERING ORDER BY verification in MV Since commit `73e258fc34`, Scylla has partial verification for the CLUSTERING ORDER BY clause in CREATE MATERIALIZED VIEW. Specifically, invalid column names are rejected. But for reasons explained in issue #12936 and in the test in this patch, Cassandra demands that if CLUSTERING ORDER BY appears it must list all the clustering columns, with no duplicates, and do so in the right order. This patch replaces an existing test which suggested it is fine (an extension over Cassandra) to accept a partial list of clustering columns, by a test that verifies that such a partial list, or an incorrectly-ordered list, or list with duplicates, should be rejected. The new test fails on Scylla, and passes on Cassandra, so marked as xfail. Refs #12936. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12938	2023-03-01 08:02:39 +02:00
Botond Dénes	84e26ed9c3	Merge 'Enable RBNO by default' from Asias He This pr fixes the seastar::rpc::closed_error error in the test_topology suite and enables RBNO by default. Closes #12970 * github.com:scylladb/scylladb: Revert "Revert "storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops"" storage_service: Wait for normal state handler to finish in replace storage_service: Wait for normal state handler to finish in bootstrap	2023-03-01 07:55:46 +02:00
Nadav Har'El	7dc54771e1	test/cql-pytest: allow "run-cassandra" without building Scylla Before this patch, all scripts which use test/cql-pytest/run.py looked for the Scylla executable as their first step. This is usually the right thing to do, except in two cases where Scylla is not needed: 1. The script test/cql-pytest/run-cassandra. 2. The script test/alternator/run with the "--aws" option. So in this patch we change run.py to only look for Scylla when actually needed (the find_scylla() function is called). In both cases mentioned above, find_scylla() will never get called and the script can work even if Scylla was never built. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13010	2023-03-01 07:54:19 +02:00
Botond Dénes	eb10623dd2	Merge 'build: cmake: sync with `configure.py` (7/n)' from Kefu Chai - build: cmake: support thrift < 0.11.0 - build: cmake: find Thrift before using it - build: cmake: find libxcrypt before using it - build: cmake: expose scylla_gen_build_dir from "interface" - build: cmake: extract thrift out - build: cmake: link the whole auth - build: cmake: extract mutation,db,replica,streaming out Closes #12990 * github.com:scylladb/scylladb: build: cmake: extract mutation,db,replica,streaming out build: cmake: link the whole auth build: cmake: extract thrift out build: cmake: expose scylla_gen_build_dir from "interface" build: cmake: find libxcrypt before using it build: cmake: find Thrift before using it build: cmake: support thrift < 0.11.0	2023-03-01 07:35:21 +02:00
Kefu Chai	f59542a01a	build: reenable unused-{variable,lambda-capture} warnings now that all -Wunused-{variable,lambda-capture} warnings are taken care of. let's reenable these warnings so they can help us to identify potential issues. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-01 10:45:18 +08:00
Kefu Chai	efe96e7fc6	test: reader_concurrency_semaphore_test: define target_memory in debug mode otherwise we'd have following warning ``` test/boost/reader_concurrency_semaphore_test.cc:1380:20: error: unused variable 'target_memory' [-Werror,-Wunused-const-variable] constexpr uint64_t target_memory = uint64_t(1) << 28; // 256MB ^ 1 error generated.` ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-03-01 10:45:18 +08:00
Kefu Chai	ffffcdb48a	cql3: mark cf_name final as `cf_name` is not derived from any class, it's viable to mark it `final`. this change is created to to silence the warning from Clang, like: ``` /home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_LOCALE -DFMT_SHARED -DHAVE_LZ4_COMPRESS_DEFAULT -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=6 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/cmake/gen -I/home/kefu/dev/scylladb/build/cmake -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -Wall -Werror -Wno-mismatched-tags -Wno-missing-braces -Wno-c++11-narrowing -O0 -g -gz -std=gnu++20 -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT CMakeFiles/scylla.dir/data_dictionary/data_dictionary.cc.o -MF CMakeFiles/scylla.dir/data_dictionary/data_dictionary.cc.o.d -o CMakeFiles/scylla.dir/data_dictionary/data_dictionary.cc.o -c /home/kefu/dev/scylladb/data_dictionary/data_dictionary.cc In file included from /home/kefu/dev/scylladb/data_dictionary/data_dictionary.cc:9: In file included from /home/kefu/dev/scylladb/data_dictionary/data_dictionary.hh:11: /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/optional:287:2: error: destructor called on non-final 'cql3::cf_name' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] _M_payload._M_value.~_Stored_type(); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/optional:318:4: note: in instantiation of member function 'std::_Optional_payload_base<cql3::cf_name>::_M_destroy' requested 
here _M_destroy(); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/optional:439:57: note: in instantiation of member function 'std::_Optional_payload_base<cql3::cf_name>::_M_reset' requested here _GLIBCXX20_CONSTEXPR ~_Optional_payload() { this->_M_reset(); } ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/optional:514:17: note: in instantiation of member function 'std::_Optional_payload<cql3::cf_name>::~_Optional_payload' requested here constexpr _Optional_base() = default; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/optional:739:17: note: in defaulted default constructor for 'std::_Optional_base<cql3::cf_name>' first required here constexpr optional(nullopt_t) noexcept { } ^ /home/kefu/dev/scylladb/cql3/statements/raw/batch_statement.hh:37:28: note: in instantiation of member function 'std::optional<cql3::cf_name>::optional' requested here : cf_statement(std::nullopt) ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/optional:287:23: note: qualify call to silence this warning _M_payload._M_value.~_Stored_type(); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13039	2023-02-28 22:26:43 +02:00
Petr Gusev	1709a17c38	flat_mutation_reader_test: cleanup, seastar::async -> SEASTAR_THREAD_TEST_CASE	2023-02-28 23:42:44 +04:00
Petr Gusev	992ccb6255	make_nonforwardable: test through run_mutation_source_tests	2023-02-28 23:42:43 +04:00
Petr Gusev	989ef9d358	make_nonforwardable: next_partition and fast_forward_to when single_partition is true This flag designates that we should consume only one partition from the underlying reader. This means that attempts to move to another partition should cause an EOS.	2023-02-28 23:42:34 +04:00
Petr Gusev	a67776b750	make_forwardable: fix next_partition When next_partition is called, the buffer could contain partition_start and possibly static_row. In this case clear_buffer_to_next_partition will not remove anything from the buffer and the reader position should not change. Before this patch, however, we used to set _end_of_stream=false, which violated the forwardable-reader contract - the data of the next partition was emitted after the data of the first partition without intermediate EOS. This bug was found when debugging test_make_nonforwardable_from_mutations_as_mutation_source flakiness. A corresponding focused test_make_forwardable_next_partition has been added to exercise this problem.	2023-02-28 23:11:45 +04:00
Petr Gusev	64427b9164	flat_mutation_reader_v2: drop forward_buffer_to This is just a strange method I came across. It effectively does nothing but clear_buffer().	2023-02-28 23:00:02 +04:00
Petr Gusev	a517e1d6ad	nonforwardable reader: fix indentation	2023-02-28 23:00:02 +04:00
Petr Gusev	beeffb899f	nonforwardable reader: refactor, extract reset_partition No observable behaviour changes, just refactor the code.	2023-02-28 23:00:02 +04:00
Petr Gusev	023ed0ad00	nonforwardable reader: add more tests Add more test cases for completeness.	2023-02-28 23:00:02 +04:00
Petr Gusev	88cd1c3700	nonforwardable reader: no partition_end after fast_forward_to() This patch fixes the problem with method fast_forward_to which is similar to the one with next_partition, no partition_end should be injected for the partition if fast_forward_to was called inside it.	2023-02-28 23:00:02 +04:00
Petr Gusev	8ff96e1bce	nonforwardable reader: no partition_end after next_partition() Before the patch, nonforwardable reader injected partition_end unconditionally. This caused problems in case next_partition() was called, the downstream reader might have already injected its own partition_end marker, and the one from nonforwardable reader was a duplicate. Fixes: #12249	2023-02-28 23:00:02 +04:00
Petr Gusev	9c5c380b0b	nonforwardable reader: no partition_end for empty reader The patch introduces the _partition_is_open flag, inject partition_end only if there was some data in the input reader. A simple unit test has been added for the nonforwardable reader which checks this new behaviour.	2023-02-28 22:59:56 +04:00
Wojciech Mitros	6d2e785b5c	docs: update wasm.md The WASM UDF implementation has changed since the last time the docs were written. In particular, the Rust helper library has been released, and using it should be the recommended method. Some decisions that were only experimental at the start, were also "set in stone", so we should refer to them as such. The docs also contain some code examples. This patch adds tests for these examples to make sure that they are not wrong and misleading. Closes #12941	2023-02-28 20:59:25 +02:00
Kefu Chai	2434a4d345	utils: small_vector: define operator<=> small_vector should be feature-wise compatible with std::vector<>, let's add operator<=> for it. also, there is no need to define operator!=() explicitly, C++20 defines this for us if operator==() is defined, so let's drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13032	2023-02-28 20:04:22 +02:00
Benny Halevy	06a0902708	dht/range_streamer: stream_async: move ranges_to_stream to do_streaming Currently the ranges_to_stream variable lives on the caller state, and do_streaming() moves its contents down to request_ranges/transfer_ranges and then calls clear() to make it ready for reuse. This works in principle but it makes it harder for an occasional reader of this code to figure out what is going on. This change transfers control of the ranges_to_stream vector to do_streaming, by calling it with (std::exchange(do_streaming, {})) and with that, the moved vector doesn't need to be cleared by do_streaming, and the caller is responsible for readying the variable for reuse in its for loop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:38:34 +02:00
Benny Halevy	1392c7e1cf	streaming: stream_session: maybe_yield To prevent reactor stalls when freeing many/long token range vectors. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:32:44 +02:00
Avi Kivity	20e1908c55	Merge 'treewide: use (defaulted) operator<=> when appropriate' from Kefu Chai - db/view: use operator<=> to define comparison operators - utils: UUID: use defaulted operator<=> - db: schema_tables: use defaulted operator<=> - cdc: generation: schema_tables: use defaulted operator<=> - db::commitlog::replay_position: use defaulted operator<=> Closes #13033 * github.com:scylladb/scylladb: db::commitlog::replay_position: use defaulted operator<=> cdc: generation: schema_tables: use defaulted operator<=> db: schema_tables: use defaulted operator<=> utils: UUID: use defaulted operator<=> db/view: use operator<=> to define comparison operators	2023-02-28 17:05:45 +02:00
Benny Halevy	c4836ab9e9	streaming: stream_session: prepare: move token ranges to add_transfer_ranges Reduce copies on the path to calling add_transfer_ranges. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:04:47 +02:00
Benny Halevy	12eb3d210f	streaming: stream_plan: transfer_ranges: move token ranges towards add_transfer_ranges Rather than copying the ranges vector. Note that add_transfer_ranges itself cannot simply move the ranges since it copies them for multiple tables. While at it, move also the keyspace and column_family strings. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 17:03:51 +02:00
Benny Halevy	775c6b9697	dht/range_streamer: stream_async: do_streaming: move ranges downstream The ranges can be moved rather than copied to both `request_ranges` and `transfer_ranges` as they are only cleared after this point. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:56:55 +02:00
Benny Halevy	3cd8838a09	dht/range_streamer: add_ranges: clear_gently ranges_for_keyspace After calling get_range_fetch_map, ranges_for_keyspace is not used anymore. Synchronously destroying it may potentially stall in large clusters so use utils::clear_gently to gently clear the map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:52:30 +02:00
Benny Halevy	a80c2d16dd	dht/range_streamer: get_range_fetch_map: reduce copies Use const& to refer to the input ranges and endpoints rather than copying them individually along the way more than needed to. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:52:30 +02:00
Benny Halevy	9d6e5d50d1	dht/range_streamer: add_ranges: move ranges down-stream Eliminate extraneous copy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:52:27 +02:00
Benny Halevy	c61f058aa5	dht/boot_strapper: move ranges to add_ranges Eliminate extraneous copy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:50:40 +02:00
Benny Halevy	27b382dcce	dht/range_streamer: stream_async: incrementally update _nr_ranges_remaining Rather than calling nr_ranges_to_stream() inside `do_streaming`. As nr_ranges_to_stream depends on the `_to_stream` that will be updated only later on after the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:50:40 +02:00
Benny Halevy	c3c7efffb1	dht/range_streamer: stream_async: erase from range_vec only after do_streaming success range_vec is used for calculating nr_ranges_to_stream. Currently, the ranges_to_stream that were moved out of range_vec are pushed back on exception, but this isn't safe, since they may have moved already to request_ranges or transfer_ranges. Instead, erase the ranges we pass to do_streaming only after it succeeds so on exception, range_vec will not need adjusting. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-28 16:50:40 +02:00
Kefu Chai	7de2d1c714	api::failure_detector: mark set_phi_convict_threshold unimplemented let it throw if "set_phi_convict_threshold" is called, as we never populate the specified \Phi. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:55 +08:00
Kefu Chai	60eac12db6	test: memtable_test: mark dummy variable for loop [[maybe_unused]] without C++23 `std::ranges::repeat_view`, it'd be cumbersome to implement a loop without dummy variable. this change helps to silence following warning: ``` test/boost/memtable_test.cc:1135:26: error: unused variable 'value' [-Werror,-Wunused-variable] for (int value : boost::irange<int>(0, num_flushes)) { ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:55 +08:00
Kefu Chai	2caf9b4e1c	idl-compiler: mark captured this used sometime the captured `this` is used in the generated C++ code, while some time it is not. to reenable `-Wunused-lambda-capture` warning, let's mark this `this` as used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:55 +08:00
Kefu Chai	b926105eae	raft: reference this explicitly Clang complains that the captured `this` is not used, like ``` /home/kefu/dev/scylladb/raft/fsm.hh:644:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture] auto visitor = [this, from, msg = std::move(msg)](const auto& state) mutable { ^ /home/kefu/dev/scylladb/raft/server.cc:738:11: note: in instantiation of function template specialization 'raft::fsm::step<raft::append_request>' requested here _fsm->step(from, std::move(append_request)); ^ ``` but `step(..)` is a non-static member function of `fsm`, so `this` is actually used. to silence Clang's warning, let's just reference it explicitly. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:55 +08:00
Kefu Chai	5e7c8cc4b7	util/result_try: reference this explicitly quote from Avi's comment > It's supposed to be illegal to call handle(...) without this->, > because handle() is a dependent name (but many compilers don't > insist, gcc is stricter here). So two error messages competed, > and "unused this capture" won. without this change, Clang complains that `this` is not used with `-Wunused-lambda-capture`. in this change, `this` is explicitly referenced to silence Clang's warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:55 +08:00
Kefu Chai	1171c326a9	sstables/sstables: mark dummy variable for loop [[maybe_unused]] without C++23 `std::ranges::repeat_view`, it'd be cumbersome to implement a loop without dummy variable ``` /home/kefu/dev/scylladb/sstables/sstables.cc:484:15: error: unused variable '_' [-Werror,-Wunused-variable] for (auto _ : boost::irange<key_type>(0, nr_elements)) { ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:55 +08:00
Kefu Chai	3ae11de204	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:56:53 +08:00
Kefu Chai	be47874a42	service: storage_service: clear _node_ops in batch before this change, _node_ops are cleared one after another in `storage_service::node_ops_abort()` when `ops_uuid` is not specified. but this * is not efficient * is not quite readable * introduces an unused variable so, in this change, we just clear it in batch. this should silence a `-Wunused-variable` warning from Clang. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:52:25 +08:00
Nadav Har'El	130c090251	cql-pytest: add tests for sum() aggregate This patch adds regression tests for the strange (but Cassandra-compatible) behavior described in issue #13027 - that sum of no results returns 0 (not null or nothing), and if also asking for p, we get a null there too. Refs #13027. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-02-28 15:35:21 +02:00
Botond Dénes	6b72f4a6fa	Merge 'main: display descriptions of all tools' from Kefu Chai - main: expose tools as a vector<> - main: use a struct for representing tool - main: track tools description in tool struct - main: add missing descriptions for tools - main: move get_tools() into main() Fixes #13026 Closes #13030 * github.com:scylladb/scylladb: main: move get_tools() into main() main: add missing descriptions for tools main: track tools description in tool struct main: use a struct for representing tool main: expose tools as a vector<>	2023-02-28 15:32:11 +02:00
Kefu Chai	af3968bf6e	build: cmake: extract mutation,db,replica,streaming out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Kefu Chai	6f3a44cde9	build: cmake: link the whole auth without this change, linker would like to remove the .o which is not referenced by other translation units. but we do use static variables to, for instance, register classes to a global registry. so, let's force the linker to include the whole archive. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Kefu Chai	3e75df6917	build: cmake: extract thrift out also, move "interface" linkage from scylla to "thrift", because it is "thrift" who is using "interface". Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Kefu Chai	4bb0134f1d	build: cmake: expose scylla_gen_build_dir from "interface" as it builds headers like "gen/Cassandra.h", and the target uses "interface" via these headers, so "interface" is obliged to expose this include directory. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Kefu Chai	1aafeac023	build: cmake: find libxcrypt before using it we should find libxcrypt library before using it. in this change, Findlibxcrypt.cmake is added to find libxcrypt library. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Kefu Chai	607858db51	build: cmake: find Thrift before using it we should find Thrift library before using it. in this change, FindThrift.cmake is added to find Thrift library. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Kefu Chai	f30e7f9da1	build: cmake: support thrift < 0.11.0 define THRIFT_USES_BOOST if thrift < 0.11.0, see also #4538 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 21:28:46 +08:00
Nadav Har'El	e1f97715eb	test/cql-pytest: move aggregation tests to one file We had separate test files test_minmax.py and test_count.py but the separation was artificial (and test_count.py even had one test using min()). Now that I want to add another test for sum(), I don't know where to put it. So in this patch I combine test_minmax.py and test_count.py into one test file - test_aggregate.py, and we can later add sum() tests in the same file. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-02-28 14:39:04 +02:00
Kefu Chai	67b334385c	dist/redhat: specify version in `Obsoletes:` to silence the warning from rpmbuild, like ``` RPM build warnings: line 202: It's not recommended to have unversioned Obsoletes: Obsoletes: tuned ``` more specific this way. quote from the commit message of `303865d979` for the version number: > tuned 2.11.0-9 and later writes to kernel.sched_wakeup_granularity_ns > and other sysctl tunables that we so laboriously tuned, dropping > performance by a factor of 5 (due to increased latency). Fix by > obsoleting tuned during install (in effect, we are a better tuned, > at least for us). with this change, it'd be easier to identify potential issues when building / packaging. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12721	2023-02-28 13:55:04 +02:00
Marcin Maliszkiewicz	bd7caefccf	docs: link general repairs page to RBNO page Information was duplicated before and the version on this page was outdated - RBNO is enabled for replace operation already. Closes #12984	2023-02-28 13:04:32 +02:00
Tomasz Grabiec	fddd93da4e	storage_service: node ops: Add error injections	2023-02-28 11:32:18 +01:00
Tomasz Grabiec	5c8ad2db3c	service: node_ops: Make watchdog and heartbeat intervals configurable Will be useful for writing tests which trigger failures, and for workarounds in production.	2023-02-28 11:31:55 +01:00
Kefu Chai	5bf6e9ba97	db::commitlog::replay_position: use defaulted operator<=> the default generated operator<=> is exactly the same as the handcrafted one. so let compiler do its job. also, since operator<=> is defaulted, there is no need to define operator== anymore, so drop it as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:25:30 +08:00
Kefu Chai	aed681fa3c	cdc: generation: schema_tables: use defaulted operator<=> the default generated operator<=> is exactly the same as the handcrafted one. so let compiler do its job. also, since operator<=> is defaulted, there is no need to define operator== anymore, so drop it as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:25:30 +08:00
Kefu Chai	56c9c9d29e	db: schema_tables: use defaulted operator<=> the default generated operator<=> is exactly the same as the handcrafted one. so let compiler do its job. also, since operator<=> is defaulted, there is no need to define operator== anymore, so drop it as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:25:30 +08:00
Kefu Chai	9ec8b4844b	utils: UUID: use defaulted operator<=> the default generated operator<=> is exactly the same as the handcrafted one. so let compiler do its job. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:25:30 +08:00
Kefu Chai	ab5d772d63	db/view: use operator<=> to define comparison operators also, there is no need to define operator!=() if operator==() is defined, so drop it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:25:30 +08:00
Kefu Chai	7550be1fc6	main: move get_tools() into main() there is no need to have a dedicated function which is only consumed by `main()`. so let's move the body of `get_tools()` into `main`. and with this change, a plain C array would suffice. so just use a plain array for tools. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:09:46 +08:00
Kefu Chai	128dbebb76	main: add missing descriptions for tools Fixes #13026 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:09:46 +08:00
Kefu Chai	ef0dfeb2fa	main: track tools description in tool struct so we can manage the tools in a more structured way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:09:46 +08:00
Kefu Chai	ffbbd59486	main: use a struct for representing tool so we can encapsulate the description of a certain tool in this struct with a more readable field name in comparison with a tuple<>, if we want to track all tools in this vector. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:09:46 +08:00
Kefu Chai	73cf62469b	main: expose tools as a vector<> so, in addition to looking up a tool by the name in it, we will be able to list all tools in this vector. this change paves the road to a more general solution to handle `--list-tools`. in this change * `lookup_main_func()` is replaced by `get_tools()`. * instead of checking `main_func` out of the if block, check it in the `if` block. as we already know if we have a matched tool in the `if` block, and we can early return right there. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-28 17:09:46 +08:00
Kefu Chai	991379bdb3	raft: broadcast_tables: remove unused asyncio mark test_broadcast_kv_store does not use await or yield at all, so there is no need to mark it with "asyncio" mark. tested using ``` SCYLLA_HOME=$HOME/scylla build/cmake/scylla --overprovisioned --developer-mode=yes --consistent-cluster-management=true --experimental-features=broadcast-tables ... pytest broadcast_tables/test_broadcast_tables.py ``` the test still passes. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13006	2023-02-28 11:05:15 +02:00
Asias He	8fb786997a	Revert "Revert "storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops"" This reverts commit `fd4ee4878a`.	2023-02-28 09:00:13 +08:00
Asias He	5856e69462	storage_service: Wait for normal state handler to finish in replace Similar to "storage_service: Wait for normal state handler to finish in bootstrap", this patch enables the check on the replace procedure.	2023-02-28 09:00:13 +08:00
Asias He	53636167ca	storage_service: Wait for normal state handler to finish in bootstrap In storage_service::handle_state_normal, storage_service::notify_joined will be called which drops the rpc connections to the node that becomes normal. This causes rpc calls with that node to fail with seastar::rpc::closed_error error. Consider this: - n1 in the cluster - n2 is added to join the cluster - n2 sees n1 is in normal status - n2 starts bootstrap process - notify_joined on n2 closes rpc connection to n1 in the middle of bootstrap - n2 fails to bootstrap For example, during bootstrap with RBNO, we saw repair failed in a test that sets ring_delay to zero and does not wait for gossip to settle. repair - repair[9cd0dbf8-4bca-48fc-9b1c-d9e80d0313a2]: sync data for keyspace=system_distributed_everywhere, status=failed: std::runtime_error ({shard 0: seastar::rpc::closed_error (connection is closed)}) This patch fixes the race by waiting for the handle_state_normal handler to finish before the bootstrap process. Fixes #12764 Fixes #12956	2023-02-28 09:00:13 +08:00
Kefu Chai	b6e4275511	configure.py: build and use libseastar.so in debug and dev modes now that Seastar can be built as shared libraries, we can use it for faster development iteration with less disk usage. in this change * configure.py: - 'build_seastar_shared_libs' is added as yet another mode value, so each mode has its own setting. 'debug' and 'dev' have this enabled, while other modes disable it. - link scylla with rpath specified, so it can find `libseastar.so` in build directory. * install.sh: remove the rpath as the rpath in the elf image will not be available after the relocatable package is installed, also rpmbuild will error out when it uses check-rpaths to verify the elf images (executables and shared libraries), as the rpaths encoded in them are not known ones. patchelf() will take care of the shared libraries linked by the executables. so we don't need to worry about libseastar.so or libseastar_testing.so. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12801	2023-02-27 21:08:34 +02:00
Kefu Chai	4f3bc915a6	cql-pytest: remove duplicated words in README.md Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13005	2023-02-27 17:28:32 +02:00
Nadav Har'El	3b32440993	test/cql-pytest: add regression test for UNSET key in insert Recently, we overhauled the error handling of UNSET_VALUE in various places where it is not allowed. This patch adds two more regression tests for this error handling. Both tests pass on Scylla today, pass on Cassandra, but fail on earlier Scylla (e.g., I tested 5.1.5): The first test does INSERT into clustering key UNSET_VALUE. An UNSET_VALUE is designed to skip part of the write - not an entire write - so this attempt should fail - not silently be skipped. The write indeed fails with an error on Cassandra, and on recent Scylla, but silently did nothing in older Scylla which leads this test to fail there. The second test does the same thing with LWT (an "IF NOT EXISTS" added to the insert). Scylla's failure here was even more spectacular - it crashed (as reported in issue #13001) instead of silently skipping the write. The test passes on Scylla today and on Cassandra, which both report the failure cleanly. Refs #13001. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #13007	2023-02-27 17:20:22 +02:00
Petr Gusev	a46df5af63	row_cache: pass partition_start through nonforwardable reader Now the nonforwardable reader unconditionally produces a partition_end, even if the input reader was empty. This is strange in itself, but it also prevents us from properly fixing its next_partition() method, which is our ultimate goal. So we are going to change this and produce partition_end only if there was some data in the stream. However, this creates a problem: now we pop partition_start from the underlying reader in autoupdating_underlying_reader::move_to_next_partition and manually push it back to downstream readers bypassing nonforwardable reader. This means if we change the logic in nonforwardable reader as described we will end up with partition_start without partition_end in the downstream readers. This patch rectifies this by making sure that nonforwardable will see the initial partition_start. We inject this partition_start just before the nonforwardable reader, into delegating_reader. This also makes the result type of range_populating_reader::operator() a bit simpler, we don't need to pass partition_start anymore.	2023-02-27 18:46:31 +04:00
Nadav Har'El	73e258fc34	materialized views: verify CLUSTERING ORDER BY clause Cassandra is very strict in the CLUSTERING ORDER BY clause which it allows when creating a materialized view - if it appears, it must list all the clustering columns of the view. Scylla is less strict - a subset of the clustering columns may be specified. But Scylla was too lenient - a user could specify non-clustering columns and even non-existent columns and Scylla would not fail the MV creation. This patch fixes that - with it MV creation fails if anything besides clustering columns is listed in CLUSTERING ORDER BY. An xfailing test we had for this case no longer fails after this patch so its xfail mark is removed. We also add a few more corner cases to the tests. This patch also fixes one C++ test which had exactly the error that this patch detects - the test author tried to use the partition key, instead of the clustering key, in CLUSTERING ORDER BY (this error had no effect because the specified order, "asc", was the default anyway). Fixes #10767 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12885	2023-02-27 15:09:42 +02:00
Kefu Chai	7fd303044e	tools/schema_loader: drop unused functions `load_one_schema()` and `load_schemas_from_file()` are dropped, as they are neither used by `scylla-sstable` nor tested by `schema_loader_test.cc` . the latter tests `load_schemas()`, which is quite the same as `load_one_schema_from_file()`, but is more permissive in the sense that it allows zero schema or more than one schema in the specified path. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13003	2023-02-27 13:03:05 +02:00
Avi Kivity	6f88dc8009	Merge 'Fix memory leaks caused by throwing reader_concurrency_semaphore::consume()' from Botond Dénes Said method can now throw `std::bad_alloc` since `aab5954`. All call-sites should have been adapted in the series introducing the throw, but some managed to slip through because the oom unit test didn't run in debug mode. This series fixes the remaining unpatched call-sites and makes sure the test runs in debug mode too, so leaks like this are detected. Fixes: #12767 Closes #12756 * github.com:scylladb/scylladb: test/boost/reader_concurreny_semaphore_test: run oom protection tests in debug mode treewide: adapt to throwing reader_concurrency_semaphore::consume()	2023-02-27 12:27:30 +02:00
Anna Stuchlik	91b611209f	doc: fixes https://github.com/scylladb/scylladb/issues/12954 , adds the minimal version from which the 2021.1-to-2022.1 upgrade is supported for Ubuntu, Debian, and image Closes #12974	2023-02-27 12:15:49 +02:00
David Garcia	20bff2bd10	docs: Update ScyllaDB Enterprise link Closes #12985	2023-02-27 08:39:50 +02:00
Anna Stuchlik	95ce2e8980	doc: fix the option name LWT_OPTIMIZATION_META_BIT_MASK Fixes #12940. Closes #12982 [avi: move fixes tag out of subject]	2023-02-26 19:51:20 +02:00
Avi Kivity	c863186dc5	Merge 'Fixes for docs/dev/building.md' from Kamil Braun Closes #12071 * github.com:scylladb/scylladb: docs/dev: building.md: mention node-exporter packages docs/dev: building.md: replace `dev` with `<mode>` in list of debs	2023-02-26 19:27:33 +02:00
Kefu Chai	410035f03d	abstract_replication_strategy: remove unnecessary `virtual` specifier `effective_replication_map` is not a base class of any other class. so there is no need to mark any of its member functions as `virtual`. this change should address the following warning from Clang: ``` /home/kefu/dev/scylladb/seastar/include/seastar/core/shared_ptr.hh:205:9: error: delete called on non-final 'locator::effective_replication_map' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete value_ptr; ^ /home/kefu/dev/scylladb/seastar/include/seastar/core/shared_ptr.hh:202:9: note: in instantiation of member function 'seastar::internal::lw_shared_ptr_accessors_esft<locator::effective_replication_map>::dispose' requested here dispose(static_cast<T*>(counter)); ^ /home/kefu/dev/scylladb/seastar/include/seastar/core/shared_ptr.hh:317:27: note: in instantiation of member function 'seastar::internal::lw_shared_ptr_accessors_esft<locator::effective_replication_map>::dispose' requested here accessors<T>::dispose(_p); ^ /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:263:12: note: in instantiation of member function 'seastar::lw_shared_ptr<locator::effective_replication_map>::~lw_shared_ptr' requested here return make_lw_shared<effective_replication_map>(std::move(rs), std::move(tmptr), std::move(replication_map), replication_factor); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12992	2023-02-26 19:16:28 +02:00
Kefu Chai	79d2eb1607	cql3: functions: validate arguments for 'token()' also since "token()" computes the token for a given partition key, if we pass the key of the wrong type, it should reject. in this change, * we validate the keys before returning the "token()" function. * drop the "xfail" decorator from two of the tests. they pass now after this fix. * change the tests which previously passed the wrong number of arguments containing null to "token()" and expect it to return null, so they verify that "token()" should reject these arguments with the expected error message. Fixes #10448 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12991	2023-02-26 19:01:58 +02:00
Gleb Natapov	1ce7ad1ee6	lwt: do not destroy capture in upgrade_if_needed lambda since the lambda is used more than once If on the first call the capture is destroyed the second call may crash. Fixes: #12958 Message-Id: <Y/sks73Sb35F+PsC@scylladb.com>	2023-02-26 16:13:16 +02:00
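The hazard described in the `upgrade_if_needed` commit above can be sketched in isolation: a lambda that moves out of its capture is only safe to call once. The two factory functions below are hypothetical and use a `std::string` capture for illustration; they are not the code touched by the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical single-shot lambda: each call moves the captured string out,
// so after the first call the capture is left in a moved-from
// (valid but unspecified) state.
inline auto make_consuming(std::string s) {
    return [s = std::move(s)]() mutable { return std::move(s); };
}

// Hypothetical reusable lambda: each call returns a copy, so the capture
// stays intact no matter how many times the lambda is invoked.
inline auto make_reusable(std::string s) {
    return [s = std::move(s)]() { return s; };
}
```

With a `std::string` the second call to the consuming lambda merely yields an unspecified value; with a capture whose moved-from state must not be used, such as a moved-from smart pointer that is then dereferenced, the second call can crash, which is presumably the failure mode the commit refers to.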
Kefu Chai	f3e6c9168c	sstables: generation_type: define fmt::formatter for generation_type turns out what we need is a fmt::formatter<sstables::generation_type> not operator<<(ostream&, sstables::generation_type), as its only use case is the formatter used by seastar::format(). to specialize fmt::formatter<sstables::generation_type> * allows us to be one step closer to drop `FMT_DEPRECATED_OSTREAM` * allows us to customize the way how generation_type is printed by customizing the format specifier. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12983	2023-02-26 15:38:10 +02:00
Avi Kivity	8a0a784131	Merge 'utils: UUID: use default generated comparison operators' from Kefu Chai - utils: UUID: define operator<=> for UUID - utils: UUID: define operator==() only Closes #12981 * github.com:scylladb/scylladb: utils: UUID: define operator==() only utils: UUID: define operator<=> for UUID	2023-02-26 15:31:46 +02:00
Piotr Smaroń	c1760af26c	cql3: adding missing privileged on cache size eviction metric Fixes #10463 Closes #12865	2023-02-26 14:33:46 +02:00
Kefu Chai	1c71151eda	utils: UUID: define operator==() only as, in C++20, compiler is able to generate the operator==() for us, and the default generated one is identical to what we have now. also, in C++20, operator!=() is generated by compiler if operator==() is defined, so we can dispense with the former. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-25 09:36:11 +08:00
Kefu Chai	300e0b1d1c	utils: UUID: define operator<=> for UUID instead of the family of comparison operators, just define <=>. as in C++20, compiler will define all six comparison operators for us. in this change, the operator<=> is defined, so we can have more compact code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-25 09:36:11 +08:00
Asias He	ba919aa88a	storage_service: Send heartbeat earlier for node ops Node ops has the following procedure: 1 for node in sync_nodes send prepare cmd to node 2 for node in sync_nodes send heartbeat cmd to node If any of the prepare cmds in step 1 takes longer than the heartbeat watchdog timeout, the heartbeat in step 2 will be too late to update the watchdog, and as a result the watchdog will abort the operation. To prevent a slow prepare cmd from killing the node operation, we can start the heartbeat earlier in the procedure. Fixes #11011 Fixes #12969 Closes #12980	2023-02-24 22:31:40 +01:00
Botond Dénes	61e67b865a	Merge 'service:forward_service: use long type instead of counter in function mocking' from Michał Jadwiszczak Aggregation query on counter column is failing because forward_service is looking for function with counter as an argument and such function doesn't exist. Instead the long type should be used. Fixes: #12939 Closes #12963 * github.com:scylladb/scylladb: test:boost: counter column parallelized aggregation test service:forward_service: use long type when column is counter	2023-02-24 15:25:10 +02:00
Raphael S. Carvalho	d73ffe7220	sstables: Temporarily disable loading of first and last position metadata It's known that reading large cells in reverse causes large allocations. Source: https://github.com/scylladb/scylladb/issues/11642 The loading is preliminary work for splitting large partitions into fragments composing a run and then being able to later read such a run in an efficient way using the position metadata. The splitting is not turned on yet, anywhere. Therefore, we can temporarily disable the loading, as a way to avoid regressions in stable versions. Large allocations can cause stalls due to foreground memory eviction kicking in. The default values for position metadata say that first and last position include all clustering rows, but they aren't used anywhere other than by sstable_run to determine if a run is disjoint at clustering level, but given that no splitting is done yet, it does not really matter. Unit tests relying on position metadata were adjusted to enable the loading, such that they can still pass. Fixes #11642. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12979	2023-02-24 12:14:18 +02:00
Michał Jadwiszczak	4c6675bf1a	test:boost: counter column parallelized aggregation test	2023-02-24 10:24:23 +01:00
Michał Jadwiszczak	68d2e1fff8	service:forward_service: use long type when column is counter Previously aggregations on counter columns were failing because function mocking was looking for a function with a counter argument, which doesn't exist.	2023-02-24 10:24:16 +01:00
Botond Dénes	be232ff024	Merge 'Shard of shard repair task impl' from Aleksandra Martyniuk Shard id is logged twice in repair (once explicitly, once added by logger). Redundant occurrence is deleted. shard_repair_task_impl::id (which contains global repair shard) is renamed to avoid further confusion. Fixes: #12955 Closes #12959 * github.com:scylladb/scylladb: repair: rename shard_repair_task_impl::id repair: delete redundant shard id from logs	2023-02-24 08:43:54 +02:00
Botond Dénes	80f653d65e	Merge 'Major keyspace compaction task' from Aleksandra Martyniuk Task manager task implementation that covers the major keyspace compaction which can be started through /storage_service/keyspace_compaction/ api. Closes #12661 * github.com:scylladb/scylladb: test: add test for major keyspace compaction tasks compaction: create task manager's task for major keyspace compaction compaction: copy run_on_existing_tables to task_manager_module.cc compaction: add major_compaction_task_impl compaction: add pure virtual compaction_task_impl compaction: add compaction module getter to compaction manager	2023-02-24 07:08:06 +02:00
Guy Shtub	c47b7c4cb2	Replacing user-group with community forum, added link to U. lesson on Spring Boot Fixed author/email details Closes #12748	2023-02-23 19:05:26 +02:00
Aleksandra Martyniuk	e9f01c7cce	test: add test for major keyspace compaction tasks	2023-02-23 15:48:25 +01:00
Aleksandra Martyniuk	159e603ac4	compaction: create task manager's task for major keyspace compaction Implementation of task_manager's task covering major keyspace compaction that can be started through storage_service api.	2023-02-23 15:48:05 +01:00
Aleksandra Martyniuk	6b1d7f5979	compaction: copy run_on_existing_tables to task_manager_module.cc Copy run_on_existing_tables from api/storage_service.cc to compaction/task_manager_module.cc	2023-02-23 15:31:59 +01:00
Anna Stuchlik	4dd1659d0b	doc: fixes https://github.com/scylladb/scylladb/issues/12964 , removes the information that the CDC options are experimental Closes #12973	2023-02-23 15:06:53 +02:00
Kefu Chai	412953fdd5	compress, transport: do not detect LZ4_compress_default() `LZ4_compress_default()` was introduced in liblz4 v1.7.3, despite that the release note (https://github.com/lz4/lz4/releases/tag/v1.7.3) of v1.7.3 didn't mention this. if we check the commit which added this API, we can find all releases including it: see ``` $ git tag --contains 1b17bf2ab8cf66dd2b740eca376e2d46f7ad7041 lz4-r130 r129 r130 r131 rc129v0 v1.7.3 v1.7.4 v1.7.4.2 v1.7.5 v1.8.0 v1.8.1 v1.8.1.2 v1.8.2 v1.8.3 v1.9.0 v1.9.1 v1.9.2 v1.9.3 v1.9.4 ``` and v1.7.3 was released in Nov 17, 2016. some popular distros releases also package new enough liblz4: - fedora 35 ships lz4-devel 1.9.3, - CentOS 7 ships lz4-devel 1.8.3 - debian 10 ships liblz4-dev 1.8.3 - ubuntu 18.04 ships liblz4-dev r131 so, in this change, we drop the support of liblz4 < 1.7.3 for better code readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12971	2023-02-23 14:39:20 +02:00
Pavel Emelyanov	0959739216	sstables: Remove always-false sstable_writer_config::leave_unsealed It was used in sstables streaming code up until `e5be3352` (database, streaming, messaging: drop streaming memtables) or nearby, then the whole feature was reworked. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12967	2023-02-23 12:50:06 +01:00
Botond Dénes	624d176b3b	Merge 'Refine usage of sstable_test_env::reusable_sst() method' from Pavel Emelyanov Some test cases can be made a bit more compact by using the sugar provided by the aforementioned method Closes #12965 * github.com:scylladb/scylladb: test: Make use of reusable_sst default format tests: Use reusable_sst() where applicable	2023-02-23 12:50:06 +01:00
Botond Dénes	a5979c0662	Merge 'treewide: remove invalid defaulted move ctor' from Kefu Chai - test/boost/chunked_vector_test: remove defaulted exception_safe_class's move ctor - tools/scylla-sstable: remove defaulted move ctor - sstables/mx/partition_reversing_data_source: remove defaulted move ctor - cql3/statements/truncate_statement: remove defaulted move ctor Closes #12914 * github.com:scylladb/scylladb: test/boost/chunked_vector_test: remove defaulted exception_safe_class's move ctor tools/scylla-sstable: remove defaulted move ctor sstables/mx/partition_reversing_data_source: remove defaulted move ctor cql3/statements/truncate_statement: remove defaulted move ctor	2023-02-23 12:50:05 +01:00
Avi Kivity	665429d85b	cql3: remove assignment_testable::test_all Was replaced with cql3::expr::test_assignment_all(). Closes #12951	2023-02-23 12:50:05 +01:00
Botond Dénes	0c756af137	Merge 'build: cmake: sync with `configure.py` (6/n)' from Kefu Chai - build: cmake: correct linker flags - build: cmake: enable boost tests only if BUILD_TESTING - build: cmake: reuse test-lib library - build: cmake: extract redis out Closes #12961 * github.com:scylladb/scylladb: build: cmake: extract interface out build: cmake: extract redis out build: cmake: reuse test-lib library build: cmake: enable boost tests only if BUILD_TESTING build: cmake: correct linker flags	2023-02-23 12:50:05 +01:00
Aleksandra Martyniuk	d889a599e8	repair: rename shard_repair_task_impl::id shard_repair_task_impl::id stores global repair id. To avoid confusion with the task id, the field is renamed to global_repair_id.	2023-02-23 11:29:00 +01:00
Aleksandra Martyniuk	f7c88edec5	repair: delete redundant shard id from logs In repair shard id is logged twice. Delete repeated occurrence.	2023-02-23 11:25:18 +01:00
Pavel Emelyanov	5b311bb724	test: Make use of reusable_sst default format The sstable_test_env::reusable_sst() has default value for the format argument. Patch the test cases that don't use one while at it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-22 17:04:10 +03:00
Pavel Emelyanov	7aabffff19	tests: Use reusable_sst() where applicable The reusable_sst() is intended to be used to load the pre-existing sstables from the test/resources directory and .load() them. Some test cases, however, still do it "by hand". Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-22 17:03:15 +03:00
Kefu Chai	5b3fd57c25	build: cmake: extract interface out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-22 18:35:11 +08:00
Kefu Chai	64879fb6f7	build: cmake: extract redis out and move `redis/protocol_parser.rl` related rules into `redis`, as it is a file used for the implementation of redis. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-22 18:35:11 +08:00
Kefu Chai	43d9055b89	build: cmake: reuse test-lib library it already includes the necessary bits used by test-perf, so let's just link the latter to the former. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-22 18:35:11 +08:00
Kefu Chai	d07b649791	build: cmake: enable boost tests only if BUILD_TESTING BUILD_TESTING is an option exposed by CTest module, so let's include CTest module, and check if BUILD_TESTING is enabled before include boost based tests. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-22 18:35:11 +08:00
Kefu Chai	59698cc495	build: cmake: correct linker flags s/sha/sha1/. turns out `867b58c62c` failed to include the latest change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-22 18:35:11 +08:00
Aleksandra Martyniuk	b908369e85	compaction: add major_compaction_task_impl All major compaction tasks will share some methods like type or abort. The common part of the tasks should be inherited from major_compaction_task_impl.	2023-02-22 09:52:04 +01:00
Aleksandra Martyniuk	be101078a0	compaction: add pure virtual compaction_task_impl Add compaction_task_impl that is a pure virtual class from which all compaction task implementations will inherit.	2023-02-22 09:51:57 +01:00
Pavel Emelyanov	f51762c72a	headers: Refine view_update_generator.hh and around The initial intent was to reduce the fanout of shared_sstable.hh through v.u.g.hh -> cql_test_env.hh chain, but it also resulted in some shots around v.u.g.hh -> database.hh inclusion. By and large: - v.u.g.hh doesn't need database.hh - cql_test_env.hh doesn't need v.u.g.hh (and thus -- the shared_sstable.hh) but needs database.hh instead - few other .cc files need v.u.g.hh directly as they pulled it via cql_test_env.hh before - add forward declarations in few other places Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12952	2023-02-22 09:32:30 +02:00
Botond Dénes	e183dc4345	Merge 'Wrap sstable directory scan state in components_lister' from Pavel Emelyanov The sstable_directory now combines two activities: * scans the list of files in /var/lib/data and generates sstable-s object from it * maintains the found sstable-s throughout necessary processing (populate/reshard/reshape) The former part is in fact storage-specific. If sstables are on a filesystem, then it should be scanned with listdir, there can be dangling files, like temp-TOC, pending deletion log and components not belonging to any TOCs. If sstables are on some other storage, then this part should work some other way. That said, the sstable_directory is to be split into two pieces -- lister and "processing state". The latter would (may?) require renaming the sstable_directory into something more relevant, but that's a huge and intrusive change. For now, just collect the lister stuff in one place. Closes #12843 * github.com:scylladb/scylladb: sstable_directory: Keep lister internals private sstable_directory: Move most of .commit_directory_changes() on lister sstable_directory: Remove temporary aliases sstable_directory: Move most of .process_sstable_dir() on lister sstable_directory: Move .handle_component() to components_lister sstable_directory: Keep files_for_removal on scan_state sstable_directory: Keep components_lister aboard sstable_directory: Keep scan_state on components_lister	2023-02-22 08:10:04 +02:00
Calle Wilund	97881091d3	commitlog: Fix updating of total_size_on_disk on segment alloc when o_dsync is off Fixes #12810 We did not update total_size_on_disk in commitlog totals when use o_dsync was off. This means we essentially ran with no registered footprint, also causing broken comparisons in delete_segments.	2023-02-21 16:35:23 +00:00
Calle Wilund	64102780fe	commitlog: Use static (reused) regex for (left over) descriptor parse Refs #11710 Allows reusing regex for segment matching (for opening left-over segments after crash). Should remove any stalls caused by commitlog replay preparation. v2: Add unit test for descriptor parsing Closes #12112	2023-02-21 18:34:04 +02:00
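The idea behind reusing a regex for descriptor parsing can be sketched as follows: compile the pattern once and reuse the compiled object on every call, instead of paying the compilation cost per file name. The file-name pattern `Segment-<version>-<id>.log`, the struct, and the function name are assumptions made for illustration, and a function-local `static` is just one way to get the reuse; the commit itself may do it differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result of parsing a segment file name.
struct parsed_name {
    std::uint32_t version;
    std::uint64_t id;
};

// Parses names of the assumed form "Segment-<version>-<id>.log".
// The regex is a function-local static, so it is compiled on the first call
// only and reused afterwards. Digit runs too long for the target integer
// type make std::stoul/std::stoull throw std::out_of_range.
inline std::optional<parsed_name> parse_segment_name(const std::string& name) {
    static const std::regex re(R"(Segment-(\d+)-(\d+)\.log)");
    std::smatch m;
    if (!std::regex_match(name, m, re)) {
        return std::nullopt;
    }
    return parsed_name{
        static_cast<std::uint32_t>(std::stoul(m[1].str())),
        std::stoull(m[2].str()),
    };
}
```

Calling `parse_segment_name` in a loop over many left-over segment names then constructs the `std::regex` only once, which is the part that tends to be expensive.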
Botond Dénes	ef548e654d	types: unserialize_value for multiprecision_int,bool: don't read uninitialized memory Check the first fragment before dereferencing it, the fragment might be empty, in which case move to the next one. Found by running range scan tests with random schema and random data. Fixes: #12821 Fixes: #12823 Fixes: #12708 Closes #12824	2023-02-21 17:39:18 +02:00
Tomasz Grabiec	c8e2bf1596	db: schema_tables: Optimize schema merge Currently, applying a schema change on a replica works like this: Collect all affected keyspaces from incoming mutations Read current state of schema Apply the mutations Read new state of schema The "Read ... state of schema" step reads all kinds of schema objects. In particular, to read the "table" objects, it does the following: for every affected keyspace k: read all mutations from system_schema.tables for k extract all existing table names from those mutations for every existing table: read mutations from {tables, columns, indexes, view_virtual_columns, ...} for that table As you can see, the number of reads performed is O(nr tables in a keyspace), not O(nr tables in a change). This means that making a sequence of schema changes, like adding a table, is quadratic. Another aspect which magnifies this is that we don't read those tables using a single scan, but issue individual queries for each table separately. This patch optimizes this by considering only affected tables when reading schema for the purpose of diff calculation. When mutations contain multi-table deletions, we still read the set of tables, like before. This could be optimized by looking at the database to get the list, but it's not part of the patch. I tested this using a test case provided by Kamil (kbr-scylla@53fe154) ./test.py --mode debug test_many_schema_changes -s The test bootstraps a cluster and then creates about 40 schema changes. Then a new node is bootstrapped and replays those changes via group0. In debug mode, each change takes roughly 2s to process before the patch, and 0.5s after the patch. The whole replay is reduced to 56% of what was before: Before (1m19s) : INFO 2023-01-20 19:44:35,848 [shard 0] raft_group0 - setup_group0: ensuring that the cluster has fully upgraded to use Raft... INFO 2023-01-20 19:45:54,844 [shard 0] raft_group0 - setup_group0: waiting for peers to synchronize state... 
After (45s): INFO 2023-01-20 22:02:51,869 [shard 0] raft_group0 - setup_group0: ensuring that the cluster has fully upgraded to use Raft... INFO 2023-01-20 22:03:36,834 [shard 0] raft_group0 - setup_group0: waiting for peers to synchronize state... Closes #12592	2023-02-21 17:26:57 +02:00
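The "quadratic" claim in `c8e2bf1596` can be checked with a small worked example. This is a deliberately simplified model, not the real schema-merge code: it assumes the keyspace starts empty, each change adds exactly one table, and it counts only the per-table reads done after applying each change. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Old behaviour in the simplified model: after the i-th change the keyspace
// holds i tables and all of them are re-read, so the total is 1 + 2 + ... + n.
std::uint64_t reads_full_rescan(std::uint64_t n_changes) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n_changes; ++i) {
        total += i;
    }
    return total;
}

// New behaviour in the simplified model: only the one affected table is
// read per change, so the total grows linearly.
std::uint64_t reads_affected_only(std::uint64_t n_changes) {
    return n_changes;
}
```

For the roughly 40 schema changes mentioned in the message, this model gives 820 table reads before the patch versus 40 after. The measured speed-up is smaller because the reads are only part of the work done per change.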
Calle Wilund	6f972ee68b	commitlog: change type of stored size known_size() is technically not a size_t.	2023-02-21 15:26:02 +00:00
Pavel Emelyanov	abab4d446d	sstable: Remove explicit quarantization call Now all callers are patched to use new change_state() call, so it can be removed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 17:44:55 +03:00
Pavel Emelyanov	bbf192e775	test: Move move_to_new_dir() method from sstable class There's a bunch of test cases that check how moving sstables files around the filesystem works. These need the generic move_to_new_dir() method from sstable, so move it there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 17:42:18 +03:00
Pavel Emelyanov	bb0140531e	sstable, dist.-loader: Introduce and use pick_up_from_upload() method When "uploading" an sstable scylla uses a short-cut -- the sstable's files are to be put into upload/ subdir by the caller, then scylla just pulls them in in the cheapest way possible -- by relinking the files. When this happens sstable also changes its generation, which is the only place where this happens at all. For object storage uploading is not going to be _that_ simple, so for now add an fs-specific method to pick up an sstable from upload dir with the intent to generalize it (if possible) when object-storage uploading appears. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 17:40:00 +03:00
Pavel Emelyanov	8a061bd862	sstables, code: Introduce and use change_state() call The call moves the sstable to the specified state. The change state is translated into the storage driver state change, which for today's filesystem storage means moving between directories. The "normal" state maps to the base dir of the table, there's no dedicated subdir for this state and this brings some trouble into play. The thing is that in order to check if an sstable is in "normal" state already it's impossible to compare the filename of its path to any pre-defined values, as tables' basedirs are dynamic. To overcome this, the change-state call checks that the sstable is in one of "known" sub-states, and assumes that it's in normal state otherwise. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 17:39:34 +03:00
Pavel Emelyanov	e67751ee92	distributed_loader: Let make_sstables_available choose target directory When sstables are loaded from upload/ subdir, the final step is to move them from this directory into base or staging one. The uploading code evaluates the target directory, then pushes it down the stack towards make_sstables_available() method. This patch replaces the path argument with bool to_staging one. The goal is to remove the knowledge of exact sstable location (nowadays -- its files' path) from the distributed loader and keep it in sstable object itself. Next patches will make full use of this change. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 17:23:59 +03:00
Botond Dénes	763fe54637	Merge 'build: cmake: sync with `configure.py` (5/n) ' from Kefu Chai - build: cmake: build release.cc as a library - build: cmake: link alternator against cql3 - build: cmake: link scylla against xxHash::xxhash - build: cmake: use lld or gold as linker if available Closes #12942 * github.com:scylladb/scylladb: build: cmake: use lld or gold as linker if available build: cmake: link scylla against xxHash::xxhash build: cmake: link alternator against cql3 build: cmake: build release.cc as a library	2023-02-21 16:19:24 +02:00
Pavel Emelyanov	41d65daa29	sstables: Remove dangling ready future from .close_files() Was left unnoticed while `7c7eb81a` ('Encapsulate filesystem access by sstable into filesystem_storage subsclass') Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12946	2023-02-21 15:47:55 +02:00
Pavel Emelyanov	398f7704dc	sstable_directory: Keep lister internals private Now the lister provides a two-call API to the user -- process and commit. The rest can and should be marked as private. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:44:50 +03:00
Pavel Emelyanov	e6941d0baa	sstable_directory: Move most of .commit_directory_changes() on lister Committing any changes made while scanning the storage is storage-specific. Just like .process() was moved on lister, the .commit() now does the same. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:44:49 +03:00
Pavel Emelyanov	70d6bfc109	sstable_directory: Remove temporary aliases Previous patches created a bunch of local aliases-references in components_lister::process(). This patch just removes those aliases, no functional changes are made here. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:42:24 +03:00
Pavel Emelyanov	c4037270a3	sstable_directory: Move most of .process_sstable_dir() on lister Processing storage with sstable files/objects is storage-specific. The components_lister is the right component to handle it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:42:24 +03:00
Pavel Emelyanov	4c4aeba9b6	sstable_directory: Move .handle_component() to components_lister This method is in charge of collecting a found file on scan_state, it logically belongs to the components_lister and its internals. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:42:24 +03:00
Pavel Emelyanov	58f4076117	sstable_directory: Keep files_for_removal on scan_state This list is the list of on-disk files, which is the property of filesystem scan state. When committing directory changes (read: removing those files) the list can be moved-from the state. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:42:23 +03:00
Pavel Emelyanov	df5384cb1e	sstable_directory: Keep components_lister aboard The lister is supposed to be alive throughout .process_sstable_dir() and can die after .commit_directory_changes(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:32:06 +03:00
Pavel Emelyanov	5d98e34c16	sstable_directory: Keep scan_state on components_lister The scan_state keeps the state of listing directory with sstables. It now lives on the .process_sstable_dir() stack, but it can as well live on the lister itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-21 16:32:06 +03:00
Kamil Braun	318f1f64c2	docs: update pygments dependency version Closes #12949	2023-02-21 13:06:39 +02:00
Botond Dénes	372ac57c96	Merge 'doc: remove the incorrect information about IPs from the Restore page' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/12945 This PR removes the incorrect information and updates the link to the relevant page in the Manager docs. Closes #12947 * github.com:scylladb/scylladb: doc: update the link to the Restore page in the ScyllaDB Manager documentation doc: remove the wrong info about IPs from the note on the Restore page	2023-02-21 12:30:31 +02:00
Kamil Braun	d56c060b4e	Merge 'various raft fixes' from Gleb Natapov The series fixes a race in case of a leader change while add_entry_on_leader is sleeping and an abort during raft shutdown. * '12863-fix-v1' of github.com:scylladb/scylla-dev: raft: abort applier fiber when a state machine aborts raft: fix race in add_entry_on_leader that may cause incorrect log length accounting	2023-02-21 10:57:04 +01:00
Anna Stuchlik	d743146313	doc: update the link to the Restore page in the ScyllaDB Manager documentation	2023-02-21 10:30:02 +01:00
Anna Stuchlik	1e85df776f	doc: remove the wrong info about IPs from the note on the Restore page	2023-02-21 10:24:06 +01:00
Pavel Emelyanov	3f88d3af62	Merge 'test_shed_too_large_request fix: disable compression' from Gusev Petr The test relies on exact request size, this doesn't work if compression is applied. The driver enables compression only if both the server and the client agree on the codec to use. If compression package (e.g. lz4) is not installed, the compression is not used. The trick with locally_supported_compressions is needed since I couldn't find any standard means to disable compression other than the compression flag on the cluster object, which seemed too broad. fixes: #12836 Closes #12854 * github.com:scylladb/scylladb: test_shed_too_large_request: clarify the comments test_shed_too_large_request: use smaller test string test_shed_too_large_request fix: disable compression	2023-02-21 10:35:59 +03:00
Kefu Chai	867b58c62c	build: cmake: use lld or gold as linker if available Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-21 14:24:18 +08:00
Kefu Chai	69b1e7651e	build: cmake: link scylla against xxHash::xxhash instead of adding `XXH_PRIVATE_API` to compile definitions, link scylla against xxHash::xxhash, which provides this definition for us. also move the comment on `XXH_PRIVATE_API` into `FindxxHash.cmake`, where this definition is added. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-21 14:24:18 +08:00
Kefu Chai	0fffd34be8	build: cmake: link alternator against cql3 otherwise we'd have ``` In file included from /home/kefu/dev/scylladb/alternator/executor.cc:37: /home/kefu/dev/scylladb/cql3/util.hh:21:10: fatal error: 'cql3/CqlParser.hpp' file not found ^~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-21 14:24:18 +08:00
Kefu Chai	957403663f	build: cmake: build release.cc as a library so we can attach compiling definitions in a simpler way. this change is based on Botond Dénes's change which gives an overhaul to the existing CMake building system. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-21 14:23:04 +08:00
Botond Dénes	d7b6cf045f	Merge 'build: cmake: sync with `configure.py` (4/n)' from Kefu Chai - build: cmake: link cql3 against wasmtime_bindings - build: cmake: output rust binding headers in expected dir - build: cmake: link auth against cql3 Closes #12927 * github.com:scylladb/scylladb: build: cmake: link auth against cql3 build: cmake: output rust binding headers in expected dir build: cmake: link cql3 against wasmtime_bindings	2023-02-20 12:46:15 +01:00
Botond Dénes	3c30531202	Merge 'test: mutation_test: Fix sporadic failure due to continuity mismatch' from Tomasz Grabiec In test_v2_apply_monotonically_is_monotonic_on_alloc_failures we generate mutations with non-full continuity, so we should pass is_evictable::yes to apply_monotonically(). Otherwise, it will assume fully-continuous versions and not try to maintain continuity by inserting sentinels. This manifested in sporadic failures on continuity check. Fixes #12882 Closes #12921 * github.com:scylladb/scylladb: test: mutation_test: Fix sporadic failure due to continuity mismatch test: mutation_test: Fix copy-paste mistake in trace-level logging	2023-02-20 12:46:15 +01:00
Pavel Emelyanov	273999b9fa	sstable: Mark version and format members const These two are indeed immutable throughout the object lifetime Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12918	2023-02-20 12:46:15 +01:00
Kefu Chai	adbcc3db8f	dist/debian: drop unused Makefile variable this change was previously reverted by `cbc005c6f5`. it turns out this change was not the offending change. so let's resurrect it. `job` was introduced back in `782ebcece4`, so we could consume the option specified in DEB_BUILD_OPTIONS environmental variable. but now that we always repackage the artifacts prebuilt in the relocatable package, we don't build them anymore when packaging debian packages. see `9388f3d626`. and `job` is not passed to `ninja` anymore. so, in this change, `job` is removed from debian/rules as well, as it is not used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12924	2023-02-20 12:46:15 +01:00
Nadav Har'El	328cdb2124	cql-pytest: translate Cassandra's tests for compact tables This is a translation of Cassandra's CQL unit test source file validation/operations/CompactStorageTest.java into our cql-pytest framework. This very large test file includes 86 tests for various types of operations and corner cases of WITH COMPACT STORAGE tables. All 86 tests pass on Cassandra (except one using a deprecated feature that needs to be specially enabled). 30 of the tests fail on Scylla reproducing 7 already-known Scylla issues and 7 previously-unknown issues: Already known issues: Refs #3882: Support "ALTER TABLE DROP COMPACT STORAGE" Refs #4244: Add support for mixing token, multi- and single-column restrictions Refs #5361: LIMIT doesn't work when using GROUP BY Refs #5362: LIMIT is not doing it right when using GROUP BY Refs #5363: PER PARTITION LIMIT doesn't work right when using GROUP BY Refs #7735: CQL parser missing support for Cassandra 3.10's new "+=" syntax Refs #8627: Cleanly reject updates with indexed values where value > 64k New issues: Refs #12471: Range deletions on COMPACT STORAGE is not supported Refs #12474: DELETE prints misleading error message suggesting ALLOW FILTERING would work Refs #12477: Combination of COUNT with GROUP BY is different from Cassandra in case of no matches Refs #12479: SELECT DISTINCT should refuse GROUP BY with clustering column Refs #12526: Support filtering on COMPACT tables Refs #12749: Unsupported empty clustering key in COMPACT table Refs #12815: Hidden column "value" in compact table isn't completely hidden Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12816	2023-02-20 12:46:15 +01:00
Raphael S. Carvalho	fbeee8b65d	Optimize load-and-stream load-and-stream implements no policy when deciding which SSTables will go in each streaming round (batch of 16 SSTables), meaning the choice is random. It can take advantage of the fact that the LSM-tree layout, with ICS and LCS, is a set of SSTable runs, where each run is composed of SSTables that are disjoint in their key range. By sorting SSTables to be streamed by their first key, the effect is that SSTable runs will be incrementally streamed (in token order). SSTable runs in the same replica group (or in the same node) will have their content deduplicated, reducing significantly the amount of data we need to put on the wire. The improvement is proportional to the space amplification in the table, which again, depends on the compaction strategy used. Another important benefit is that the destination nodes will receive SSTables in token order, allowing off-strategy compaction to be more efficient. This is how I tested it: 1) Generated a 5GB dataset to a ICS table. 2) Started a fresh 2-node cluster. RF=2. 3) Ran load-and-stream against one of the replicas. BEFORE: $ time curl -X POST "http://127.0.0.1:10000/storage_service/sstables/keyspace1?cf=standard1&load_and_stream=true" real 4m40.613s user 0m0.005s sys 0m0.007s AFTER: $ time curl -X POST "http://127.0.0.1:10000/storage_service/sstables/keyspace1?cf=standard1&load_and_stream=true" real 2m39.271s user 0m0.005s sys 0m0.004s That's ~1.76x faster. 
That's explained by deduplication: BEFORE: INFO 2023-02-17 22:59:01,100 [shard 0] stream_session - [Stream #79d3ce7a-ea47-4b6e-9214-930610a18ccd] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3445376, received_partitions=2755835 INFO 2023-02-17 22:59:41,491 [shard 0] stream_session - [Stream #bc6bad99-4438-4e1e-92db-b2cb394039c8] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3308288, received_partitions=2836491 INFO 2023-02-17 23:00:20,585 [shard 0] stream_session - [Stream #e95c4f49-0a2f-47ea-b41f-d900dd87ead5] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3129088, received_partitions=2734029 INFO 2023-02-17 23:00:49,297 [shard 0] stream_session - [Stream #255cba95-a099-4fec-a72c-f87d5cac2b1d] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2544128, received_partitions=1959370 INFO 2023-02-17 23:01:33,110 [shard 0] stream_session - [Stream #96b5737e-30c7-4af8-a8b8-96fecbcbcbd0] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3624576, received_partitions=3085681 INFO 2023-02-17 23:02:20,909 [shard 0] stream_session - [Stream #3185a48b-fb9e-4190-88f4-5c7a386bc9bd] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3505024, received_partitions=3079345 INFO 2023-02-17 23:03:02,039 [shard 0] stream_session - [Stream #0d2964dc-d5e3-4775-825c-97f736d14713] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2808192, received_partitions=2655811 AFTER: INFO 2023-02-17 23:12:49,155 [shard 0] stream_session - [Stream #bf00963c-3334-4035-b1a9-4b3ceb7a188a] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2965376, received_partitions=1006535 INFO 2023-02-17 23:13:13,365 [shard 0] stream_session - [Stream #1cd2e3ac-a68b-4cb5-8a06-707e91cf59db] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3543936, received_partitions=1406157 INFO 2023-02-17 23:13:37,474 [shard 0] stream_session - 
[Stream #5a278230-6b4b-461f-8396-c15df7092d03] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3639936, received_partitions=1371298 INFO 2023-02-17 23:14:02,132 [shard 0] stream_session - [Stream #19f40dc3-e02a-4321-a917-a6590d99dd03] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3638912, received_partitions=1435386 INFO 2023-02-17 23:14:26,673 [shard 0] stream_session - [Stream #d47507eb-2067-4e8f-a4f7-c82d5fbd4228] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=3561600, received_partitions=1423024 INFO 2023-02-17 23:14:49,307 [shard 0] stream_session - [Stream #d42ee911-253a-4de6-ac89-6a3c05b88d66] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2382592, received_partitions=1452656 INFO 2023-02-17 23:15:10,067 [shard 0] stream_session - [Stream #1f78c1bf-8e20-41bd-95de-16de3fc5f86c] Write to sstable for ks=keyspace1, cf=standard1, estimated_partitions=2632320, received_partitions=1252298 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20230219191924.37070-1-raphaelsc@scylladb.com>	2023-02-20 12:46:14 +01:00
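The ordering policy described in `fbeee8b65d` (sort the SSTables by their first key, then stream them in rounds of 16) can be sketched as follows. The `sstable_info` struct and `plan_rounds()` function are hypothetical illustrations, not the load-and-stream implementation; only the batch size of 16 and the sort-by-first-key idea come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical summary of an SSTable: the token of its first key and an id.
struct sstable_info {
    std::int64_t first_token;
    int id;
};

// Sorts SSTables by first token and splits them into streaming rounds of at
// most batch_size SSTables each, so rounds are produced in token order.
std::vector<std::vector<sstable_info>> plan_rounds(std::vector<sstable_info> ssts,
                                                   std::size_t batch_size = 16) {
    std::vector<std::vector<sstable_info>> rounds;
    if (batch_size == 0) {
        return rounds;  // nothing sensible to plan with an empty batch
    }
    std::sort(ssts.begin(), ssts.end(),
              [](const sstable_info& a, const sstable_info& b) {
                  return a.first_token < b.first_token;
              });
    for (std::size_t i = 0; i < ssts.size(); i += batch_size) {
        const std::size_t end = std::min(i + batch_size, ssts.size());
        rounds.emplace_back(ssts.begin() + static_cast<std::ptrdiff_t>(i),
                            ssts.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return rounds;
}
```

Because SSTables with neighbouring key ranges end up in the same round, overlapping data from different runs can be merged before it is sent, which is the deduplication effect visible in the `received_partitions` figures above.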
guy9	917e085919	Update manager-monitoring-integration.rst Changing default manager from 56090 to 5090 @amnonh please review @annastuchlik please change if other locations in Docs require this change Closes #12682	2023-02-20 12:46:14 +01:00
Avi Kivity	6d5c242651	Update tools/java submodule (hdrhistogram failure with Java 11) * tools/java f0bab7af66...ab0a613fdc (1): > Fix cassandra-stress -log hdrfile=... with java 11	2023-02-20 12:46:14 +01:00
Aleksandra Martyniuk	4f67c0c36a	compaction: add compaction module getter to compaction manager	2023-02-20 11:19:29 +01:00
Kefu Chai	df63e2ba27	types: move types.{cc,hh} into types they are part of the CQL type system, and are "closer" to types. let's move them into "types" directory. the building systems are updated accordingly. the source files referencing `types.hh` were updated using following command: ``` find . -name "*.{cc,hh}" -exec sed -i 's/\"types.hh\"/\"types\/types.hh\"/' {} + ``` the source files under sstables include "types.hh", which is indeed the one located under "sstables", so include "sstables/types.hh" instead, so it's more explicit. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12926	2023-02-19 21:05:45 +02:00
Tzach Livyatan	f97a23a9e3	Add a warning: altering a service level timeout doesn't affect existing connections Closes #12928 Refs #12923	2023-02-19 14:49:23 +02:00
Kefu Chai	ee97c332d9	test/boost/chunked_vector_test: remove defaulted exception_safe_class's move ctor because it has a member variable whose type is a reference. and a reference cannot be reassigned. this silences following warning from Clang: ``` /home/kefu/dev/scylladb/test/boost/chunked_vector_test.cc:152:27: error: explicitly defaulted move assignment operator is implicitly deleted [-Werror,-Wdefaulted-function-deleted] exception_safe_class& operator=(exception_safe_class&&) = default; ^ /home/kefu/dev/scylladb/test/boost/chunked_vector_test.cc:132:31: note: move assignment operator of 'exception_safe_class' is implicitly deleted because field '_esc' is of reference type 'exception_safety_checker &' exception_safety_checker& _esc; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:58:22 +08:00
Kefu Chai	2bb61b8c18	tools/scylla-sstable: remove defaulted move ctor ``` /home/kefu/dev/scylladb/tools/scylla-sstable.cc:2301:9: error: explicitly defaulted move constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted] impl(impl&&) = default; ^ /home/kefu/dev/scylladb/tools/scylla-sstable.cc:2291:16: note: move constructor of 'impl' is implicitly deleted because field '_reader' has an inaccessible move constructor reader _reader; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:57:40 +08:00
Kefu Chai	cca9b7c4cd	sstables/mx/partition_reversing_data_source: remove defaulted move ctor as partition_reversing_data_source_impl indirectly has a member variable which is a member of reference type. this should address following warning from Clang: ``` /home/kefu/dev/scylladb/sstables/mx/partition_reversing_data_source.cc:476:43: error: explicitly defaulted move assignment operator is implicitly deleted [-Werror,-Wdefaulted-function-deleted] partition_reversing_data_source_impl& operator=(partition_reversing_data_source_impl&&) noexcept = default; ^ /home/kefu/dev/scylladb/sstables/mx/partition_reversing_data_source.cc:365:19: note: move assignment operator of 'partition_reversing_data_source_impl' is implicitly deleted because field '_schema' is of reference type 'const schema &' const schema& _schema; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:57:40 +08:00
Kefu Chai	958f8bf79f	cql3/statements/truncate_statement: remove defaulted move ctor ``` /home/kefu/dev/scylladb/cql3/statements/truncate_statement.hh:29:5: error: explicitly defaulted move constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted] truncate_statement(truncate_statement&&) = default; ^ /home/kefu/dev/scylladb/cql3/statements/truncate_statement.hh:25:39: note: move constructor of 'truncate_statement' is implicitly deleted because field '_attrs' has a deleted move constructor const std::unique_ptr<attributes> _attrs; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:523:7: note: 'unique_ptr' has been explicitly marked deleted here unique_ptr(const unique_ptr&) = delete; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:57:40 +08:00
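The four `-Wdefaulted-function-deleted` commits above share one cause: a member that cannot be moved or reassigned makes the corresponding defaulted special member implicitly deleted. A minimal sketch, with hypothetical struct names standing in for the real classes, shows the two cases quoted in the diagnostics (a reference member and a `const std::unique_ptr` member):

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct checker {};

// A reference member cannot be reseated, so move (and copy) assignment is
// implicitly deleted; move construction still works by binding the reference.
struct holds_reference {
    checker& ref;
};

// A const unique_ptr can be neither moved from nor copied, so the implicit
// move constructor is deleted as well.
struct holds_const_ptr {
    const std::unique_ptr<int> ptr;
};

static_assert(std::is_move_constructible_v<holds_reference>);
static_assert(!std::is_move_assignable_v<holds_reference>);
static_assert(!std::is_move_constructible_v<holds_const_ptr>);
static_assert(!std::is_move_assignable_v<holds_const_ptr>);
```

Writing `= default` on such a member function still compiles, but the function is deleted, which is what Clang reports; removing the misleading declaration, as these commits do, makes the class say what it actually supports.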
Kefu Chai	6803f38a7a	build: cmake: link auth against cql3 as auth headers references cql3 ``` In file included from /home/kefu/dev/scylladb/auth/authenticator.cc:16: In file included from /home/kefu/dev/scylladb/cql3/query_processor.hh:24: /home/kefu/dev/scylladb/lang/wasm_instance_cache.hh:20:10: fatal error: 'rust/cxx.h' file not found ^~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:46:51 +08:00
Kefu Chai	a2668f8ba8	build: cmake: output rust binding headers in expected dir we include rust binding headers like `rust/wasmtime_bindings.hh`. so they should be located in directory named "rust". Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:46:51 +08:00
Kefu Chai	494ed41a54	build: cmake: link cql3 against wasmtime_bindings as it references headers provided by wasmtime_bindings: ``` In file included from /home/kefu/dev/scylladb/cql3/functions/user_function.cc:9: In file included from /home/kefu/dev/scylladb/cql3/functions/user_function.hh:16: /home/kefu/dev/scylladb/lang/wasm.hh:14:10: fatal error: 'rust/wasmtime_bindings.hh' file not found ^~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-19 12:46:51 +08:00
Gleb Natapov	941407b905	database: fix do_apply_many() to handle empty array of mutations Currently the code will assert because cl pointer will be null and it will be null because there is no mutations to initialize it from. Message-Id: <20230212144837.2276080-3-gleb@scylladb.com>	2023-02-17 22:58:22 +01:00
Yaron Kaikov	a4e08ee48a	Revert "dist/debian: bump up debhelper compatibility level to 10" This reverts commit `75eaee040b`, since it causes a regression that prevents the Scylla service from starting on deb OS. Fixes: #12738 Closes #12897	2023-02-17 17:34:12 +02:00
Michał Chojnowski	e88f590eda	sstables: partition_index_cache: clean up an unused type alias `list_ptr` is a type alias that isn't used in any meaningful way. Remove it. Closes #10978	2023-02-17 17:58:26 +03:00
Tomasz Grabiec	2ae8f74cec	test: mutation_test: Fix sporadic failure due to continuity mismatch In test_v2_apply_monotonically_is_monotonic_on_alloc_failures we generate mutations with non-full continuity, so we should pass is_evictable::yes to apply_monotonically(). Otherwise, it will assume fully-continuous versions and not try to maintain continuity by inserting sentinels. This manifested in sporadic failures on continuity check. Fixes #12882	2023-02-17 14:43:32 +01:00
Tomasz Grabiec	22063713d7	test: mutation_test: Fix copy-paste mistake in trace-level logging	2023-02-17 14:42:47 +01:00
Botond Dénes	f62e62f151	Merge 'build: cmake: sync with `configure.py` (3/n)' from Kefu Chai * build: cmake: add test * build: cmake: expose the bridged rust library * build: cmake: correct library path * build: cmake: add missing source files * build: cmake: put generated sources into ${scylla_gen_build_dir} * build: cmake: silence -Wuninitialized warning * build: cmake: extract idl library out * build: cmake: ignore -Wparentheses-equality Closes #12893 * github.com:scylladb/scylladb: build: cmake: add unit tests build: cmake: extract sstables out build: cmake: extract auth and schema build: utils: extract utils out build: cmake: link Boost::regex with ICU::i18n build: cmake: add test build: cmake: expose the bridged rust library build: cmake: correct library path build: cmake: add missing source files build: cmake: put generated sources into ${scylla_gen_build_dir} build: cmake: silence -Wuninitialized warning build: cmake: extract idl library out build: cmake: ignore -Wparentheses-equality	2023-02-17 13:13:01 +02:00
Kefu Chai	05ecc3f1c9	build: cmake: add unit tests this change is based on Botond Dénes's change which gave an overhaul to the original CMake building system. this change is not enough to build tests with CMake, as we still need to sort out the dependencies. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:41:40 +08:00
Kefu Chai	f76a169025	build: cmake: extract sstables out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:41:40 +08:00
Kefu Chai	f3714f1706	build: cmake: extract auth and schema Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:41:40 +08:00
Kefu Chai	3e481c9d15	build: utils: extract utils out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:41:39 +08:00
Kefu Chai	4d7ae07e9e	build: cmake: link Boost::regex with ICU::i18n it turns out Boost::regex references ICU::i18n, but it fails to bring the linkage to its public interface. so let's do this on behalf of it. ``` : && /home/kefu/.local/bin/clang++ -Wall -Werror -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-missing-braces -Wno-overloaded-virtual -Wno-parentheses-equality -Wno-unsupported-friend -march=westmere -O0 -g -gz CMakeFiles/scylla.dir/absl-flat_hash_map.cc.o CMakeFiles/$ ld.lld: error: undefined symbol: icu_67::Collator::createInstance(icu_67::Locale const&, UErrorCode&) >>> referenced by icu.hpp:56 (/usr/include/boost/regex/icu.hpp:56) >>> CMakeFiles/scylla.dir/utils/like_matcher.cc.o:(boost::re_detail_107500::icu_regex_traits_implementation::icu_regex_traits_implementation(icu_67::Locale const&)) >>> referenced by icu.hpp:61 (/usr/include/boost/regex/icu.hpp:61) >>> CMakeFiles/scylla.dir/utils/like_matcher.cc.o:(boost::re_detail_107500::icu_regex_traits_implementation::icu_regex_traits_implementation(icu_67::Locale const&)) ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:39:44 +08:00
Kefu Chai	02de9f1833	build: cmake: add test Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:39:44 +08:00
Kefu Chai	f5750859f7	build: cmake: expose the bridged rust library so that scylla can be linked against it when it is linked with wasmtime_bindings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:39:44 +08:00
Kefu Chai	7569424d86	build: cmake: correct library path it encodes the profile in it. so, in this change, the used profile is added in the path. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:39:44 +08:00
Kefu Chai	affebc35be	build: cmake: add missing source files Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:39:43 +08:00
Kefu Chai	c0824c6c25	build: cmake: put generated sources into ${scylla_gen_build_dir} to be aligned with the convention of configure.py Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:38:44 +08:00
Kefu Chai	db8a2c15fa	build: cmake: silence -Wuninitialized warning Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:38:44 +08:00
Kefu Chai	7b431748a8	build: cmake: extract idl library out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:38:44 +08:00
Kefu Chai	d89602c6a2	build: cmake: ignore -Wparentheses-equality antlr3 generates code like `((foo == bar))`. but Clang does not like it. let's disable this warning. and explore other options later. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-17 18:38:44 +08:00
Avi Kivity	7fc7cbd3bf	build: nix: switch to non-static zstd When we added zstd (`f14e6e73bb`), we used the static library as we used some experimental APIs. However, now the dynamic library works, so apparently the experimental API is now standard. Switch to the dynamic library. It doesn't improve anything, but it aligns with how we do things. Closes #12902	2023-02-17 10:29:34 +02:00
Avi Kivity	ae3489382e	build: nix: update clang Clang 15 is now packaged by Nix, so use it. Closes #12901	2023-02-17 10:26:44 +02:00
Kefu Chai	50f68fe475	test/perf: do not brace integer with {} `int_range::make_singular()` accepts a single `int` as its parameter, so there is no need to brace the parameter with `{}`. this helps to silence the warning from Clang, like: ``` /home/kefu/dev/scylladb/test/perf/perf_fast_forward.cc:1396:63: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init] check_no_disk_reads(test(int_range::make_singular({100}))), ^~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12903	2023-02-17 10:24:24 +02:00
Botond Dénes	2b1f10a41c	Merge 'doc: add a KB about the new tombstones compaction process in ICS' from Anna Stuchlik Fixes https://github.com/scylladb/scylla-docs/issues/4140 This PR adds a new Knowledge Base article about improved garbage collection in ICS. It's based on the document created by @raphaelsc https://docs.google.com/document/d/1fA7uBcN9tgxeHwCbWftPJz071dlhucoOYO1-KJeOG8I/edit?usp=sharing. @raphaelsc Could you review it? I've made some improvements to the language and text organization, but I didn't add or remove any content, so it should be a quick review. @tzach requested a diagram, but we can add it later. It would be great to have this content published asap. Closes #12792 * github.com:scylladb/scylladb: doc: add the new KB to the list of topics doc: add a new KB article about timbstone garbage collection in ICS	2023-02-17 10:20:01 +02:00
Aleksandra Martyniuk	5d826f13e7	api: move get_and_update_ttl to task manager api Task ttl can be set with task manager test api, which is disabled in release mode. Move get_and_update_ttl from task manager test api to task manager api, so that it can be used in release mode. Closes #12894	2023-02-17 10:19:06 +02:00
Piotr Smaroń	d2bfe124ad	doc: fix command invoking tests The developer documentation from `building.md` suggested to run unit tests with `./tools/toolchain/dbuild test` command, however this command only invokes `test` bash tool, which immediately returns with status `1`: ``` [piotrs@new-host scylladb]$ ./tools/toolchain/dbuild test [piotrs@new-host scylladb]$ echo $? 1 ``` This was probably unintended mistake and what author really meant was invoking `dbuild ninja test`. Closes #12890	2023-02-17 10:16:33 +02:00
Botond Dénes	0961a3f79b	test/boost/reader_concurrency_semaphore_test: run oom protection tests in debug mode Said tests need to be run with a limited amount of memory to be really useful. When the memory amount is unexpected, they silently exit, which is exactly what they did in debug mode too, where the amount of memory available cannot be controlled. Disable the check in debug mode.	2023-02-17 00:46:56 -05:00
Botond Dénes	1a9fdebb49	treewide: adapt to throwing reader_concurrency_semaphore::consume() Said method can now throw `std::bad_alloc` since `aab5954`. All call-sites should have been adapted in the series introducing the throw, but some managed to slip through because the oom unit test didn't run in debug mode. In this commit the remaining unpatched call-sites are fixed.	2023-02-17 00:46:56 -05:00
Avi Kivity	e2f6e0b848	utils: move hashing related files to utils/ module Closes #12884	2023-02-17 07:19:52 +02:00
Kefu Chai	2f0cb9e68f	db/virtual_table: mark the dtor of base class `virtual` as `my_result_collector` has virtual function, and its dtor is not marked virtual, Clang complains. let's mark its base class virtual to be on the safe side. ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on non-final 'my_result_collector' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<my_result_collector>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/db/virtual_table.cc:134:25: note: in instantiation of member function 'std::unique_ptr<my_result_collector>::~unique_ptr' requested here auto consumer = std::make_unique<my_result_collector>(s, permit, &pr, std::move(reader_and_handle.second)); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12879	2023-02-17 07:11:18 +02:00
Botond Dénes	79bf347e04	Merge 'Remove sstables::test_setup in favor of sstables::test_env' from Pavel Emelyanov The former is a convenience wrapper over the latter. There's no real benefit in using it, but having two test_env-s is worse than just one. Closes #12794 * github.com:scylladb/scylladb: sstable_utils: Move the test_setup to perf/ sstable_utils: Remove unused wrappers over test_env sstable_test: Open-code do_with_cloned_tmp_directory sstable_test: Asynchronize statistics_rewrite case tests: Replace test_setup::do_with_tmp_directory with test_env::do_with(_async)?	2023-02-17 07:09:34 +02:00
Anna Stuchlik	bcca706ff5	doc: fixes https://github.com/scylladb/scylladb/issues/12754 , document the metric update in 5.2 Closes #12891	2023-02-16 19:05:48 +02:00
Nadav Har'El	02682aa40d	test/cql-pytest: add reproducer for ALLOW FILTERING bug This patch adds a reproducer for the bug described in issue #7964 - The restriction `where k=1 and c=2` (when k,c are the key columns) returns (at most) a single row so doesn't need ALLOW FILTERING, but if we add a third restriction, say `v=2`, this still processes at most a single row so doesn't need ALLOW FILTERING - and both Scylla and Cassandra get it wrong - so it's marked with both xfail and cassandra_bug. The patch also adds another test that for longer partition slices, e.g., `where k=1 and c>2`, although the slice itself doesn't need filtering, if we add `v=2` here we suddenly do need ALLOW FILTERING, because the slice itself may be a large number of rows, and adding `v=2` may restrict it to just a few results. This test passes on both Scylla and Cassandra. Issue #7964 mentioned these scenarios and even had some example code, but we never added it to the test suite, so we finally do it now. Refs #7964 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12850	2023-02-16 19:05:48 +02:00
Botond Dénes	dc3d47b1e4	Merge 'Get compaction history without using qctx' from Pavel Emelyanov There are two methods to mess with compaction history -- update and get. The former had been patched to use local system-keyspace instance by `907fd2d3` (system_keyspace: De-static compaction history update) now it's time for the latter (spoiler: it's only used by the API handler) Closes #12889 * github.com:scylladb/scylladb: system_keyspace; Make get_compaction_history non static and drop qctx api, compaction_manager: Get compaction history via manager system_keyspace: Move compaction_history_entry to namespace scope	2023-02-16 19:05:48 +02:00
Anna Stuchlik	826f67a298	doc: related https://github.com/scylladb/scylladb/issues/12658 , fix the service name in the upgrade guide from 2022.1 to 2022.2 Closes #12698	2023-02-16 19:05:48 +02:00
Botond Dénes	87f7ac920e	Merge 'Add task manager utils for tests' from Aleksandra Martyniuk Tests of each module that is integrated with task manager use calls to task manager api. Boilerplate to call, check status, and get result may be reduced using functions. task_manager_utils.py contains wrappers for task manager api calls and helpers that may be reused by different tests. Closes #12844 * github.com:scylladb/scylladb: test: use functions from task_manager_utils.py in test_task_manager.py test: add task_manager_utils.py	2023-02-16 19:05:48 +02:00
Kefu Chai	fcdea9f950	test/perf: mark output_writer::~output_writer() as virtual as an abstract base class `output_writer` is inherited by both `json_output_writer` and `text_output_writer`. and `output_manager` manages the lifecycles of used writers using `std::unique_ptr<output_writer>`. before this change, the dtor of `output_writer` is not marked as virtual, so when its dtor is invoked, what gets called is the base class's dtor. but the dtor of `json_output_writer` is non-trivial in the sense that this class is aggregated by a bunch of member variables. if we don't invoke its dtor when destroying this object, leakage is expected. so, in this change, the dtor of `output_writer` is marked as virtual, this makes all of its derived classes' dtor virtual. and the right dtor is always called. test/perf is only designed for testing, and not used in production, also, this feature was recently integrated into scylla executable in `228ccdc1c7`. so there is no need to backport this change. change should also silence the warning from Clang 17: ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on 'output_writer' that is abstract but has non-virtual destructor [-Werror,-Wdelete-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<output_writer>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/stl_construct.h:88:15: note: in instantiation of member function 'std::unique_ptr<output_writer>::~unique_ptr' requested here __location->~_Tp(); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12888	2023-02-16 19:05:48 +02:00
Nadav Har'El	27ea908c69	test/cql-pytest: regression test for old secondary-index bug This patch adds a cql-pytest test for an old secondary-index bug that was described three years ago in issue #5823. cql-pytest makes it easy to run the same test against different versions of Scylla, and it was used to check that the bug existed in Scylla 2.3.0 but was gone by 2.3.5, and also not present in master or in 2021.1. A bit about the bug itself: A secondary index is useful for equality restrictions (a=2) but can't be used for inequality restrictions (a>=2). In Scylla 2.3.0 we used to have a bug that because the restriction a>=2 couldn't be used through the index, it was ignored completely. This is of course a mistake. Refs #5823 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12856	2023-02-16 19:05:48 +02:00
Alejo Sanchez	16d92b7042	test/topology: pytest driver version use print... instead of log Use print instead of logging. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12846	2023-02-16 19:05:48 +02:00
Kefu Chai	9520acb1a1	logalloc: mark segment_store_backend's virtual before this change, `seastar_memory_segment_store_backend` is a class with a virtual method, but it does not have a virtual dtor. but we do use a unique_ptr<segment_store_backend> to manage the lifecycle of an instance of its derived class. to enable the compiler to call the right dtor, we should mark the base class's dtor as virtual. this should address the following warnings from Clang-17: ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on non-final 'logalloc::seastar_memory_segment_store_backend' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<logalloc::seastar_memory_segment_store_backend>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/utils/logalloc.cc:812:20: note: in instantiation of member function 'std::unique_ptr<logalloc::seastar_memory_segment_store_backend>::~unique_ptr' requested here : _backend(std::make_unique<seastar_memory_segment_store_backend>()) ^ ``` and ``` /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:100:2: error: delete called on 'logalloc::segment_store_backend' that is abstract but has non-virtual destructor [-Werror,-Wdelete-abstract-non-virtual-dtor] delete __ptr; ^ /home/kefu/.local/bin/../lib/gcc/x86_64-pc-linux-gnu/13.0.1/../../../../include/c++/13.0.1/bits/unique_ptr.h:405:4: note: in instantiation of member function 'std::default_delete<logalloc::segment_store_backend>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/utils/logalloc.cc:811:5: note: in instantiation of member function 
'std::unique_ptr<logalloc::segment_store_backend>::~unique_ptr' requested here contiguous_memory_segment_store() ^ ``` Fixes #12872 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12873	2023-02-16 19:05:48 +02:00
Avi Kivity	abe157a873	Drop intrusive_set_external_comparator Since `5c0f9a8180` ("mutation_partition: Switch cache of rows onto B-tree") it's no longer in use, except in some performance test, so remove it. Although scylla-gdb.py is sometimes used with older releases, it's so outdated we can remove it from there too. Closes #12868	2023-02-16 19:05:48 +02:00
Kefu Chai	6eab8720c4	tools/schema_loader: do not return ref to a local variable we should never return a reference to local variable. so in this change, a reference to a static variable is returned instead. this should address following warning from Clang 17: ``` /home/kefu/dev/scylladb/tools/schema_loader.cc:146:16: error: returning reference to local temporary object [-Werror,-Wreturn-stack-address] return {}; ^~ ``` Fixes #12875 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12876	2023-02-16 12:15:14 +02:00
Pavel Emelyanov	e234726123	system_keyspace; Make get_compaction_history non static and drop qctx Now the call is done via the system_keyspace instance, so it can be unmarked static and can use the local query processor instead of global qctx. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-16 11:28:04 +03:00
Pavel Emelyanov	52f69643b6	api, compaction_manager: Get compaction history via manager Right now the API handler directly calls static method from system keyspace. Patching it to call compaction manager instead will let the latter use on-board plugged system keyspace for that. If the system keyspace is not plugged, it means early boot or late shutdown, not a good time to get compaction history anyway. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-16 11:27:38 +03:00
Pavel Emelyanov	d0e47ace16	system_keyspace: Move compaction_history_entry to namespace scope It's now a sub-class and it makes forward-declaration in another unit impossible Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-16 11:24:23 +03:00
Takuya ASADA	bf27fdeaa2	scylla_coredump_setup: fix coredump timeout settings We currently configure only TimeoutStartSec, but that is probably not enough to prevent a coredump timeout, since TimeoutStartSec is the maximum waiting time for service startup, and there is another directive to specify the maximum service running time (RuntimeMaxSec). To fix the problem, we should specify RuntimeMaxSec and TimeoutSec (it configures both TimeoutStartSec and TimeoutStopSec). Fixes #5430 Closes #12757	2023-02-16 10:23:20 +02:00
Botond Dénes	e9258018d9	Merge 'date: cleanups to silence warnings from clang' from Kefu Chai - date: drop implicitly generated ctor - date: use std::in_range() to check for invalid year Closes #12878 * github.com:scylladb/scylladb: date: use std::in_range() to check for invalid year date: drop implicitly generated ctor	2023-02-16 10:15:36 +02:00
Botond Dénes	ef50170120	Merge 'build: cmake: sync with configure (2/n)' from Kefu Chai * build: cmake: extract idl out * build: cmake: link cql3 against xxHash * build: cmake: correct the check in Findlibdeflate.cmake * build: cmake find_package(libdeflate) earlier * build: cmake: set more properties to alternator library * build: cmake: include generate_cql_grammar * build: cmake: find xxHash package * build: cmake: add build mode support Closes #12866 * github.com:scylladb/scylladb: build: cmake: correct generate_cql_grammar build: cmake: extract idl out build: cmake: link cql3 against xxHash build: cmake: correct the check in Findlibdeflate.cmake build: cmake: find_package(libdeflate) earlier build: cmake: set more properties to alternator library build: cmake: include generate_cql_grammar build: cmake: find xxHash package build: cmake: add build mode support	2023-02-16 07:11:26 +02:00
Pavel Emelyanov	737f4acc10	features: Enable persisted features on all shards Commit `1365e2f13e` (gms: feature_service: re-enable features on node startup) re-enabled features on feature service very early, so that on boot a node sees its "correct" features state before it starts loading system tables and replaying commitlog. However, checking features happens on all shards independently, so re-enabling should also happen on all shards. One problem we faced is in extract_scylla_specific_keyspace_info(). This helper is used when loading a non-system keyspace to read scylla-specific keyspace options. The helper is called on all shards and on all-but-zero it evaluates the checked SCYLLA_KEYSPACES feature to false, leaving the specific data empty. As a result, different shards have different views of keyspaces' configuration. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12881	2023-02-16 00:52:05 +01:00
Kefu Chai	45f0449ccf	sstables: mx/writer: remove defaulted move ctor because its base class of `writer_impl` has a member variable `_validator`, which has its copy ctor deleted. let's just drop the defaulted move ctor, as compiler is not able to generate one for us. ``` /home/kefu/dev/scylladb/sstables/mx/writer.cc:805:5: error: explicitly defaulted move constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted] writer(writer&& o) = default; ^ /home/kefu/dev/scylladb/sstables/mx/writer.cc:528:16: note: move constructor of 'writer' is implicitly deleted because base class 'sstable_writer::writer_impl' has a deleted move constructor class writer : public sstable_writer::writer_impl { ^ /home/kefu/dev/scylladb/sstables/writer_impl.hh:29:48: note: copy constructor of 'writer_impl' is implicitly deleted because field '_validator' has a deleted copy constructor mutation_fragment_stream_validating_filter _validator; ^ /home/kefu/dev/scylladb/mutation/mutation_fragment_stream_validator.hh:188:5: note: 'mutation_fragment_stream_validating_filter' has been explicitly marked deleted here mutation_fragment_stream_validating_filter(const mutation_fragment_stream_validating_filter&) = delete; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12877	2023-02-15 23:06:10 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Avi Kivity	ac2a69aab4	Merge 'Move population code into table_population_metadata' from Pavel Emelyanov There's the distributed_loader::populate_column_family() helper that manages sstables on their way towards table on boot. The method naturally belongs to the table_population_metadata -- a helper class that in fact prepares the ground for the method in question. This PR moves the method into metadata class and removes a whole lot of extra alias-references and private-fields exporting methods from it. Also it keeps start_subdir and populate_c._f. logic close to each other and relaxes several excessive checks from them. Closes #12847 * github.com:scylladb/scylladb: distributed_loader: Rename table_population_metadata distributed_loader: Dont check for directory presense twice distributed_loader: Move populate calls into metadata.start() distributed_loader: Remove local aliases and exporters distributed_loader: Move populate_column_family() into population meta	2023-02-15 22:55:48 +02:00
Yaron Kaikov	cbc005c6f5	Revert "dist/debian: drop unused Makefile variable" This reverts commit `d2e3a60428`, since it's causing a regression preventing the Scylla service from starting on deb OS Fixes: #12738 Closes #12857	2023-02-15 22:29:24 +02:00
Pavel Emelyanov	0c7efe38e1	distributed_loader: Rename table_population_metadata It used to be just metadata by providing the meta for population, now it does the population by itself, so rename it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 20:15:04 +03:00
Pavel Emelyanov	15926b22f4	distributed_loader: Dont check for directory presense twice Both start_subdir() and populate_subdir() check for the directory to exist with explicit file_exists() check. That's excessive, if the directory wasn't there in the former call, the latter can get this by checking the _sstable_directories map. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 20:15:04 +03:00
Pavel Emelyanov	eb477a13ad	distributed_loader: Move populate calls into metadata.start() This makes the metadata class export even shorter API, keeps the three sub-directories scanned in one place and allows removing the zero-shard assertion. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 20:15:04 +03:00
Nadav Har'El	ba18c318b9	Merge 'cql3: eliminate column_condition, streamline condition representation' from Avi Kivity column_condition is an LWT-specific boolean expression construct, but recent work allowed it to be re-expressed in terms of generic expressions. This series completes the work and eliminates the column_condition classes and source file. Furthermore, a statement's IF clause is represented as a single expression, rather than a vector of per-column conditions. Closes #12597 * github.com:scylladb/scylladb: cql3: modification_statement: unwrap unnecessary boolean_factors() call cql3: modification_statement: use single expression for conditions cql3: modification_statment: fix lwt null equality rules mangling cql3: broadcast tables: tighten checks on conditions cql3: grammar: communicate LWT IF conditions to AST as a simple expression cql3: column_condition: fold into modification_statement cql3: column_condition: inline column_condition_applies_to into its only caller cql3: column_condition: inline column_condition_collect_marker_specification into its only caller cql3: column_condition: eliminate column_condition class cql3: column_condition: move expression massaging to prepare() cql3: grammar: make columnCondition production return an expression cql3: grammar: eliminate duplication in LWT IF clause "IN (...)" vs "IN ?" cql3: grammar: remove duplication around columnCondition scalar/collection variants cql3: grammar: extract column references into a new production cql3: column_condition: eliminate column_condition::raw	2023-02-15 19:02:56 +02:00
Pavel Emelyanov	123a82adb2	distributed_loader: Remove local aliases and exporters After the previous patch all local alias references in populate_column_family() are no longer required. Neither are the exporting calls from the table_population_metadata class. Some non-obvious change is capturing 'this' instead of 'global_table' on calls that are cross-shard. That's OK, table_population_metadata is not sharded<> and is designed for cross-shard usage too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 19:57:41 +03:00
Pavel Emelyanov	16fca3fa8a	distributed_loader: Move populate_column_family() into population meta This ownership change also requires the auto& = *this alias and extra specification where to call reshard() and reshape() from. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-15 19:57:41 +03:00
Kefu Chai	76355c056f	build: cmake: correct generate_cql_grammar should have escaped `&` with `\`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:40 +08:00
Kefu Chai	2718963a2a	build: cmake: extract idl out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:40 +08:00
Kefu Chai	9416af8b80	build: cmake: link cql3 against xxHash turns out cql3 also indirectly uses the header file(s) which in turn includes xxhash header. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:40 +08:00
Kefu Chai	d6746fc49c	build: cmake: correct the check in Findlibdeflate.cmake otherwise libdeflate is never found. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:40 +08:00
Kefu Chai	1ac5932440	build: cmake: find_package(libdeflate) earlier so it can be linked by scylla Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:37 +08:00
Kefu Chai	bd1ea104fe	build: cmake: set more properties to alternator library alternator headers are exposed to the target which links against it, so let's expose them using the `target_include_directories()`. also, `alternator` uses Seastar library and uses xxHash indirectly. we should fix the latter by exposing the included header instead, but for now, let's just link alternator directly to xxHash. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:37 +08:00
Kefu Chai	a0f3c9ebf9	build: cmake: include generate_cql_grammar we should include "generate_cql_grammar.cmake" for using `generate_cql_grammar()` function. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:37 +08:00
Kefu Chai	b6a8341eef	build: cmake: find xxHash package we use private API in xxHash, it'd be handy to expose it in the form of a library target. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:37 +08:00
Kefu Chai	b234c839e4	build: cmake: add build mode support Scylla uses different build modes to customize the build for different purposes. in this change, instead of having them in a python dictionary, the customized settings are located in their own files, and loaded on demand. we don't support multi-config generator yet. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-16 00:07:37 +08:00
Kefu Chai	55b46ab1a3	date: use std::in_range() to check for invalid year for better readability, and to silence following warning from Clang 17: ``` /home/kefu/dev/scylladb/utils/date.h:5965:25: error: result of comparison of constant 9223372036854775807 with expression of type 'int' is always true [-Werror,-Wtautological-constant-out-of-range-compare] Y <= static_cast<int64_t>(year::max()))) ~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/kefu/dev/scylladb/utils/date.h:5964:57: error: result of comparison of constant -9223372036854775808 with expression of type 'int' is always true [-Werror,-Wtautological-constant-out-of-range-compare] if (!(static_cast<int64_t>(year::min()) <= Y && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:56:49 +08:00
Kefu Chai	90981ebb50	date: drop implicitly generated ctor as one of its member variables does not have a default constructor. this silences the following warning from Clang-17: ``` /home/kefu/dev/scylladb/utils/date.h:708:5: error: explicitly defaulted default constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted] year_month_weekday() = default; ^ /home/kefu/dev/scylladb/utils/date.h:705:27: note: default constructor of 'year_month_weekday' is implicitly deleted because field 'wdi_' has no default constructor date::weekday_indexed wdi_; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:56:49 +08:00
Gleb Natapov	9bdef9158e	raft: abort applier fiber when a state machine aborts After `5badf20c7a` applier fiber does not stop after it gets abort error from a state machine which may trigger an assertion because previous batch is not applied. Fix it. Fixes #12863	2023-02-15 15:54:19 +02:00
Gleb Natapov	dfcd56736b	raft: fix race in add_entry_on_leader that may cause incorrect log length accounting In add_entry_on_leader, after wait_for_memory_permit() resolves but before the fiber continues to run, the node may stop being the leader and then become the leader again, which makes the currently held units outdated. Detect this case by checking the term after the preemption.	2023-02-15 15:51:59 +02:00
Petr Gusev	b37eee26e1	test_shed_too_large_request: clarify the comments	2023-02-15 17:18:17 +04:00
Petr Gusev	4328f52242	test_shed_too_large_request: use smaller test string There was a vague comment about CI using larger limits for shedding. This turned out to be false, and the real reason for the different limits is that Scylla handles the -m command line option differently in debug and release builds. Debug builds use the default memory allocator and the value of the -m Scylla option is given to each shard. In release builds memory is evenly distributed between shards. To account for this we read the current memory limit from Scylla metrics. The helper class ScyllaMetrics was introduced to handle metrics parsing logic. It can potentially be reused for dealing with metrics in other tests.	2023-02-15 17:18:10 +04:00
Avi Kivity	9454844751	cql3: modification_statement: unwrap unnecessary boolean_factors() call for_each_expression() will recurse anyway.	2023-02-15 14:21:26 +02:00
Avi Kivity	1d0854c0bc	cql3: modification_statement: use single expression for conditions Currently, we use two vectors for static and regular column conditions, each element referring to a single column. There's a comment that keeping them separate makes things simpler, but in fact we always treat both equally (except in one case where we look at just the regular columns and check that no static column conditions exist). Simplify by storing just a single expression, which can be a conjunction of multiple column conditions. add_condition() is renamed to analyze_condition(), since it no longer adds to the vectors.	2023-02-15 14:21:26 +02:00
Avi Kivity	5cb7655a9f	cql3: modification_statment: fix lwt null equality rules mangling search_and_replace() needs to return std::nullopt when it doesn't match, or it doesn't recurse properly. Currently it doesn't break anything because we only call the function on a binary_operator, but soon it will.	2023-02-15 14:21:26 +02:00
Avi Kivity	c50c9c86b3	cql3: broadcast tables: tighten checks on conditions We don't support checks on static columns in broadcast tables, so explicitly reject them.	2023-02-15 14:21:26 +02:00
Avi Kivity	4d125bffdf	cql3: grammar: communicate LWT IF conditions to AST as a simple expression Instead of passing a vector of boolean factors, pass a single expression (a conjunction). This prepares the way for more complex expressions, but no grammar changes are made here. The expression is stored as optional, since we'll need a way to indicate whether an IF clause was supplied or not. We could play games with boolean_factors(), but it becomes too tricky. The expressions are broken down back to boolean factors during prepare. We'll later consolidate them too.	2023-02-15 14:21:26 +02:00
Avi Kivity	23bd7d24df	cql3: column_condition: fold into modification_statement Move column_condition_prepare() and its helper function into modification_statement, its only caller. The column_condition.{cc,hh} now become empty, so remove them. This eliminates the column_condition concept, which was just a custom expression, in favor of generic expressions. It still has custom properties due to LWT specialness, but those custom properties are isolated in column_condition_prepare().	2023-02-15 14:21:24 +02:00
Avi Kivity	12be5d4208	cql3: column_condition: inline column_condition_applies_to into its only caller This two-liner can be trivially inlined with no loss of meaning. Indeed it's less confusing, because "applies_to" became less meaningful once we integrated the column_value component into the expression.	2023-02-15 14:19:55 +02:00
Avi Kivity	82fb838a70	cql3: column_condition: inline column_condition_collect_marker_specification into its only caller This one-liner can be trivially inlined with no loss of meaning.	2023-02-15 14:19:55 +02:00
Avi Kivity	e7b9d9dab9	cql3: column_condition: eliminate column_condition class It's become a wrapper around expression, so peel it off. The methods are converted to free functions, with the intent to later inline them into their callers, as they are also mostly just wrappers.	2023-02-15 14:19:55 +02:00
Avi Kivity	4e93cf9ae9	cql3: column_condition: move expression massaging to prepare() Move logic out of the column_condition constructor so it becomes a trivial wrapper, ripe for elimination.	2023-02-15 14:19:55 +02:00
Avi Kivity	31e37ff559	cql3: grammar: make columnCondition production return an expression Instead of appending to a vector, just return an expression. This makes the production self-sufficient and more natural to use.	2023-02-15 14:19:55 +02:00
Avi Kivity	d8d4d0bd72	cql3: grammar: eliminate duplication in LWT IF clause "IN (...)" vs "IN ?" The IN operator recognition is duplicated; de-duplicate it by introducing the (somewhat artificial) singleColumnInValuesOrMarkerExpr production.	2023-02-15 14:19:55 +02:00
Avi Kivity	c47cf9858b	cql3: grammar: remove duplication around columnCondition scalar/collection variants columnCondition duplicates the grammar for scalar relations and subscripted collection relations. Eliminate the duplication by introducing a subscriptExpr production, which encapsulates the differences.	2023-02-15 14:19:55 +02:00
Avi Kivity	74da77f442	cql3: grammar: extract column references into a new production Eliminate repetition by creating a new columnRefExpr and referring to it. Only LWT IF is updated so far. No grammar changes.	2023-02-15 14:19:55 +02:00
Avi Kivity	4d7d3c78a2	cql3: column_condition: eliminate column_condition::raw It's now a thin wrapper around an expression, so peel the wrapper and keep just the expression. A boolean expression is, after all, a condition, and we'll make the condition statement-wide soon rather than apply just to a column.	2023-02-15 14:19:55 +02:00
guy9	4dd14af7d5	Adding ScyllaDB University LIVE Q1 2023 to Docs top banner Closes #12860	2023-02-15 13:15:30 +02:00
Nadav Har'El	2d6c53c047	test/cql-pytest: reproduce bug in static-column index lookup This patch adds a reproducer to a static-column index lookup bug described in issue #12829: The restriction `where pk=0 and s=1 and c=3` where pk,c are the primary key and s is an indexed static column, results in an internal error: "clustering column id 2 >= 2". Unfortunately, because on_internal_error() crashes Scylla in debug mode, we need to mark this failing test with skip instead of xfail. Refs #12829 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12852	2023-02-15 12:23:36 +02:00
Benny Halevy	bb36237cf4	topology: optimize compare_endpoints This function is called on the fast data path from storage_proxy when sorting multiple endpoints by proximity. This change calculates numeric node diff metrics based on each address proximity to a given node (by <dc, rack, same node>) to eliminate logic branches in the function and reduce its footprint. based on objdump -d output, compare_endpoints footprint was reduced by 58.5% (3632 / 8752 bytes) with clang version 15.0.7 (Fedora 15.0.7-1.fc37) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:48:24 +02:00
Benny Halevy	3ac2df9480	to_string: add print operators for std::{weak,partial}_ordering Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:09:04 +02:00
Benny Halevy	bd6f88c193	utils: to_sstring: deinline std::strong_ordering print operator Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:09:04 +02:00
Benny Halevy	25ebc63b82	move to_string.hh to utils/ Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:09:04 +02:00
Benny Halevy	e7af35a64d	test: network_topology: add test_topology_compare_endpoints Add a regression unit test for topology::compare_endpoints before it's optimized in the following patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-02-15 11:09:02 +02:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Petr Gusev	1f850374fa	test_shed_too_large_request fix: disable compression The test relies on exact request size, this doesn't work if compression is applied. The driver enables compression only if both the server and the client agree on the codec to use. If compression package (e.g. lz4) is not installed, the compression is not used. The trick with locally_supported_compressions is needed since I couldn't find any standard means to disable compression other than the compression flag on the cluster object, which seemed too broad. Fixes #12836	2023-02-15 11:55:49 +04:00
Nadav Har'El	c0114d8b02	test/cql-pytest: test another case of ALLOW FILTERING In issue #12828 it was noted that Scylla requires ALLOW FILTERING for `where b=1 and c=1` where b is an indexed static column and c is a clustering key, and it was suggested that this is a bug. This patch adds a test that confirms that both Scylla and Cassandra require ALLOW FILTERING in this case. We explain in a comment that this requirement is expected (i.e., it's not a bug), as the `b=1` may match a huge number of rows, and the `c=1` may further match just a few of those - i.e., it is filtering. This test is virtually identical to the test we already had for `where a=1 and c=1` - when `a` is an indexed regular column. There too, the ALLOW FILTERING is required. Closes #12828 as "not a bug". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12848	2023-02-15 08:43:19 +02:00
Raphael S. Carvalho	ba022f7218	replica: compaction_group: Use sstable_set::size() More efficient than retrieving size from sstable_set::all() which may involve copy of elements. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12835	2023-02-15 06:53:04 +02:00
Avi Kivity	19edaa9b78	Merge 'build: cmake: sync with configure.py' from Kefu Chai this is the first step to re-enable building scylla with cmake, so we can experiment with C++20 modules and other changes before porting them to `configure.py`. please note, this changeset alone does not address all issues yet. as this is a low priority project, i want to do this in smaller (or tiny!) steps. * build: cmake: s/Abseil/absl/ * build: cmake: sync with source files compiled in configure.py * build: cmake: do not generate crc_combine_table at build time * build: cmake: use packaged libdeflate Closes #12838 * github.com:scylladb/scylladb: build: cmake: add rust binding build: cmake: extract cql3 and alternator out build: cmake: use packaged libdeflate build: cmake: do not generate crc_combine_table at build time build: cmake: sync with source files compiled in configure.py build: cmake: s/Abseil/absl/	2023-02-14 22:37:10 +02:00
Avi Kivity	df497a5a94	Merge 'treewide: remove implicitly deleted copy ctor and assignment operator' from Kefu Chai clang 17 trunk helped to identify these issues. so let's fix them. Closes #12842 * github.com:scylladb/scylladb: row_cache: drop defaulted move assignment operator utils/histogram: drop defaulted copy ctor and assignment operator range_tombstone_list: remove defaulted move assignment operator query-result: remove implicitly deleted copy ctor	2023-02-14 20:24:26 +02:00
Kefu Chai	95f8b4eab1	build: cmake: add rust binding Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 23:54:20 +08:00
Kefu Chai	f8671188c7	build: cmake: extract cql3 and alternator out Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 23:54:20 +08:00
Aleksandra Martyniuk	7b5e653fc9	test: use functions from task_manager_utils.py in test_task_manager.py	2023-02-14 13:34:11 +01:00
Aleksandra Martyniuk	02931163ef	test: add task_manager_utils.py The task manager API will be used in many tests. To make that easier, API calls to the task manager are wrapped into functions in task_manager_utils.py. Some helpers that may be reused in other tests are moved there too.	2023-02-14 13:34:04 +01:00
Kefu Chai	9ea8a46dd6	build: cmake: use packaged libdeflate this mirrors the change in `b8b78959fb` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 19:25:02 +08:00
Kefu Chai	89542232c9	row_cache: drop defaulted move assignment operator as it has a reference type member variable. and Clang 17 warns at seeing this ``` /home/kefu/dev/scylladb/row_cache.hh:359:16: warning: explicitly defaulted move assignment operator is implicitly deleted [-Wdefaulted-function-deleted] row_cache& operator=(row_cache&&) = default; ^ /home/kefu/dev/scylladb/row_cache.hh:214:20: note: move assignment operator of 'row_cache' is implicitly deleted because field '_tracker' is of reference type 'cache_tracker &' cache_tracker& _tracker; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 19:22:19 +08:00
Kefu Chai	68327123ac	utils/histogram: drop defaulted copy ctor and assignment operator as one of the (indirected) member variables has a user-declared move ctor, this prevents the compiler from generating the default copy ctor or assignment operator for the classes containing `timer`. ``` /home/kefu/dev/scylladb/utils/histogram.hh:440:5: warning: explicitly defaulted copy constructor is implicitly deleted [-Wdefaulted-function-deleted] timed_rate_moving_average_and_histogram(const timed_rate_moving_average_and_histogram&) = default; ^ /home/kefu/dev/scylladb/utils/histogram.hh:437:31: note: copy constructor of 'timed_rate_moving_average_and_histogram' is implicitly deleted because field 'met' has a deleted copy constructor timed_rate_moving_average met; ^ /home/kefu/dev/scylladb/utils/histogram.hh:298:17: note: copy constructor of 'timed_rate_moving_average' is implicitly deleted because field '_timer' has a deleted copy constructor meter_timer _timer; ^ /home/kefu/dev/scylladb/utils/histogram.hh:212:13: note: copy constructor of 'meter_timer' is implicitly deleted because field '_timer' has a deleted copy constructor timer<> _timer; ^ /home/kefu/dev/scylladb/seastar/include/seastar/core/timer.hh:111:5: note: copy constructor is implicitly deleted because 'timer<>' has a user-declared move constructor timer(timer&& t) noexcept : _sg(t._sg), _callback(std::move(t._callback)), _expiry(std::move(t._expiry)), _period(std::move(t._period)), ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 19:22:19 +08:00
Kefu Chai	b13caeedda	range_tombstone_list: remove defaulted move assignment operator as `range_tombstone_list::reverter` has a member variable of `const schema& _s`, which cannot be mutated, so it is not allowed to have an assignment operator. this change should address the warning from Clang 17: ``` /home/kefu/dev/scylladb/range_tombstone_list.hh:122:19: warning: explicitly defaulted move assignment operator is implicitly deleted [-Wdefaulted-function-deleted] reverter& operator=(reverter&&) = default; ^ /home/kefu/dev/scylladb/range_tombstone_list.hh:111:23: note: move assignment operator of 'reverter' is implicitly deleted because field '_s' is of reference type 'const schema &' const schema& _s; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 19:22:19 +08:00
Kefu Chai	f36fdff622	query-result: remove implicitly deleted copy ctor as one of the (indirect) member variables of `query::result` is not copyable, the compiler refuses to create a copy ctor or an assignment operator for us, and Clang 17 warns at seeing this. so let's just drop them for better readability and more importantly to preserve the correctness. ``` /home/kefu/dev/scylladb/query-result.hh:385:5: warning: explicitly defaulted copy constructor is implicitly deleted [-Wdefaulted-function-deleted] result(const result&) = default; ^ /home/kefu/dev/scylladb/query-result.hh:321:34: note: copy constructor of 'result' is implicitly deleted because field '_memory_tracker' has a deleted copy constructor query::result_memory_tracker _memory_tracker; ^ /home/kefu/dev/scylladb/query-result.hh:97:23: note: copy constructor of 'result_memory_tracker' is implicitly deleted because field '_units' has a deleted copy constructor semaphore_units<> _units; ^ /home/kefu/dev/scylladb/seastar/include/seastar/core/semaphore.hh:500:5: note: 'semaphore_units' has been explicitly marked deleted here semaphore_units(const semaphore_units&) = delete; ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 19:22:19 +08:00
Avi Kivity	4f5a460db9	Update seastar submodule * seastar 943c09f869...9b6e181e42 (34): > semaphore: disallow move after used > Revert "semaphore: assert no outstanding units when moved" > reactor, tests: drop unused include > spawn_test: prolong termination time to be more tolerant. > net: s/offload_info()/get_offload_info()/ > Merge 'Extend http client with keep-alive connections' from Pavel Emelyanov > util/gcc6-concepts.hh: drop gcc6-concepts.hh > treewide: do not inline tls variables in shared library > reactor: Remove --num-io-queues option > build: correct the comment > smp: do not inline function when BUILD_SHARED_LIBS > iostream: always flush _fd in do_flush > thread_pool: prevent missed wakeup when the reactor goes to sleep in parallel with a syscall completion > Merge 'build: do not always build seastar as a static library' from Kefu Chai > Revert "Merge 'Keep outgoing queue all cancellable while negotiating' from Pavel Emelyanov" > Merge 'Keep outgoing queue all cancellable while negotiating' from Pavel Emelyanov > memcached: prolong expiration time to be more tolerant > treewide: add non-seastar "#include"s > Merge 'Allow multiple abort requests' from Aleksandra Martyniuk > app-template: remove duplicated includes > include/seastar: s/SEASTAR_NODISCARD/[[nodiscard]]/ > prometheus: Don't report labels that starts with __ > memory: do not define variable only for assert > reactor: set_shard_field_width() after resource::allocate() > Merge 'reactor, core/resource: clean ups' from Kefu Chai > util/concepts: include <concepts> > build: use target_link_options() to pass options to linker > iostream: add doxygen comment for eof() > Merge 'util/print_safe, reactor: use concept for type constraints and refactory ' from Kefu Chai > Right align the memory diagnostics > Merge 'Add an API for the metrics layer to manipulate metrics dynamically.' from Amnon Heiman > semaphore: assert no outstanding units when moved > build: do not populate package registry by default > build: stop detecting concepts support Closes #12827	2023-02-14 13:04:17 +02:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Kefu Chai	e2a20a108f	tools: toolchain: dbuild: reindent a "case" block to replace tabs with spaces, for better readability if the editor fails to render tabs with the right tabstop setting. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12839	2023-02-14 10:37:25 +02:00
Raphael S. Carvalho	d6fe99abc4	replica: table: Update stats for newly added SSTables Patch `55a8421e3d` fixed an inefficiency when rebuilding statistics with many compaction groups, but it incorrectly removed the update for newly added SSTables. This patch restores it. When a new SSTable is added to any of the groups, the stats are incrementally updated (as before). On compaction completion, statistics are still rebuilt by simply iterating through each group, which keeps track of its own stats. Unit tests are added to guarantee the stats are correct both after compaction completion and memtable flush. Fixes #12808. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12834	2023-02-14 10:28:53 +02:00
Wojciech Mitros	cab5b08948	git: remove Cargo.lock from .gitignore When rust wasmtime bindings were added, we committed Cargo.lock to make sure a given version of Scylla always builds using the same versions of rust dependencies. Therefore, it should not be present in .gitignore. Closes #12831	2023-02-14 08:51:53 +02:00
Wojciech Mitros	8b756cb73f	rust: update dependencies Wasmtime added some improvements in recent releases - particularly, two security issues were patched in version 2.0.2. There were no breaking changes for our use other than the strategy of returning Traps - all of them are now anyhow::Errors instead, but we can still downcast to them, and read the corresponding error message. The cxx, anyhow and futures dependency versions now match the versions saved in the Cargo.lock. Closes #12830	2023-02-14 08:51:20 +02:00
Nadav Har'El	14cdd034ee	test/alternator: fix flaky test for partition-tombstone scan The test test_scan.py::test_scan_long_partition_tombstone_string checks that a full-table Scan operation ends a page in the middle of a very long string of partition tombstones, and does NOT scan the entire table in one page (if we did that, getting a single page could take an unbounded amount of time). The test is currently flaky, having failed in CI runs three times in the past two months. The reason for the flakiness is that we don't know exactly how long we need to make the sequence of partition tombstones in the test before we can be absolutely sure a single page will not read this entire sequence. For single-partition scans we have the "query_tombstone_page_limit" configuration parameter, which tells us exactly how long we need to make the sequence of row tombstones. But for a full-table scan of partition tombstones, the situation is more complicated - because the scan is done in parallel on several vnodes and each of them needs to read query_tombstone_page_limit before it stops. In my experiments, using query_tombstone_page_limit * 4 consecutive tombstones was always enough - I ran this test hundreds of times and it didn't fail once. But since it did fail on Jenkins very rarely (3 times in the last two months), maybe the multiplier 4 isn't enough. So this patch doubles it to 8. Hopefully this would be enough for anyone (TM). This makes this test even bigger and slower than it was. To make it faster, I changed this test's write isolation mode from the default always_use_lwt to forbid_rmw (not use LWT). This leaves the test's total run time similar to what it was before this patch - around 0.5 seconds in dev build mode on my laptop. Fixes #12817 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12819	2023-02-14 08:09:44 +02:00
Kefu Chai	cec2e2f993	build: cmake: do not generate crc_combine_table at build time mirrors the change in `70217b5109` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 11:42:08 +08:00
Kefu Chai	a8fca52398	build: cmake: sync with source files compiled in configure.py these source files are out of sync with the source files listed in `configure.py`. some of them were removed, some of them were added. let's try to keep them in sync. this paves the road to a working CMakeLists.txt Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 11:42:04 +08:00
Kefu Chai	50ff27514c	build: cmake: s/Abseil/absl/ find abseil library with the name of absl, instead of "Abseil". absl's cmake config file is provided with the name of `abslConfig.cmake`, not `AbseilConfig.cmake`. see also `cde2f0eaae/CMakeLists.txt (L198)` . Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-14 11:41:59 +08:00
Nadav Har'El	310638e84d	Merge 'wasm: deserialize counters as integers' from Wojciech Mitros Currently, because serialize_visitor::operator() is not implemented for counters, we cannot convert a counter returned by a WASM UDF to bytes when returning from wasm::run_script(). We could disallow using counters as WASM UDF return types, but an easier solution which we're already using in Lua UDFs is treating the returned counters as 64-bit integers when deserializing. This patch implements the latter approach and adds a test for it. Closes #12806 * github.com:scylladb/scylladb: wasm udf: deserialize counters as integers test_wasm.py: add utility function for reading WASM UDF saved in files	2023-02-13 19:24:11 +02:00
Nadav Har'El	6a45881d22	Merge 'functions: handle replacing UDFs used in UDAs' from Wojciech Mitros This patch is based on #12681, only last 3 commits are relevant. As described in #12709, currently, when a UDF used in a UDA is replaced, the UDA is not updated until the whole node is restarted. This patch fixes the issue by updating all affected UDAs when a UDF is replaced. Additionally, it includes a few convenience changes Closes #12710 * github.com:scylladb/scylladb: uda: change the UDF used in a UDA if it's replaced functions: add helper same_signature method uda: return aggregate functions as shared pointers	2023-02-13 16:30:24 +02:00
Benny Halevy	b2d3c1fcc2	abstract_replication_strategy: add for_each_natural_endpoint_until Currently, effective_replication_map::do_get_ranges accepts a functor that traverses the natural endpoints of each token to decide whether a token range should be returned or not. This is done by copying the natural endpoints vector for each token. However, other than special strategies like everywhere and local, the functor can be called on the precalculated inet_address_vector_replica_set in the replication_map and there's no need to copy it for each call. for_each_natural_endpoint_until passes a reference to the function down to the abstract replication strategy to let it work either on the precalculated inet_address_vector_replica_set or on an ad-hoc vector prepared by the replication strategy. The function returns stop_iteration::yes when a match or mismatch is found, or stop_iteration::no while it has no definite result. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12737	2023-02-13 16:30:24 +02:00
Nadav Har'El	efed973dd3	Merge 'cql3: convert LWT IF clause to expressions' from Avi Kivity LWT `IF` (column_condition) duplicates the expression prepare and evaluation code. Annoyingly, LWT IF semantics are a little different than the rest of CQL: a NULL equals NULL, whereas usually NULL = NULL evaluates to NULL. This series converts `IF` prepare and evaluate to use the standard expression code. We employ expression rewriting to adjust for the slightly different semantics. In a few places, we adjust LWT semantics to harmonize them with the rest of CQL. These are pointed out in their own separate patches so the changes don't get lost in the flood. Closes #12356 * github.com:scylladb/scylladb: cql3: lwt: move IF clause expression construction to grammar cql3: column_condition: evaluate column_condition as a single expression cql3: lwt: allow negative list indexes in IF clause cql3: lwt: do not short-circuit col[NULL] in IF clause cql3: column_condition: convert _column to an expression cql3: expr: generalize evaluation of subscript expressions cql3: expr: introduce adjust_for_collection_as_maps() cql3: update_parameters: use evaluation_inputs compatible row prefetch cql3: expr: protect extract_column_value() from partial clustering keys cql3: expr: extract extract_column_value() from evaluation machinery cql3: selection: introduce selection_from_partition_slice cql3: expr: move check for ordering on duration types from restrictions to prepare cql3: expr: remove restrictions oper_is_slice() in favor of expr::is_slice() cql3: column_condition: optimize LIKE with constant pattern after preparing cql3: expr: add optimizer for LIKE with constant pattern test: lib: add helper to evaluate an expression with bind variables but no table cql3: column_condition: make the left-hand-side part of column_condition::raw cql3: lwt: relax constraints on map subscripts and LIKE patterns cql3: expr: fix search_and_replace() for subscripts cql3: expr: fix function evaluation with NULL inputs cql3: expr: add LWT IF clause variants of binary operators cql3: expr: change evaluate_binop_sides to return more NULL information	2023-02-13 16:30:24 +02:00
Nadav Har'El	621c49b621	test/alternator: more tests for listing streams In issue #12601, a dtest involving paging of ListStreams showed incorrect results - the paged results had one duplicate stream and one missing stream. We believe that the cause of this bug was that the unsorted map of tables can change order between pages. In this patch we add a test test_list_streams_paged_with_new_table which can demonstrate this bug - by adding a lot of tables mid-paging, we cause the unsorted map to be reshuffled and the paging to break. This is not the same situation as in #12601 (which did not involve new tables) but we believe it demonstrates the same bug - and checks its fix. Indeed this passes with the fix in pull request #12614 and fails without it. This patch also adds a second test, test_stream_arn_unchanging: That test eliminates a guess we had for the cause of #12601. We thought that maybe the stream ARN changes on a table if its schema version changes, but the new test confirms that it actually behaves as expected (the stream ARN doesn't change). Refs #12601 Refs #12614 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12616	2023-02-13 16:30:24 +02:00
Nadav Har'El	25610c81fb	test/cql-pytest: another reproducer for index+limit+filtering bug This patch adds yet another reproducer for issue #10649, where the combination of filtering and LIMIT returns fewer results when a secondary index is added to the table. Whereas the previous tests we had for this issue involved a regular (global) index, the new test uses a local index (a Scylla-only feature). It shows that the same bug exists also for local indexes, as noticed by a user in #12766. Refs #10649 Refs #12766 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12783	2023-02-13 16:30:24 +02:00
Botond Dénes	e29e836aca	docs/operating-scylla: add a document on diagnostic tools ScyllaDB has a wide variety of tools and sources of information useful for diagnosing problems. These are scattered all over the place and although most of these are documented, there is currently no document listing all the relevant tools and information sources when it comes to diagnosing a problem. This patch adds just that: a document listing the different tools and information sources, with a brief description of how they can help in diagnosing problems, and a link to the relevant dedicated documentation pages. Closes #12503	2023-02-13 16:30:24 +02:00
Botond Dénes	e55f475db1	Merge 'test/pylib: use larger timeout for decommission/removenode' from Kamil Braun Recently we enabled RBNO by default in all topology operations. This made the operations a bit slower (repair-based topology ops are a bit slower than classic streaming - they do more work), and in debug mode with a large number of concurrent tests running, they might time out. The timeout for bootstrap was already increased before; do the same for decommission/removenode. The previously used timeout was 300 seconds (this is the default used by the aiohttp library when it makes HTTP requests), now use the TOPOLOGY_TIMEOUT constant from ScyllaServer which is 1000 seconds. Closes #12765 * github.com:scylladb/scylladb: test/pylib: use larger timeout for decommission/removenode test/pylib: scylla_cluster: rename START_TIMEOUT to TOPOLOGY_TIMEOUT	2023-02-13 16:30:24 +02:00
Kefu Chai	08b7e8b807	configure.py: use seastar_dep and seastar_testing_dep now that these variables are set, let's reuse them when appropriate. less repetition this way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12802	2023-02-13 16:30:24 +02:00
Nadav Har'El	ecfcb93ef5	test/cql-pytest: regression test for old bug of misused index Issue #7659, which we solved long ago, was about a query which included a non-EQ restriction and wrongly picked up one of the indexes. It had a short C++ regression test, but here we add a more elaborate Python test for the same bug. The advantages of the Python test are: 1. The Python test can be run against any version of Scylla (e.g., to check whether a certain version contains a backport of the fix). 2. The Python test reproduces not only a "benign" query error, but also an assertion-failed crash which happened when the non-EQ restriction was an "IN". 3. The Python test reproduces the same bug not just for a regular index, but also a local index. I checked that, as expected, these tests pass on master, but fail (and crash Scylla) in old branches before the fix for #7659. Refs #7659. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12797	2023-02-13 16:30:24 +02:00
Takuya ASADA	7e690bac62	install-dependencies.sh: update node_exporter to 1.5.0 Update node_exporter to 1.5.0. Closes scylladb/scylla-pkg#3190 Closes #12793 [avi: regenerate frozen toolchain] Closes #12813	2023-02-13 16:30:24 +02:00
Pavel Emelyanov	fa5f5a3299	sstable_test_env: Remove working_sst helper It's only used by the single test and apparently exists since the times seastar was missing the future::discard_result() sugar Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12803	2023-02-13 16:30:24 +02:00
Wojciech Mitros	b25ee62f75	wasm udf: deserialize counters as integers Currently, because serialize_visitor::operator() is not implemented for counters, we cannot convert a counter returned by a WASM UDF to bytes when returning from wasm::run_script(). We could disallow using counters as WASM UDF return types, but an easier solution which we're already using in Lua UDFs is treating the returned counters as 64-bit integers when deserializing. This patch implements the latter approach and adds a test for it.	2023-02-13 14:24:20 +01:00
Wojciech Mitros	3b8bf1ae3a	test_wasm.py: add utility function for reading WASM UDF saved in files Currently, we're repeating the same os.path, open, read, replace each time we read a WASM UDF from a file. To reduce code bloat, this patch adds a utility function "read_function_from_file" that finds the file and reads it given a function name and an optional new name, for cases when we want to use a different name in cql (mostly for unique_names).	2023-02-13 14:24:20 +01:00
Nadav Har'El	a24600a662	Merge 'test/pylib: split and refactor topology tests' from Alecco Move long running topology tests out of `test_topology.py` and into their own files, so they can be run in parallel. While there, merge simple schema tests. Closes #12804 * github.com:scylladb/scylladb: test/topology: rename topology test file test/topology: lint and type for topology tests test/topology: move topology ip tests to own file test/topology: move topology test remove garbaje... test/topology: move topology rejoin test to own file test/topology: merge topology schema tests and... test/topology: isolate topology smp params test test/topology: move topology helpers to common file	2023-02-12 17:53:48 +02:00
Avi Kivity	87c0d09d03	cql3: lwt: move IF clause expression construction to grammar Instead of the grammar passing expression bits to column_condition, have the grammar construct an unprepared expression and pass it as a whole. column_condition::raw then uses prepare_expression() to prepare it. The call to validate_operation_on_durations() is eliminated, since it's already done by prepare_expression(). Some tests adjusted for slightly different wording.	2023-02-12 17:28:36 +02:00
Avi Kivity	37c9c46101	cql3: column_condition: evaluate column_condition as a single expression Instead of laboriously hand-evaluating each expression component, construct one expression for the entire column_condition during prepare time, and evaluate it using the generic machinery. LWT IF evaluates equality against NULL considering two NULLs as equal. We handle that by rewriting such expressions to use null_handling_style::lwt_nulls. Note we use expr::evaluate() rather than is_satisfied_by(), since the latter doesn't like functions on the top-level, which we have due to LIKE with constant pattern optimization. evaluate() is more generic anyway.	2023-02-12 17:28:05 +02:00
Avi Kivity	8e972b52c5	cql3: lwt: allow negative list indexes in IF clause LWT IF clause errors out on negative list index. This deviates from non-LWT subscript evaluation, PostgreSQL, and too-large index, all of which evaluate the subscript operation to NULL. Make things more consistent by also evaluating list[-1] to NULL. A test is adjusted.	2023-02-12 17:28:05 +02:00
Avi Kivity	433b778a4d	cql3: lwt: do not short-circuit col[NULL] in IF clause Currently if an LWT IF clause contains a subscript with NULL as the key, then the entire IF clause is evaluated as FALSE. This is incorrect, because col[NULL] = NULL would simplify to NULL = NULL, which is interpreted as TRUE using the LWT comparisons. Even with SQL NULL handling, "col[NULL] IS NULL" should evaluate to true, but since we short-circuit as soon as we encounter the NULL key, we cannot complete the evaluation. Fix by setting cell_value to null instead of returning immediately. Tests that check for this were adjusted. Since the test changed behavior from not applying the statement to applying it, a new statement is added that undoes the previous one, so downstream statements are not affected.	2023-02-12 17:28:05 +02:00
Avi Kivity	b888e3d26a	cql3: column_condition: convert _column to an expression After this change, all components of column_condition are expressions. One LWT-specific hack was removed from the evaluation path: lists being represented as maps is made transparent by converting during evaluation with adjust_for_collection_as_maps(). column_condition::applies_to() previously handled a missing row by materializing a NULL for the column being evaluated; now it materializes a NULL row instead, since evaluation of the column is moved to common code. A few more cases in lwt_test became legal, though I'm not sure exactly why in this patch.	2023-02-12 17:28:01 +02:00
Avi Kivity	568c1a5a36	cql3: expr: generalize evaluation of subscript expressions Currently, evaluation of a subscript expression x[y] requires that x be a column_value, but that's completely artificial. Generalize it to allow any expression. This is needed after we transform a LWT IF condition from "a[x] = y" to "func(a)[x] = y", where func casts a from a map representation of a list back to a list; but it's also generally useful.	2023-02-12 17:25:46 +02:00
Avi Kivity	6de4032baf	cql3: expr: introduce adjust_for_collection_as_maps() LWT and some list operations represent lists using a form like their mutations, so that the mutation list keys can be recovered and used to update the list. But the evaluation machinery knows nothing about that, and will return the map-form even though the type system thinks it is a list. To handle that, add a utility to rewrite the expression so that the value is re-serialized into the expected list form. The rewrite is implemented as a scalar function taking the map form and returning the list form.	2023-02-12 17:25:46 +02:00
Avi Kivity	3a2d8175fb	cql3: update_parameters: use evaluation_inputs compatible row prefetch update_parameters::prefetch_data is used for some list updates (which need a read-before-write to determine the key to update) and for LWT compare-and-swap. Currently they use a custom structure for representing a read row. Switch to the same structure that is used in evaluation_inputs (and in SELECT statement evaluation) so the expression machinery can be reused. The expression representation is irregular (with different fields for the keys and regular/static columns), so we introduce an old_row structure to hold both the clustering key and the regular row values for cas_request. A nice bonus is that we can use get_non_pk_values() to read the data into the format expected by evaluation_inputs, but on the other hand we have to adjust get_prefetched_list() to fix up the type of the returned list (we return it as a map, not a list, so list updates can access the index).	2023-02-12 17:25:41 +02:00
Avi Kivity	47026b7ee0	cql3: expr: protect extract_column_value() from partial clustering keys Partial clustering keys can exist in COMPACT STORAGE tables (though they are exceedingly rare), and when LWT materializes a static row. Harden extract_column_value() so it is ready for them.	2023-02-12 17:17:01 +02:00
Avi Kivity	c8d77c204f	cql3: expr: extract extract_column_value() from evaluation machinery Expression evaluation works with the evaluation_input structure to compute values. As we move LWT column_condition towards expressions, we'll start using evaluation_input, so provide this helper to ease the transition.	2023-02-12 17:17:01 +02:00
Avi Kivity	721c05b7ec	cql3: selection: introduce selection_from_partition_slice Since expressions were introduced for SELECT statements, they work with `selection` object to represent which table columns they can work with. Probably a neutral representation would have been better, but that's what we have now. LWT works with partition_slice, so introduce a selection_from_partition_slice() helper to bridge the two worlds.	2023-02-12 17:17:01 +02:00
Avi Kivity	31ee13c0c9	cql3: expr: move check for ordering on duration types from restrictions to prepare Both LWT IF clause and SELECT WHERE clause check that a duration type isn't used in an ordered comparison, since duration types are unordered (is 1mo more or less than 30d?). As a first step towards centralizing this check, move the check from restrictions into prepare. When LWT starts using prepare, the duplication will be removed. The error message was changed: the word "slice" is an internal term, and a comparison does not necessarily have to be in a restriction (which is also an internal term). Tests were adjusted.	2023-02-12 17:17:01 +02:00
Avi Kivity	c0b1992fc4	cql3: expr: remove restrictions oper_is_slice() in favor of expr::is_slice() The two are functionally identical, so eliminate duplicate code.	2023-02-12 17:17:01 +02:00
Avi Kivity	036fa0891f	cql3: column_condition: optimize LIKE with constant pattern after preparing This just moves things around to put all the code we will kill in one place. Note the code was adjusted: before the move, it operated on an unprepared untyped_constant; after the move it operates on a prepared constant.	2023-02-12 17:17:01 +02:00
Avi Kivity	db2fa44a9a	cql3: expr: add optimizer for LIKE with constant pattern Compiling a pattern is expensive and so we should try to do it at prepare time, if the pattern is a constant. Add an optimizer that looks for such cases and replaces them with a unary function that embeds the compiled pattern. This isn't integrated yet with prepare_expr(), since the filtering code isn't ready for generic expressions. Its first user will be LWT, which contains the optimization already (filtering had it as well, but lost it sometime during the expression rewrite). A unit test is added.	2023-02-12 17:16:58 +02:00
Avi Kivity	1959f9937c	test: lib: add helper to evaluate an expression with bind variables but no table Sometimes we want to defeat the expression optimizer's ability to fold constant expressions. A bind variable is a convenient way to do this, without the complexity of faking a schema and row inputs. Add a helper to evaluate an expression with bind variable parameters, doing all the paperwork for us. A companion make_bind_variable() is added to likewise simplify creating bind variables for tests.	2023-02-12 17:05:22 +02:00
Avi Kivity	899c4a7f29	cql3: column_condition: make the left-hand-side part of column_condition::raw LWT IF conditions are collected with the left-hand-side outside the condition structure, then moved back to the prepared condition structure during preparation. Change that so that the raw description also contains the left-hand-side. This makes it more similar to expressions (which LWT conditions aspire to be). The change is mechanical; a bit of code that used to manage the std::pair is moved to column_condition::raw::prepare instead. The schema is now also passed since it's needed to prepare the left-hand-side.	2023-02-12 17:05:22 +02:00
Avi Kivity	f5257533fd	cql3: lwt: relax constraints on map subscripts and LIKE patterns Previously, we rejected map subscripts that are NULL, as well as LIKE patterns that are NULL. General SQL expression evaluation allows NULL everywhere, and doesn't raise errors - an expression involving NULL generally yields NULL. Change the behavior to follow that. Since the new behavior was previously disallowed, no one should have been relying on it and there is no compatibility problem. Update the tests and note it as a CQL extension.	2023-02-12 17:05:22 +02:00
Avi Kivity	b40dc49e05	cql3: expr: fix search_and_replace() for subscripts We forgot to preserve the subscript's type, so fix that. Also drop a leftover throw. It's dead code, immediately after a return.	2023-02-12 17:05:22 +02:00
Avi Kivity	8dda84bb0c	cql3: expr: fix function evaluation with NULL inputs Function call evaluation rejects NULL inputs, unnecessarily. Functions work well with NULL inputs. Fix by relaxing the check. This currently has no impact because functions are not evaluated via expressions, but via selectors.	2023-02-12 17:05:22 +02:00
Avi Kivity	ecdd49317a	cql3: expr: add LWT IF clause variants of binary operators LWT IF clause interprets equality differently from SQL (and the rest of CQL): it thinks NULL equals NULL. Currently, it implements binary operators all by itself so the fact that oper_t::EQ (and friends) means something else in the rest of the code doesn't bother it. However, we can't unify the code (in column_condition.cc) with the rest of expression evaluation if the meaning changes in different places. To prepare for this, introduce a null_handling_style field to binary_operator that defaults to `sql` but can be changed to `lwt_nulls` to indicate this special semantic. A few unit tests are added. LWT itself still isn't modified.	2023-02-12 17:03:03 +02:00
Alejo Sanchez	8bf2d515de	test/topology: rename topology test file Rename test_topology.py to reflect current tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:59:31 +01:00
Alejo Sanchez	11691ba7f5	test/topology: lint and type for topology tests Fix minor lint and type hints. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:59:31 +01:00
Alejo Sanchez	49baf6789c	test/topology: move topology ip tests to own file Move slow topology IP related tests to a separate file. Add docstrings. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:59:19 +01:00
Alejo Sanchez	3fcef63a0f	test/topology: move topology test remove garbage... group0 members to own file Move slow test for removenode with nodes not present in group0 to a separate file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:48:39 +01:00
Nadav Har'El	10ca08e8ac	Merge 'Sequence CDC preimage select with Paxos learn write' from Kamil Braun `paxos_response_handler::learn_decision` was calling `cdc_service::augment_mutation_call` concurrently with `storage_proxy::mutate_internal`. `augment_mutation_call` was selecting rows from the base table in order to create the preimage, while `mutate_internal` was writing rows to the table. It was therefore possible for the preimage to observe the update that it accompanied, which doesn't make any sense, because the preimage is supposed to show the state before the update. Fix this by performing the operations sequentially. We can still perform the CDC mutation write concurrently with the base mutation write. `cdc_with_lwt_test` was sometimes failing in debug mode due to this bug and was marked flaky. Unmark it. Also fix a comment in `cdc_with_lwt_test`. Fixes #12098 Closes #12768 * github.com:scylladb/scylladb: test/cql-pytest: test_cdc: regression test for #12098 test/cql: cdc_with_lwt_test: fix comment service: storage_proxy: sequence CDC preimage select with Paxos learn	2023-02-12 13:28:34 +02:00
Alejo Sanchez	655e1587e3	test/topology: move topology rejoin test to own file Move slow test for rejoining a server after a sudden stop to a separate file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:47 +01:00
Alejo Sanchez	7cc669f5a5	test/topology: merge topology schema tests and... ... move them to their own file. Schema verification tests for restart, add, and hard stop of server can be done with the same cluster. Merge them in the same test case. While there, move them to a separate file to be run independently as this is a slow test. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:40 +01:00
Alejo Sanchez	93de79d214	test/topology: isolate topology smp params test Move slow test for different smp parameters to its own file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:32 +01:00
Alejo Sanchez	293550ca5c	test/topology: move topology helpers to common file Move helper functions to a common file ahead of splitting topology tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-12 12:02:16 +01:00
Nadav Har'El	2653865b34	Merge 'test.py: improve test failure handling' from Kamil Braun Improve logging by printing the cluster at the end of each test. Stop performing operations like attempting queries or dropping keyspaces on dirty clusters. Dirty clusters might be completely dead and these operations would only cause more "errors" to happen after a failed test, making it harder to find the real cause of failure. Mark cluster as dirty when a test that uses it fails - after a failed test, we shouldn't assume that the cluster is in a usable state, so we shouldn't reuse it for another test. Rely on the `is_dirty` flag in `PythonTest`s and `CQLApprovalTest`s, similarly to what `TopologyTest`s do. Closes #12652 * github.com:scylladb/scylladb: test.py: rely on ScyllaCluster.is_dirty flag for recycling clusters test/topology: don't drop random_tables keyspace after a failed test test/pylib: mark cluster as dirty after a failed test test: pylib, topology: don't perform operations after test on a dirty cluster test/pylib: print cluster at the end of test	2023-02-12 12:13:25 +02:00
Kamil Braun	54f85c641d	test/pylib: use larger timeout for decommission/removenode Recently we enabled RBNO by default in all topology operations. This made the operations a bit slower (repair-based topology ops are a bit slower than classic streaming - they do more work), and in debug mode with a large number of concurrent tests running, they might time out. The timeout for bootstrap was already increased before, do the same for decommission/removenode. The previously used timeout was 300 seconds (this is the default used by aiohttp library when it makes HTTP requests), now use the TOPOLOGY_TIMEOUT constant from ScyllaServer which is 1000 seconds.	2023-02-10 15:56:31 +01:00
Kamil Braun	fde6ad5fc0	test/pylib: scylla_cluster: rename START_TIMEOUT to TOPOLOGY_TIMEOUT Use a more generic name since the constant will also be used as timeout for decommission and removenode.	2023-02-10 15:56:31 +01:00
Kamil Braun	ca4db9bb72	Merge 'test/raft: test snapshot threshold' from Alecco Force snapshot with schema changes while server down. Then verify schema when bringing back up the server. Closes #12726 * github.com:scylladb/scylladb: pytest/topology: check snapshot transfer raft conf error injection for snapshot test/pylib: one-shot error injection helper	2023-02-10 15:24:46 +01:00
Kamil Braun	540f6d9b78	test/cql-pytest: test_cdc: regression test for #12098 Perform multiple LWT inserts to different keys ensuring none of them observes a preimage. On my machine this test reproduces the problem more than 50% of the time in debug mode.	2023-02-10 14:35:49 +01:00
Avi Kivity	9696ab7fae	cql3: expr: change evaluate_binop_sides to return more NULL information Currently, evaluate_binop_sides() returns std::nullopt if either side is NULL. Since we wish to add binary operators that do consider NULL on each side, make evaluate_binop_sides return the original NULLs instead (as managed_bytes_opt). Ultimately I think evaluate_binop_sides() should disappear, but before that we have to improve unset value checking.	2023-02-10 09:45:35 +02:00
Botond Dénes	423df263f5	Merge 'Sanitize with_sstable_directory() helper in tests' from Pavel Emelyanov The helping wrapper facilitates the usage of sharded<sstable_directory> for several test cases and the helper and its callers had deserved some cleanup over time. Closes #12791 * github.com:scylladb/scylladb: sstable_directory_test: Reindent and de-multiline sstable_directory_test: Enlighten and rename sstable_from_existing_file sstable_directory_test: Remove constant parallelism parameter	2023-02-10 07:11:38 +02:00
Tomasz Grabiec	402d5fd7e3	cache: Fix empty partition entries being left in cache in some cases Merging rows from different partition versions should preserve the LRU link of the entry from the newer version. We need this in case we're merging two last dummy entries where the older dummy is already unlinked from the LRU. The newer dummy could be the last entry which is still holding the partition entry linked in the LRU. The mutation_partition_v2 merging didn't take the LRU link from the newer entry, and we could end up with the partition entry not having any entries linked in the LRU. Introduced in `f73e2c992f`. Fixes #12778 Closes #12785	2023-02-09 23:03:23 +02:00
Kamil Braun	e2064f4762	Merge 'repair: finish repair immediately on local keyspaces' from Aleksandra Martyniuk System keyspace is a keyspace with local replication strategy and thus it does not need to be repaired. It is possible to invoke repair of this keyspace through the api, which leads to runtime error since peer_events and scylla_table_schema_history have different sharding logic. For keyspaces with local replication strategy repair_service::do_repair_start returns immediately. Closes #12459 * github.com:scylladb/scylladb: test: rest_api: check if repair of system keyspace returns before corresponding task is created repair: finish repair immediately on local keyspaces	2023-02-09 18:44:37 +01:00
Pavel Emelyanov	52e2ad051e	sstable_utils: Move the test_setup to perf/ The sstable perf test uses test_setup ability to create temporary directory and clean it and that's the only place that uses it. Move the remainders of test_setup into perf/ so that no unit tests attempt to re-use it (there's test_env for that). Remove unused _walker and _create_directory while at it. Mark protected stuff private while at it as well. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 17:18:04 +03:00
Pavel Emelyanov	868391a613	sstable_utils: Remove unused wrappers over test_env Now all callers are using the test_env directly Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 17:17:48 +03:00
Pavel Emelyanov	47022bf750	sstable_test: Open-code do_with_cloned_tmp_directory The statistics_rewrite case uses the helper that creates a copy of the provided static directory, but it's the only user of this helper. It's better to open-code it into the test case. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 17:17:48 +03:00
Pavel Emelyanov	19c1afb20a	sstable_test: Asynchronize statistics_rewrite case It is run inside an async context and can be coded in a shorter form without using deeply nested then-s Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 17:17:23 +03:00
Pavel Emelyanov	85b8bae035	tests: Replace test_setup::do_with_tmp_directory with test_env::do_with(_async)? The former helper is just a wrapper over the _async version of the latter and also creates a tempdir and calls the fn with tempdir as an argument. The test_env already has its own temp dir on board, so callers can be switched to using it. Some test cases use the do_with_tmp_directory but generate chain of futures without in fact using the async context. This patch addresses that, so the change is not 100% mechanical unfortunately. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 17:11:31 +03:00
Anna Stuchlik	9f2724231c	doc: add the new KB to the list of topics	2023-02-09 14:42:09 +01:00
Anna Stuchlik	cfdb8a8760	doc: add a new KB article about tombstone garbage collection in ICS	2023-02-09 14:36:06 +01:00
Pavel Emelyanov	f0212c7b68	sstable_directory_test: Reindent and de-multiline Many tests using sstable directory wrapper have broken indentation with previous patching. Fix it. No functional changes. Also, while at it, convert multiline wrapper calls into one-line, after previous patch these are short enough for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 16:00:53 +03:00
Pavel Emelyanov	ec02b0f706	sstable_directory_test: Enlighten and rename sstable_from_existing_file It used to be the sstable maker for sstable::test_env / cql_test_env, now sstables for tests are made via sstables manager explicitly, so the guy can be renamed to something more relevant to its current status. Also, de-mark its constructors as explicit to make callers look shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 15:59:23 +03:00
Pavel Emelyanov	c843f7937b	sstable_directory_test: Remove constant parallelism parameter It's 1 (one) all the time, just hard-code it internally Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-09 15:59:01 +03:00
Avi Kivity	fd4ee4878a	Revert "storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops" This reverts commit `e7d5e508bc`. It ends up failing continuous integration tests randomly. We don't know if it's uncovering an existing bug, or if RBNO itself is broken, but for now we need to revert it to unblock progress.	2023-02-09 10:30:26 +02:00
Botond Dénes	b62d84fdba	Merge 'Keep reshape and reshard logic in distributed loader' from Pavel Emelyanov Now it's scattered between dist. loader and sstable directory code making the latter quite bloated. Keeping everything in distributed loader makes the sstable_directory code compact and easier to patch to support object storage backend. Closes #12771 * github.com:scylladb/scylladb: sstable_directory: Rename remove_input_sstables_from_reshaping() sstable_directory: Make use of remove_sstables() helper sstable_directory: Merge output sstables collecting methods distributed_loader: Remove max_compaction_threshold argument from reshard() distributed_loader: Remove compaction_manager& argument from reshard() sstable_directory: Move the .reshard() to distributed_loader sstable_directory: Add helper to load foreign sstable sstable_directory: Add io-prio argument to .reshard() sstable_directory: Move reshard() to distributed_loader.cc distributed_loader: Remove compaction_manager& argument from reshape() sstable_directory: Move the .reshape() to distributed loader sstable_directory: Add helper to retrieve local sstables sstable_directory: Add io-prio argument to .reshape() sstable_directory: Move reshape() to distributed_loader.cc	2023-02-09 10:01:44 +02:00
Botond Dénes	1c333e2102	Merge 'Transport server error handling fixes' from Gusev Petr CQL transport server error handling fixes and improvements: * log failed requests with `DEBUG` level for easier debugging; * in case of unhandled errors, deliver them to the client as `SERVER_ERROR`'s * fix for `protocol_error`'s in case of shed big requests; * explicit tests have been written for the error handling problems above. Closes #11949 * github.com:scylladb/scylladb: transport server: fix "request size too large" handling transport server: log failed requests with debug level transport server: fix unexpected server errors handling transport server: log client errors with debug level	2023-02-09 09:02:22 +02:00
Anna Stuchlik	c7778dd30b	doc: related https://github.com/scylladb/scylladb/issues/12754 , add the requirement to upgrade Monitoring to version 4.3 Closes #12784	2023-02-09 07:10:34 +02:00
Botond Dénes	746b009db0	Merge 'dist/debian: bump up debhelper compatibility level to 10 and cleanups' from Kefu Chai - dist/debian: bump up debhelper compatibility level to 10 - dist/debian: drop unused Makefile variable Closes #12723 * github.com:scylladb/scylladb: dist/debian: drop unused Makefile variable dist/debian: bump up debhelper compatibility level to 10	2023-02-09 07:04:20 +02:00
Pavel Emelyanov	40de737b36	sstable_directory: Rename remove_input_sstables_from_reshaping() It unlinks unshared sstables filtering some of them out. Name it according to what it does without mentioning reshape/reshard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-08 15:00:44 +03:00
Pavel Emelyanov	a1dc251214	sstable_directory: Make use of remove_sstables() helper Currently it's called remove_input_sstables_from_resharding() but it just unlinks sstables in parallel from the given list. So rename it not to mention reshard and also make use of this "new" helper in the remove_input_sstables_from_reshaping(), it needs exactly the same functionality. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-08 15:00:44 +03:00
Pavel Emelyanov	cb36f5e581	sstable_directory: Merge output sstables collecting methods There are two of them collecting sstables from resharding and reshaping. Both doing the same job except for the latter doesn't expect the list to contain remote sstables. This patch merges them together with the help of an extra sanity boolean to check for the remote sstable not in the list. And renames the method not to mention reshape/reshard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-08 15:00:41 +03:00
Avi Kivity	0f15ff740d	cql3: expr: simplify user/debug formatting We have a cql3::expr::expression::printer wrapper that annotates an expression with a debug_mode boolean prior to formatting. The fmt library, however, provides a much simpler alternative: a custom format specifier. With this, we can write format("{:user}", expr) for user-oriented prints, or format("{:debug}", expr) for debug-oriented prints (if nothing is specified, the default remains debug). This is done by implementing fmt::formatter::parse() for the expression type, and using expression::printer internally. Since sometimes we pass expression element types rather than the expression variant, we also provide a custom formatter for all ExpressionElement Types. Uses for expression::printer are updated to use the nicer syntax. In one place we eliminate a temporary that is no longer needed since ExpressionElement:s can be formatted directly. Closes #12702	2023-02-08 12:24:58 +02:00
Petr Gusev	3263523b54	transport server: fix "request size too large" handling Calling _read_buf.close() doesn't imply eof(), some data may have already been read into kernel or client buffers and will be returned next time read() is called. When the _server._max_request_size limit was exceeded and the _read_buf was closed, the process_request method finished and we started processing the next request in connection::process. The unread data from _read_buf was treated as the header of the next request frame, resulting in "Invalid or unsupported protocol version" error. The existing test_shed_too_large_request was adjusted. It was originally written with the assumption that the data of a large query would simply be dropped from the socket and the connection could be used to handle the next requests. This behaviour was changed in scylladb#8800, now the connection is closed on the Scylla side and can no longer be used. To check there are no errors in this case, we use Scylla metrics, getting them from the Scylla Prometheus API.	2023-02-08 00:07:08 +04:00
Petr Gusev	0904f98ebf	transport server: log failed requests with debug level These logs can be helpful for debugging, e.g. if an error was not handled correctly by the client driver, or another error occurred while handling it.	2023-02-08 00:07:08 +04:00
Petr Gusev	a4cf509c3d	transport server: fix unexpected server errors handling If request processing ended with an error, it is worth sending the error to the client through make_error/write_response. Previously in this case we just wrote a message to the log and didn't handle the client connection in any way. As a result, the only thing the client got in this case was timeout error. A new test_batch_with_error is added. It is quite difficult to reproduce error condition in a test, so we use error injection instead. Passing injection_key in the body of the request ensures that the exception will be thrown only for this test request and will not affect other requests that the driver may send in the background. Closes: scylladb#12104	2023-02-08 00:07:02 +04:00
Pavel Emelyanov	73d458cf89	distributed_loader: Remove max_compaction_threshold argument from reshard() Since the whole reshard() is local to dist. loader code now, the caller of the reshard helper may let this method get the threshold itself Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	25aaa45256	distributed_loader: Remove compaction_manager& argument from reshard() It can be obtained from the table& Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	15547f1b5b	sstable_directory: Move the .reshard() to distributed_loader Now all the resharding logic is accumulated in distributed loader and the sstable_directory is just the place where sstables are collected. The changes summary is: - add sstable_directory as argument (used to be "this") - replace all "this" captures with &dir ones - remove temporary namespace gap and declaration from sst-dir class Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	ab5f48d496	sstable_directory: Add helper to load foreign sstable This is to generalize the code duplication between .reshard() and existing .load_foreign_sstables() (plural form). Make it coroutinized right at once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:43 +03:00
Pavel Emelyanov	e6e65c87d5	sstable_directory: Add io-prio argument to .reshard() Now it gets one from this-> but the method is becoming static one in distributed_loader which only has it as an argument. That's not big deal as the current IO class is going to be derived from current sched group, so this extra arg will go away at all some day. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:41 +03:00
Pavel Emelyanov	a32d2b6d6a	sstable_directory: Move reshard() to distributed_loader.cc Just move the code and create temporary namespace gap for that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:12 +03:00
Pavel Emelyanov	1de8c85acd	distributed_loader: Remove compaction_manager& argument from reshape() It can be obtained from the table& Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:31:12 +03:00
Pavel Emelyanov	d734b6b7c1	sstable_directory: Move the .reshape() to distributed loader Now all the reshaping logic is accumulated in distributed loader and the sstable_directory is just the place where sstables are collected. The changes summary is: - add sstable_directory as argument (used to be "this") - replace all "this" captures with &dir ones - remove temporary namespace gap and declaration from sst-dir class Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:30:55 +03:00
Pavel Emelyanov	b906d34807	sstable_directory: Add helper to retrieve local sstables There are methods to retrieve shared local sstables and foreign sstables, so here's one more to the family Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:23:40 +03:00
Pavel Emelyanov	420fc8d4df	sstable_directory: Add io-prio argument to .reshape() Now it gets one from this-> but the method is becoming static one in distributed_loader which only has it as an argument. That's not big deal as the current IO class is going to be derived from current sched group, so this extra arg will go away at all some day. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:22:27 +03:00
Pavel Emelyanov	a70d6017f8	sstable_directory: Move reshape() to distributed_loader.cc Just move the code and create temporary namespace gap for that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-07 19:21:54 +03:00
Kamil Braun	97b2971bf1	test/cql: cdc_with_lwt_test: fix comment The comment mentioned an entry that shouldn't be there (and it wasn't in the actual expected result).	2023-02-07 16:12:18 +01:00
Kamil Braun	1ef113691a	service: storage_proxy: sequence CDC preimage select with Paxos learn `paxos_response_handler::learn_decision` was calling `cdc_service::augment_mutation_call` concurrently with `storage_proxy::mutate_internal`. `augment_mutation_call` was selecting rows from the base table in order to create the preimage, while `mutate_internal` was writing rows to the table. It was therefore possible for the preimage to observe the update that it accompanied, which doesn't make any sense, because the preimage is supposed to show the state before the update. Fix this by performing the operations sequentially. We can still perform the CDC mutation write concurrently with the base mutation write. `cdc_with_lwt_test` was sometimes failing in debug mode due to this bug and was marked flaky. Unmark it. Fixes #12098	2023-02-07 16:12:18 +01:00
Alejo Sanchez	cf3b8d7edc	pytest/topology: check snapshot transfer Test snapshot transfer by reducing the snapshot threshold on initial servers (3 and 1 trailing). Then creates a table, and does 3 extra schema changes (add column), triggering at least 2 snapshots. Then brings a new server to the cluster, which will get the schema through a snapshot. Then the test stops the initial servers and verifies the table schema is up to date on the new server. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-07 16:09:07 +01:00
Petr Gusev	95bf8eebe0	query_ranges_to_vnodes_generator: fix for exclusive boundaries Let the initial range passed to query_partition_key_range be [1, 2) where 2 is the successor of 1 in terms of ring_position order and 1 is equal to vnode. Then query_ranges_to_vnodes_generator() -> [[1, 1], (1, 2)], so we get an empty range (1,2) and subsequently will make a data request with this empty range in storage_proxy::query_partition_key_range_concurrent, which will be redundant. The patch adds a check for this condition after making a split in the main loop in process_one_range. The patch does not attempt to handle cases where the original ranges were empty, since this check is the responsibility of the caller. We only take care not to add empty ranges to the result as an unintentional artifact of the algorithm in query_ranges_to_vnodes_generator. A test case is added in test_get_restricted_ranges. The helper lambda check is changed so as not to limit the number of ranges to the length of expected ranges, otherwise this check passes without the change in process_one_range. Fixes: #12566 Closes #12755	2023-02-07 16:02:31 +02:00
Kefu Chai	afd1221b53	commitlog: mark request_controller_timeout_exception_factory::timeout() noexcept request_controller_timeout_exception_factory::timeout() creates an instance of `request_controller_timed_out_error` whose ctor is default-created by compiler from that of timed_out_error, which is in turn default-created from the one of `std::exception`. and `std::exception::exception` does not throw. so it's safe to mark this factory method `noexcept`. with this specifier, we don't need to worry about the exception thrown by it, and don't need to handle them if any in `seastar::semaphore`, where `timeout()` is called for the customized exception. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12759	2023-02-07 14:38:54 +02:00
Botond Dénes	051da4e148	Merge 'Handle EDQUOT error just like ENOSPC' from Kefu Chai - main: consider EDQUOT as environmental failure also - main: use defer_verbose_shutdown() to shutdown compaction manager - replica/table: extract should_retry() int with_retry - replica/table: retry on EDQUOT when flushing memtable Fixes #12626 Closes #12653 * github.com:scylladb/scylladb: replica/table: retry on EDQUOT when flushing memtable replica/table: extract should_retry() int with_retry main: use defer_verbose_shutdown() to shutdown compaction manager main: consider EDQUOT as environmental failure also	2023-02-07 14:38:36 +02:00
David Garcia	734f09aba7	docs: add flags support in multiversion Closes #12740	2023-02-07 14:23:53 +02:00
Wojciech Mitros	02bfac0c66	uda: change the UDF used in a UDA if it's replaced Currently, if a UDA uses a UDF that's being replaced, the UDA will still keep using the old UDF until the node is restarted. This patch fixes this behavior by checking all UDAs when replacing a UDF and updating them if necessary. Fixes #12709	2023-02-07 12:17:52 +01:00
Nadav Har'El	3ba011c2be	cql: fix empty aggregation, and add more tests This patch fixes #12475, where an aggregation (e.g., COUNT(*), MIN(v)) of absolutely no partitions (e.g., "WHERE p = null" or "WHERE p in ()") resulted in an internal error instead of the "zero" result that each aggregator expects (e.g., 0 for COUNT, null for MIN). The problem is that normally our aggregator forwarder picks the nodes which hold the relevant partition(s), forwards the request to each of them, and then combines these results. When there are no partitions, the query is sent to no node, and we end up with an empty result set instead of the "zero" results. So in this patch we recognize this case and build those "zero" results (as mentioned above, these aren't always 0 and depend on the aggregation function!). The patch also adds two tests reproducing this issue in a fairly general way (e.g., several aggregators, different aggregation functions) and confirming the patch fixes the bug. The test also includes two additional tests for COUNT aggregation, which uncovered an incompatibility with Cassandra which is still not fixed - so these tests are marked "xfail": Refs #12477: Combining COUNT with GROUP by results with empty results in Cassandra, and one result with empty count in Scylla. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12715	2023-02-07 12:28:42 +02:00
Botond Dénes	bf7113f6dc	Merge 'locator: token_metadata: improve get_address_ranges()' from Michał Chojnowski This two-patch series aims to improve `get_address_ranges()` by eliminating cases of quadratic behavior which were noticed to cause huge allocations, and by deduplicating the code of `get_address_ranges()` with the almost-identical `get_ranges()`. Refs https://github.com/scylladb/scylladb/issues/10337 Refs https://github.com/scylladb/scylladb/issues/10817 Refs https://github.com/scylladb/scylladb/issues/10836 Refs https://github.com/scylladb/scylladb/issues/10837 Fixes https://github.com/scylladb/scylladb/issues/12724 Closes #12733 * github.com:scylladb/scylladb: locator: token_metadata: unify get_address_ranges() and get_ranges() locator: token_metadata: get rid of a quadratic behaviour in get_address_ranges()	2023-02-07 12:28:41 +02:00
Botond Dénes	a01662b287	Merge 'doc: improve the general upgrade policy' from Anna Stuchlik Related: https://github.com/scylladb/scylladb/pull/12586 This PR improves the upgrade policy added with https://github.com/scylladb/scylladb/pull/12586, according to the feedback from: @tzach > Upgrading from 4.6 to 5.0 is not clear; better to use 4.x to 4.y versions as an example. and @bhalevy > It is not completely clear that when upgrading through several versions, the whole cluster needs to be upgraded to each consecutive version, not just the rolling node. In addition, the content is organized into sections for the sake of readability. Closes #12647 * github.com:scylladb/scylladb: doc: add the information about patch releases doc: add the info about the minor versions doc: reorganize the content on the Upgrade ScyllaDB page doc: improve the overview of the upgrade procedure (apply feedback)	2023-02-07 12:28:41 +02:00
Nadav Har'El	c00fcc80e5	test/cql-pytest: three tests for empty clustering keys This patch adds three additional tests for empty (e.g., empty string) clustering keys. The first test disproves a worry that was raised in #12561 that perhaps empty clustering keys only seem to work, but they don't get written to sstables. The new test verifies that there is no bug - they are written and can be read correctly. The second and third tests reproduce issue #12749 - an empty clustering should be allowed in a COMPACT STORAGE table only if there is a compound (multi-column) clustering key. But as the tests demonstrate, 1. if there is just one clustering column, Scylla gives the wrong error message, and 2. if there is a compound clustering key, Scylla doesn't allow an empty key as it should. As usual, all tests pass on Cassandra. The last two tests fail on Scylla, so are marked xfail. Refs #12561 Refs #12749 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12750	2023-02-07 12:28:41 +02:00
Petr Gusev	bd80a449d5	transport server: log client errors with debug level Ideally, these errors should be transparently delivered to the client, but in practice, due to various flaws/bugs in scylla and/or the driver, they can be lost, which enormously complicates troubleshooting. const socket_address& get_remote_address() is needed for its convenient conversion to string, which includes ip and port.	2023-02-07 13:53:38 +04:00
Wojciech Mitros	58987215dc	functions: add helper same_signature method When deciding whether two functions have the same signature, we have to check if they have the same name and parameter types. Additionally, if they're represented by pointers, we need to check if any of them is a nullptr. This logic is used multiple times, so it's extracted to a separate function. To use this function, the `used_by_user_aggregate` method takes now a function instead of name and types list - we can do it because we always use it with an existing user function (that we're trying to drop). The method will also be useful when we'll be not dropping, but replacing a user function.	2023-02-07 10:15:12 +01:00
Wojciech Mitros	20069372e7	uda: return aggregate functions as shared pointers We will want to reuse the functions that we get from an aggregate without making a deep copy, and it's only possible if we get pointers from the aggregate instead of actual values.	2023-02-07 10:15:09 +01:00
Kefu Chai	bba03c1a55	replica/table: retry on EDQUOT when flushing memtable retry when memtable flush fails due to EDQUOT. there are chances that user exceeds the disk quota when scylla flushes memtable and user manages to free up the necessary resource before the next retry. before this change, we simply `abort()` in this case. after this change, we just keep on retrying until the service is shutdown. Fixes #12626 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-07 16:00:40 +08:00
Kefu Chai	6d017e75e0	replica/table: extract should_retry() int with_retry * extract a lambda encapsulating the condition if we should retry at seeing an exception when calling functions with `with_retry()`. we apply the same check to the exception raised when performing table related i/o operations. in this change, the two checks are consolidated and extracted into a single lambda, so we can add yet more error code (s) which should be considered retry-able failures. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-07 16:00:40 +08:00
Kefu Chai	d4315245a1	main: use defer_verbose_shutdown() to shutdown compaction manager * use `defer_verbose_shutdown()` to shutdown compaction manager `EDQUOT` is quite similar to `ENOSPC`, in the sense that both of them are caused by environmental issues. before this change, `compaction_manager` filters the ENOSPC exceptions thrown by `compaction_manager::really_do_stop()`, so they are not propagated to caller when calling `compaction_manager::stop()` -- only a warning message is printed in the log. but `EDQUOT` is not handled. after this change, the exception raised by compaction manager's stop process is not filtered anymore and is handled by `defer_verbose_shutdown()` instead, which is able to check the type of exception, and print out error message in the log. so the `ENOSPC` and `EDQUOT` errors are taken care of, and more visible from user's perspective as they are printed as errors instead of warnings. but they are not printed using the `compaction_manager` logger anymore. so if our testing or user's workflow depends on this behavior, the related setting should be updated accordingly. Fixes #12626 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-07 16:00:40 +08:00
Kefu Chai	c3ef353e3d	main: consider EDQUOT as environmental failure also EDQUOT can be returned as the errno when the underlying filesystem is trying to reserve necessary resources from disk for performing i/o on behalf of the effective user, and the filesystem fails to acquire the necessary resources. it could be inode, volume space, or whatever resources for completing the i/o operation. but none of them is the consequence of scylla's fault. so we should not `abort()` at seeing this errno. instead, it should be reported to the administrator. in this change, EDQUOT is also considered as an environmental failure just like EIO, EACCES and ENOSPC. they could be thrown when stopping a server. Fixes #12626 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-07 16:00:40 +08:00
Tomasz Grabiec	ccc8e47db1	Merge 'test/lib: introduce key_utils.hh' from Botond Dénes We currently have two method families to generate partition keys: * make_keys() in test/lib/simple_schema.hh * token_generation_for_shard() in test/lib/sstable_utils.hh Both work only for schemas with a single partition key column of `text` type and both generate keys of fixed size. This is very restrictive and simplistic. Tests that wanted anything more complicated than that had to rely on open-coded key generation. Also, many tests started to rely on the simplistic nature of these keys, in particular two tests started failing because the new key generation method generated keys of varying size: * sstable_compaction_test.sstable_run_based_compaction_test * sstable_mutation_test.test_key_count_estimation These two tests seem to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test. Closes #12657 * github.com:scylladb/scylladb: test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends test/lib/simple_schema: remove now unused make_keys() and friends test: migrate to tests::generate_partition_key[s]() test/lib/test_services: add table_for_tests::make_default_schema() test/lib: add key_utils.hh test/lib/random_schema.hh: value_generator: add min_size_in_bytes	2023-02-06 18:11:32 +01:00
Nadav Har'El	cc207a9f44	Merge 'uda: improve checking whether UDFs are used in UDAs in DROP statements' from Wojciech Mitros This patch fixes 2 issues with checking whether UDFs are used in UDAs: 1) UDFs types are not considered during the check, which prevents us from dropping a UDF that isn't used in any UDAs, but shares its name with one of them. 2) the REDUCEFUNC is not considered during the check, which allows dropping a UDF even though it's used in a UDA as the REDUCEFUNC. Additionally, tests for these issues are added Closes #12681 * github.com:scylladb/scylladb: udf: also check reducefunc to confirm that a UDF is not used in a UDA udf: fix dropping UDFs that share names with other UDFs used in UDAs pytest: add optional argument for new_function argument types	2023-02-06 19:07:26 +02:00
Kamil Braun	56c4d246ef	Merge 'Introduce recent_entries_map datatype to track least recent visited entries.' from Andrii Patsula Fixes: https://github.com/scylladb/scylladb/issues/12309 Closes #12720 * github.com:scylladb/scylladb: service/raft: raft_group_registry: use recent_entries_map to store rate_limits in pinger. Fixes #12309 utils: introduce recent_entries_map datatype to track least recent visited entries.	2023-02-06 18:01:26 +01:00
Botond Dénes	a3b280ba8c	Merge 'doc: document the workaround to install a non-latest ScyllaDB version' from Anna Stuchlik This PR is related to https://github.com/scylladb/scylla-enterprise/issues/2176. It adds a FAQ about a workaround to install a ScyllaDB version that is not the most recent patch version. In addition, the link to that FAQ is added to the patch upgrade guides 2021 and 2022. Closes #12660 * github.com:scylladb/scylladb: doc: add the missing sudo command doc: replace the redundant link with an alternative way to install a non-latest version doc: add the link to the FAQ about pinning to the patch upgrade guides 2021 and 2022 doc: add a FAQ with a workaround to install a non-latest ScyllaDB version on Debian and Ubuntu	2023-02-06 17:00:39 +02:00
Kefu Chai	d0a2440023	docs: bump sphinx-sitemap to 2.5.0 `poetry install` consistently times out when resolving the dependencies. like: ``` Command ['/home/kefu/.cache/pypoetry/virtualenvs/scylla-1fWQLpOv-py3.9/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/home/kefu/.cache/pypoetry/virtualenvs /scylla-1fWQLpOv-py3.9', '--upgrade', '--no-deps', '/home/kefu/.cache/pypoetry/artifacts/e6/ad/ab/eca9f61c5b15fd05df7192c0e5914a9e5ac927744b1fb5f6c07a92d7a4/sphinx-sitemap-2.2.0.tar.gz'] errored with the following return code 1, and out put: Processing /home/kefu/.cache/pypoetry/artifacts/e6/ad/ab/eca9f61c5b15fd05df7192c0e5914a9e5ac927744b1fb5f6c07a92d7a4/sphinx-sitemap-2.2.0.tar.gz Installing build dependencies: started Installing build dependencies: finished with status 'error' ERROR: Command errored out with exit status 2: command: /home/kefu/.cache/pypoetry/virtualenvs/scylla-1fWQLpOv-py3.9/bin/python /tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-37k3lwqd/overlay --no-warn-scrip t-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- 'setuptools>=40.8.0' wheel cwd: None Complete output (80 lines): Collecting setuptools>=40.8.0 Downloading setuptools-67.1.0-py3-none-any.whl (1.1 MB) ERROR: Exception: Traceback (most recent call last): File "/tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip/_vendor/urllib3/response.py", line 438, in _error_catcher yield File "/tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip/_vendor/urllib3/response.py", line 519, in read data = self._fp.read(amt) if not fp_closed else b"" File "/tmp/pip-standalone-pip-z97s216j/__env_pip__.zip/pip/_vendor/cachecontrol/filewrapper.py", line 62, in read data = self.__fp.read(amt) File "/usr/lib64/python3.9/http/client.py", line 463, in read n = self.readinto(b) File "/usr/lib64/python3.9/http/client.py", line 507, in readinto n = 
self.fp.readinto(b) File "/usr/lib64/python3.9/socket.py", line 704, in readinto return self._sock.recv_into(b) File "/usr/lib64/python3.9/ssl.py", line 1242, in recv_into return self.read(nbytes, buffer) File "/usr/lib64/python3.9/ssl.py", line 1100, in read return self._sslobj.read(len, buffer) socket.timeout: The read operation timed out ``` while sphinx-sitemap 2.5.0 installs without problems. sphinx-sitemap 2.5.0 is the latest version published to pypi. according to sphinx-sitemap's changelog at https://github.com/jdillard/sphinx-sitemap/blob/master/CHANGELOG.rst , no breaking changes were introduced in between 2.2.0 and 2.5.0. after bumping sphinx-sitemap to 2.5.0, the following commands can complete without errors: ``` poetry lock make preview ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12705	2023-02-06 15:50:48 +02:00
Anna Stuchlik	c772563cb8	doc: add the information about patch releases	2023-02-06 14:47:39 +01:00
Botond Dénes	cb2a129371	Merge 'Fix inefficiency when rebuilding table statistics with compaction groups' from Raphael "Raph" Carvalho [table: Fix disk-space related metrics](`529a1239a9`) fixes the table's disk-space-related metrics, whereas the second patch fixes an inefficiency when computing statistics which can be triggered with multiple compaction groups. Closes #12718 * github.com:scylladb/scylladb: table: Fix inefficiency when rebuilding statistics with compaction groups table: Fix disk-space related metrics	2023-02-06 15:11:48 +02:00
Avi Kivity	6bc5536bd8	Revert "Update seastar submodule" This reverts commit `b4559a6992`. It breaks some raft tests. Fixes #12741.	2023-02-06 14:56:44 +02:00
Botond Dénes	5a9f75aac6	Update tools/java submodule * tools/java 1c4e1e7a7d...f0bab7af66 (1): > Fix port option in SSTableLoader	2023-02-06 14:18:52 +02:00
Wojciech Mitros	ef1dac813b	udf: also check reducefunc to confirm that a UDF is not used in a UDA When dropping a UDF we're checking if it's not being used in any UDAs and fail otherwise. However, we're only checking its state function and final function, and it may also be used as its reduce function. This patch adds the missing checks and a test for them.	2023-02-06 13:02:54 +01:00
Wojciech Mitros	49077dd144	udf: fix dropping UDFs that share names with other UDFs used in UDAs Currently, when dropping a function, we only check if there exists an aggregate that uses a function with the same name as its state function or final function. This may cause the drop to fail even when it's just another UDF with the same name that's used in the aggregate, even when the actual dropped function is not used there. This patch fixes this by checking not only the names of the UDA's sfunc and finalfunc, but also their argument types.	2023-02-06 13:02:53 +01:00
Wojciech Mitros	8791b0faf5	pytest: add optional argument for new_function argument types When multiple functions with the same name but different argument types are created, the default drop statement for these functions will fail because it does not include the argument types. With this change, this problem can be worked around by specifying argument types when creating the function, as this will cause the drop statement to include them.	2023-02-06 13:02:19 +01:00
Botond Dénes	8efa9b0904	Merge 'Avoid qctx from view-builder methods of system_keyspace' from Pavel Emelyanov The system_keyspace defines several auxiliary methods to help view_builder update system.scylla_views_builds_in_progress and system.built_views tables. All use the global qctx. It only takes adding view_builder -> system_keyspace dependency in order to de-static all those helpers and let them use query-processor from it, not the qctx. Closes #12728 * github.com:scylladb/scylladb: system_keyspace: De-static calls that update view-building tables storage_service: Coroutinize mark_existing_views_as_built() api: Unset column_family endpoints api: Carry sharded<db::system_keyspace> reference over view_builder: Add system_keyspace dependency	2023-02-06 12:44:40 +02:00
Botond Dénes	e247e15ec1	Merge 'Method to create and start task manager task' from Aleksandra Martyniuk In most cases, tasks manager's tasks are started just after they are created. Thus, to reduce boilerplate required for creating and starting tasks, tasks::task_manager::module::make_and_start_task method is added. Repair tasks are modified to use the method where possible. Closes #12729 * github.com:scylladb/scylladb: repair: use tasks::task_manager::module::make_and_start_task for repair tasks tasks: add task_manager::module::make_and_start_task method	2023-02-06 12:38:35 +02:00
Yaniv Kaul	9039b94790	docs: dev - how to test your tests documentation Short paragraph on how to develop tests and ensure they are solid. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes #12746	2023-02-06 12:07:43 +02:00
Avi Kivity	1e6cc9ca61	Merge 'storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops' from Asias He Since `97bb2e47ff` (storage_service: Enable Repair Based Node Operations (RBNO) by default for replace), RBNO was enabled by default for replace ops. After more testing, we decided to enable repair based node operations by default for all node operations. Closes #12173 * github.com:scylladb/scylladb: storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops test: Increase START_TIMEOUT test: Increase max-networking-io-control-blocks storage_service: Check node has left in node_ops_cmd::decommission_done repair: Use remote dc neighbors for everywhere strategy	2023-02-06 10:42:52 +02:00
Botond Dénes	511c0123a2	Merge 'Add compaction module to task manager' from Aleksandra Martyniuk Introduces task manager's compaction module. That's an initial part of integration of compaction with task manager. When fully integrated, task manager will allow user to track compaction operations, check status and progress of each individual one. It will help with creating an asynchronous version of rest api that forces any compaction. Currently, users can see with /task_manager/list_modules api call that compaction is one of the modules accessible through task manager. They won't get any additional information though, since compaction tasks are not created yet. A shared_ptr to compaction module is kept in compaction manager. Closes #12635 * github.com:scylladb/scylladb: compaction: test: pass task_manager to compaction_manager in test environment compaction: create and register task manager's module for compaction tasks: add task_manager constructor without arguments	2023-02-06 09:25:05 +02:00
Botond Dénes	cdd8b0fa35	Merge 'SSTable set improvements' from Raphael "Raph" Carvalho Makes sstable_set::all() interface robust, and introduces sstable_set::size() to avoid copies when retrieving set size. Closes #12716 * github.com:scylladb/scylladb: treewide: Use new sstable_set::size() wherever possible sstables: Introduce sstable_set::size() sstables: Fix fragility of sstable_set::all() interface	2023-02-06 08:24:00 +02:00
Avi Kivity	f73e2c992f	Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec This series switches memtable and cache to use a new representation for mutation data, called `mutation_partition_v2`. In this representation, range tombstone information is stored in the same tree as rows, attached to row entries. Each entry has a new tombstone field, which represents range tombstone part which applies to the interval between this entry and the previous one. See docs/dev/mvcc.md for more details about the format. The transient mutation object still uses the old model in order to avoid work needed to adapt old code to the new model. It may also be a good idea to live with two models, since the transient mutation has different requirements and thus different trade-offs can be made. Transient mutation doesn't need to support eviction and strong exception guarantees, so its algorithms and in-memory representation can be simpler. This allows us to incrementally evict range tombstone information. Before this series, range tombstones were accumulated and evicted only when the whole partition entry was evicted. This could lead to inefficient use of cache memory. Another advantage of the new representation is that reads don't have to lookup range tombstone information in a different tree while reading. This leads to simpler and more efficient readers. There are several disadvantages too. Firstly, rows_entry is now larger by 16 bytes. Secondly, update algorithms are more complex because they need to deoverlap range tombstone information. Also, to handle preemption and provide strong exception guarantees, update algorithms may need to allocate sentinel entries, which adds complexity and reduces performance. The memtable reader was changed to use the same cursor implementation which cache uses, for improved code reuse and reducing risk of bugs due to discrepancy of algorithms which deal with MVCC. 
Remaining work: - performance optimizations to apply_monotonically() to avoid regressions - performance testing - preemption support in apply_to_incomplete (cache update from memtable) Fixes #2578 Fixes #3288 Fixes #10587 Closes #12048 * github.com:scylladb/scylladb: test: mvcc: Extend some scenarios with exhaustive consistency checks on eviction test: mvcc: Extract mvcc_container::allocate_in_region() row_cache, lru: Introduce evict_shallow() test: mvcc: Avoid copies of mutation under failure injection test: mvcc: Add missing logalloc::reclaim_lock to test_apply_is_atomic mutation_partition_v2: Avoid full scan when applying mutation to non-evictable Pass is_evictable to apply() tests: mutation_partition_v2: Introduce test_external_memory_usage_v2 mirroring the test for v1 tests: mutation: Fix test_external_memory_usage() to not measure mutation object footprint tests: mutation_partition_v2: Add test for exception safety of mutation merging tests: Add tests for the mutation_partition_v2 model mutation_partition_v2: Implement compact() cache_tracker: Extract insert(mutation_partition_v2&) mvcc, mutation_partition: Document guarantees in case merging succeeds mutation_partition_v2: Accept arbitrary preemption source in apply_monotonically() mutation_partition_v2: Simplify get_continuity() row_cache: Distinguish dummy insertion site in trace log db: Use mutation_partition_v2 in mvcc range_tombstone_change_merger: Introduce peek() readers: Extract range_tombstone_change_merger mvcc: partition_snapshot_row_cursor: Handle non-evictable snapshots mvcc: partition_snapshot_row_cursor: Support digest calculation mutation_partition_v2: Store range tombstones together with rows db: Introduce mutation_partition_v2 doc: Introduce docs/dev/mvcc.md db: cache_tracker: Introduce insert() variant which positions before existing entry in the LRU db: Print range_tombstone bounds as position_in_partition test: memtable_test: Relax test_segment_migration_during_flush test: 
cache_flat_mutation_reader: Avoid timestamp clash test: cache_flat_mutation_reader_test: Use monotonic timestamps when inserting rows test: mvcc: Fix sporadic failures due to compact_for_compaction() test: lib: random_mutation_generator: Produce partition tombstone less often test: lib: random_utils: Introduce with_probability() test: lib: Improve error message in has_same_continuity() test: mvcc: mvcc_container: Avoid UB in tracker() getter when there is no tracker test: mvcc: Insert entries in the tracker test: mvcc_test: Do not set dummy::no on non-clustering rows mutation_partition: Print full position in error report in append_clustered_row() db: mutation_cleaner: Extract make_region_space_guard() position_in_partition: Optimize equality check mvcc: Fix version merging state resetting mutation_partition: apply_resume: Mark operator bool() as explicit	2023-02-05 22:33:10 +02:00
Michał Chojnowski	5edf965526	locator: token_metadata: unify get_address_ranges() and get_ranges() get_address_ranges() and get_ranges() perform almost the same computation. They return the same ranges -- the only difference is that get_address_ranges() returns them in unspecified order, while get_ranges() returns them in sorted order. Therefore the result of get_ranges() is also a valid result for get_address_ranges(), and the two functions can be unified to avoid code duplication. This patch does just that.	2023-02-04 22:55:08 +01:00
Michał Chojnowski	9e57b21e0c	locator: token_metadata: get rid of a quadratic behaviour in get_address_ranges() Some callees of update_pending_ranges use the variant of get_address_ranges() which builds a hashmap of all <endpoint, owned range> pairs. For everywhere_topology, the size of this map is quadratic in the number of endpoints, making it big enough to cause contiguous allocations of tens of MiB for clusters of realistic size, potentially causing trouble for the allocator (as seen e.g. in #12724). This deserves a correction. This patch removes the quadratic variant of get_address_ranges() and replaces its uses with its linear counterpart. Refs #10337 Refs #10817 Refs #10836 Refs #10837 Fixes #12724	2023-02-04 22:38:04 +01:00
Aleksandra Martyniuk	f3fa0d21ef	repair: use tasks::task_manager::module::make_and_start_task for repair tasks Use tasks::task_manager::module::make_and_start_task to create and start repair tasks. Delete start_repair_task static function which did this before.	2023-02-04 14:33:17 +01:00
Aleksandra Martyniuk	cb3b6cdc1a	tasks: add task_manager::module::make_and_start_task method In most cases, tasks manager's tasks are started just after they are created. Thus, to reduce boilerplate required for creating and starting tasks, make_and_start_task method is added.	2023-02-04 14:23:51 +01:00
Jan Ciolek	2a5ed115ca	cql/query_options: add a check for missing bind marker name There was a missing check in validation of named bind markers. Let's say that a user prepares a query like: ```cql INSERT INTO ks.tab (pk, ck, v) VALUES (:pk, :ck, :v) ``` Then they execute the query, but specify only values for `:pk` and `:ck`. We should detect that a value for :v is missing and throw an invalid_request_exception. Until now there was no such check; in case of a missing variable, invalid `query_options` were created and Scylla crashed. Sadly it's impossible to create a regression test using `cql-pytest` or `boost`. `cql-pytest` uses the python driver, which silently ignores missing named bind variables, deciding that the user meant to send an UNSET_VALUE for them. When given values like `{'pk': 1, 'ck': 2}`, it will automatically extend them to `{'pk': 1, 'ck': 2, 'v': UNSET_VALUE}`. In `boost` I tried to use `cql_test_env`, but it only has methods which take valid `query_options` as a parameter. I could create a separate unit test for the creation and validation of `query_options` but it won't be a true end-to-end test like `cql-pytest`. The bug was found using the rust driver, the reproducer is available in the issue description. Fixes: #12727 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #12730	2023-02-04 02:13:34 +02:00
Alejo Sanchez	346d02b477	raft conf error injection for snapshot To trigger snapshot limit behavior provide an error injection to set with one-shot. Note this effectively changes it and there is no revert. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-03 22:33:33 +01:00
Pavel Emelyanov	d021aaf34d	system_keyspace: De-static calls that update view-building tables There's a bunch of them used mainly by view_builder and also by the API and storage_service. All use the global qctx to do their job; now that the callers have main-local sharded<system_keyspace> references, they can be made non-static. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-03 21:56:54 +03:00
Pavel Emelyanov	e2f51ce43e	storage_service: Coroutinize mark_existing_views_as_built() It's a start-only method. Making it coroutine helps further patching. Also restrict the call to be shard-0 only, it's such anyway but lets the code have less nested coroutinized lambdas. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-03 21:55:10 +03:00
Andrii Patsula	e420dbf10b	service/raft: raft_group_registry: use recent_entries_map to store rate_limits in pinger. Fixes #12309	2023-02-03 19:04:51 +01:00
Andrii Patsula	c95066a410	utils: introduce recent_entries_map datatype to track least recent visited entries.	2023-02-03 19:04:32 +01:00
Pavel Emelyanov	b347a0cf0b	api: Unset column_family endpoints The API calls in question will use system keyspace, that starts before (and thus stops after) and nowadays indirectly uses database instance that also starts earlier (and also stops later), so this avoids potential dangling references. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-03 18:59:28 +03:00
Pavel Emelyanov	eac2e453f2	api: Carry sharded<db::system_keyspace> reference over There's the column_family/get_built_indexes call that calls a system keyspace method to fetch data from scylla_views_builds_in_progress table, so the system keyspace reference will be needed in the API handler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-03 18:57:43 +03:00
Pavel Emelyanov	bbbeba103b	view_builder: Add system_keyspace dependency The view builder updates system.scylla_views_builds_in_progress and .built_views tables and thus needs the system keyspace instance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-02-03 18:55:58 +03:00
Aleksandra Martyniuk	12789adb95	compaction: test: pass task_manager to compaction_manager in test environment Each instance of compaction manager should have compaction module pointer initialized. All constructors get task_manager reference with which the module is created.	2023-02-03 15:15:11 +01:00
Raphael S. Carvalho	5a784c3c6d	treewide: Use new sstable_set::size() wherever possible That's the preferred alternative because it's zero copy. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-03 10:38:04 -03:00
Raphael S. Carvalho	909d1975af	sstables: Introduce sstable_set::size() Preferred alternative to sstable_set->all()->size(), which may involve copying elements from a single set or multiple ones if compound_sstable_set is used. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-03 10:38:00 -03:00
Asias He	e7d5e508bc	storage_service: Enable Repair Based Node Operations (RBNO) by default for all node ops Since `97bb2e47ff` (storage_service: Enable Repair Based Node Operations (RBNO) by default for replace), RBNO was enabled by default for replace ops. After more testing, we decided to enable repair based node operations by default for all node operations.	2023-02-03 21:15:08 +08:00
Asias He	fc60484422	test: Increase START_TIMEOUT It is observed that CI machine is slow to run the test. Increase the timeout of adding servers.	2023-02-03 21:15:08 +08:00
Aleksandra Martyniuk	47ef689077	compaction: create and register task manager's module for compaction As an initial part of integration of compaction with task manager, compaction module is added. Compaction module inherits from tasks::task_manager::module and shared_ptr to it is kept in compaction manager. No compaction tasks are created yet.	2023-02-03 13:52:30 +01:00
Aleksandra Martyniuk	6233823cc7	tasks: add task_manager constructor without arguments Sometimes, e.g. for tests, we may need to create task_manager without main-specific arguments.	2023-02-03 13:52:30 +01:00
Aleksandra Martyniuk	8cb319030a	test: rest_api: check if repair of system keyspace returns before corresponding task is created	2023-02-03 13:35:13 +01:00
Aleksandra Martyniuk	aab704d255	repair: finish repair immediately on local keyspaces System keyspace is a keyspace with local replication strategy and thus it does not need to be repaired. It is possible to invoke repair of this keyspace through the api, which leads to runtime error since peer_events and scylla_table_schema_history have different sharding logic. For keyspaces with local replication strategy repair_service::do_repair_start returns immediately.	2023-02-03 13:35:13 +01:00
Kamil Braun	61dfc9c10f	Merge 'docs: extend the warning on using "nodetool removenode"' from Anna Stuchlik This PR extends the description of using `nodetool removenode` to remove an unavailable node, as requested in https://github.com/scylladb/scylla-enterprise/issues/2338. Closes #12410 * github.com:scylladb/scylladb: docs: improve the warning and add a comment to update/remove the information in the future doc: extend the information on removing an unavailable node docs: extend the warning on the Remove a Node page	2023-02-03 12:00:17 +01:00
Kamil Braun	d991f71910	test.py: rely on ScyllaCluster.is_dirty flag for recycling clusters `TopologyTest`s (used by `topology/` suite and friends) already relied on the `is_dirty` flag stored in `ScyllaCluster` thanks to `ScyllaClusterManager` (which passes the flag when returning a cluster to the pool). But `PythonTest`s (cql-pytest/ suite) and `CQLApprovalTest`s (cql/ suite) had different ways to decide whether a cluster should be recycled. For example, `PythonTest` would recycle a cluster if `after_test` raised an exception. This depended on a post-condition check made by `after_test`: it would query the number of keyspaces and throw an exception if it was different than when the test started. If the cluster (which for `PythonTest` is always single-node) was dead, this query would fail. However, we modified the behavior of `after_test` in earlier commits - it no longer performs the post-condition check on dirty clusters. So it's also no longer reliable to use the exception raised by `after_test` to decide that we should recycle the cluster. Unify the behavior of `PythonTest` and `CQLApprovalTest` with what `TopologyTest` does - using the `is_dirty` flag to decide that we should recycle a cluster. Thanks to earlier commits, this flag is set to `True` whenever a test fails, so it should cover most cases where we want to recycle a cluster. (The only case not currently covered is if a non-dirty cluster crashes after we perform the keyspace post-condition check, which seems quite improbable.) Note that this causes us to recycle clusters more often in these tests: previously, when a `PythonTest` or `CQLApprovalTest` failed, but the cluster was still alive and the post-condition check passed, we would use the cluster for the next test. Now we recycle a cluster whenever a test that used it fails.	2023-02-03 11:49:35 +01:00
Kamil Braun	8442cccd37	test/topology: don't drop random_tables keyspace after a failed test After a failed test, the cluster might be down so dropping the random_tables keyspace might be impossible. The cluster will be marked dirty so it doesn't matter that we leave any garbage there. Note: we already drop only if the cluster is not marked as dirty, and we mark the cluster as dirty after a failed test. However, marking the cluster as dirty after a failed test happens at the end of the `manager` fixture and the `random_tables` fixture depends on the `manager` fixture, so at the end of the `random_tables` fixture the cluster still wasn't marked as dirty. Hence the fixture must access the pytest-provided `request` fixture where we store a flag whether the test has failed.	2023-02-03 11:49:35 +01:00
Anna Stuchlik	84e2178fe9	docs: improve the warning and add a comment to update/remove the information in the future	2023-02-03 09:33:07 +01:00
Botond Dénes	c270c305c0	Merge 'Allow entire test suite to run with multiple compaction groups' from Raphael "Raph" Carvalho New test/lib/scylla_test_case.hh, introduced in "tests: Add command line options for Scylla unit tests", allows extension of the command line options provided by Seastar testing framework. It allows all boost tests to process additional options without changing a single line of code. Patch "test: Add x-log2-compaction-groups to Scylla test command line options" builds on that, allowing all test cases to run with N compaction groups. Again, without changing a line of code in the tests. Now all you have to do is: ./build/dev/test/boost/sstable_compaction_test -- --smp 1 --x-log2-compaction-groups 1 ./test.py --mode=dev --x-log2-compaction-groups 1 --verbose And it will run the test cases with as many groups as you wish. ./test.py passes successfully with parameter --x-log2-compaction-groups 1. Closes #12369 * github.com:scylladb/scylladb: test.py: Add option to run scylla tests with multiple compaction groups test: Add x-log2-compaction-groups to Scylla test command line options test: Enable Scylla test command line options for boost tests tests: Add command line options for Scylla unit tests replica: table: Add debug log for number of compaction groups test: sstable_compaction_test: Fix indentation test: sstable_compaction_test: Make it work with compaction groups test: test_bloom_filter: Fix it with multiple compaction groups test: memtable_test: Fix it with multiple compaction groups	2023-02-03 06:35:15 +02:00
Kefu Chai	d2e3a60428	dist/debian: drop unused Makefile variable `job` was introduced back in `782ebcece4`, so we could consume the option specified in DEB_BUILD_OPTIONS environmental variable. but now that we always repackage the artifacts prebuilt in the relocatable package. we don't build them anymore when packaging debian packages. see `9388f3d626` . and `job` is not passed to `ninja` anymore. so, in this change, `job` is removed from debian/rules as well, as it is not used. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-03 11:18:51 +08:00
Kefu Chai	75eaee040b	dist/debian: bump up debhelper compatibility level to 10 to silence the warnings from dh tools, like ``` dh: warning: Compatibility levels before 10 are deprecated (level 9 in use) dh_clean dh_clean: warning: Compatibility levels before 10 are deprecated (level 9 in use) ``` see https://manpages.debian.org/testing/debhelper/debhelper-compat-upgrade-checklist.7.en.html for the changes in between v9 and v10, none of them applies to our use case. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-03 11:04:43 +08:00
Raphael S. Carvalho	55a8421e3d	table: Fix inefficiency when rebuilding statistics with compaction groups Whenever any compaction group has its SSTable set updated, table's rebuild_statistics() is called and it inefficiently iterates through SSTable set of all compaction groups. Now each compaction group keeps track of its statistics, such that table's rebuild_statistics() only need to sum them up. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-02 17:10:11 -03:00
Raphael S. Carvalho	529a1239a9	table: Fix disk-space related metrics total disk space used metric is incorrectly telling the amount of disk space ever used, which is wrong. It should tell the size of all sstables being used + the ones waiting to be deleted. live disk space used, by this definition, shouldn't account for the ones waiting to be deleted. and live sstable count shouldn't account for sstables waiting to be deleted. Fix all that. Fixes #12717. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-02 16:38:45 -03:00
Raphael S. Carvalho	55cd163392	sstables: Fix fragility of sstable_set::all() interface all() was returning lw_shared_ptr<sstable_list> which allowed caller to modify sstable set content, which will mess up everything. sstable_set is supposed to be only modified through insert and erase functions. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-02 15:46:08 -03:00
Alejo Sanchez	9ceb6aba81	test/pylib: one-shot error injection helper Existing helper with async context manager only worked for non one-shot error injections. Fix it and add another helper for one-shot without a context manager. Fix tests using the previous helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-02-02 16:37:21 +01:00
Kamil Braun	a9dbd89478	test/pylib: mark cluster as dirty after a failed test We don't expect the cluster to be functioning at all after a failed test. The whole cluster might have crashed, for example. In these situations the framework would report multiple errors (one for the actual failure, another for a failed post-condition check because the cluster was down) which would only obscure the report and make debugging harder. It's also not safe in general to reuse the cluster in another test - if the previous test failed, we should not assume that it's in a valid state. Therefore, mark the cluster as dirty after a failed test. This will let us recycle the cluster based on the dirty flag and it will disable post-condition check after a failed test (which is only done on non-dirty clusters). To implement this in topology tests, we use the `pytest_runtest_makereport` hook which executes after a test finishes but before fixtures finish. There we store a test-failed flag in a stash provided by pytest, then access the flag in the `manager` fixture.	2023-02-02 16:35:55 +01:00
Kamil Braun	977375d13f	test: pylib, topology: don't perform operations after test on a dirty cluster `after_test` would count keyspaces and check that the number is the same as before the test started. The `random_tables` fixture after a test would drop the keyspace that it created before the test. These steps are done to ensure that the cluster is ready to be reused for the next steps. If the cluster is dirty, it cannot be reused anyway, so the steps are unnecessary. They might also be impossible in general - a dirty cluster might be completely dead. For example, the attempts to drop a keyspace from `random_tables` would cause confusing errors if a test failed when it tried to restart a node while all nodes were down, making it harder to find the 'real' failure. Therefore don't perform these operations if the cluster is dirty.	2023-02-02 15:59:02 +01:00
Kamil Braun	f4b56cddde	test/pylib: print cluster at the end of test - print the cluster used by the test in `after_test` - if cluster setup fails in `before_test`, print the cluster together with the exception (`after_test` is not executed if `before_test` fails)	2023-02-02 15:59:02 +01:00
Anna Stuchlik	f4c5cdf21b	doc: add the info about the minor versions	2023-02-02 14:16:40 +01:00
Avi Kivity	f5fd0769b2	Merge 'cql3: expr: don't pass empty evaluation_inputs in is_one_of' from Jan Ciołek `evaluation_inputs` is a struct which contains data needed to evaluate expressions - values of columns, bind variables and other data. `is_one_of()` is a function used to evaluate `IN` restrictions. It checks whether the LHS is one of elements on the RHS list. Generally when evaluating expressions we get the `evaluation_inputs` as an argument and we should pass them along to any functions that evaluate subexpressions. `is_one_of()` got the inputs as an argument, but didn't pass them along to `equal()`, instead it creates new empty `evaluation_inputs{}` and gives that to `equal()`. At first [I thought this was a bug](https://github.com/scylladb/scylladb/pull/12356#discussion_r1084300969) - with missing information there could be a crash if `equal()` tried to evaluate an expression with a `bind_variable`. It turns out that in this particular case `equal()` won't use the `evaluation_inputs` at all. The LHS and RHS passed to it are just constant values, which were already evaluated to serialized bytes before calling `evaluate()`, so there is no bug. It's still better to pass the inputs argument along if possible. If in the future `equal()` required these inputs for some reason, missing inputs could lead to an unexpected crash. I couldn't find any tests that would detect this case, so such a bug could stay undetected until an unhappy user finds it because their cluster crashed. I added some tests to make sure that it's covered from now on. Closes #12701 * github.com:scylladb/scylladb: cql-pytest: test filtering using list with bind variable test/expr_test: test <int_value> IN (123, ?, 456) cql3: expr: don't pass empty evaluation_inputs in is_one_of	2023-02-02 11:40:20 +02:00
Botond Dénes	9efbcfa190	Merge 'test/alternator: tests for Limit parameter of ListStreams operation' from Nadav Har'El The first patch in this series enables a previously-skipped test for what happens with Limit=0. The test passes. The second patch adds an xfailing test for very large Limit. Closes #12625 * github.com:scylladb/scylladb: test/alternator: xfailing test for huge Limit in ListStreams alternator/test: un-skip test of zero Limit in ListStreams	2023-02-02 07:02:28 +02:00
Asias He	6d7b4a896e	test: Increase max-networking-io-control-blocks The number is too low in the test and we saw the "rpc: Connection is closed" error. Increase the number to the default 1000.	2023-02-02 11:11:22 +08:00
Asias He	693d71984f	storage_service: Check node has left in node_ops_cmd::decommission_done In tests with ring delay zero, it is possible that when the node_ops_cmd::decommission_done is received, the nodes remaining in the cluster haven't learned the LEFT status for the leaving node yet. To guarantee that when the decommission restful api returns, all the nodes that participated in the decommission operation have learned the LEFT status, a check in the node_ops_cmd::decommission_done is added in this patch. After this patch, the decommission tests which start multiple decommissions in a loop with ring delay zero in test/topology/test_topology.py pass.	2023-02-02 11:11:22 +08:00
Asias He	e2e5017c54	repair: Use remote dc neighbors for everywhere strategy Consider: - Bootstrap n1 in dc 1 - Create ks with EverywhereStrategy - Bootstrap n2 in dc 2 Since n2 is the first node in dc2, there will be no local dc nodes to sync data from. In this case, n2 should sync data with node in dc 1 even if it is in the remote dc.	2023-02-02 11:10:50 +08:00
Raphael S. Carvalho	e3923a9caf	test.py: Add option to run scylla tests with multiple compaction groups The tests can now optionally run with multiple groups via option --x-log2-compaction-groups. This includes boost tests and the ones which run against either one (e.g. cql) or many instances (e.g. topology). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:17:16 -03:00
Raphael S. Carvalho	f510cab5f0	test: Add x-log2-compaction-groups to Scylla test command line options Now any boost test can run with multiple compaction groups by default, without any change in the boost test cases whatsoever. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Raphael S. Carvalho	3c5afb2d5c	test: Enable Scylla test command line options for boost tests We have enabled the command line options without changing a single line of code, we only had to replace old include with scylla_test_case.hh. Next step is to add the x-log2-compaction-groups option, which will determine the number of compaction groups to be used by all instantiations of replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Raphael S. Carvalho	a2c60b6cf5	tests: Add command line options for Scylla unit tests Scylla unit tests are limited to command line options defined by Seastar testing framework. For extending the set of options, Scylla unit tests can now include test/lib/scylla_test_case.hh instead of seastar/testing/test_case.hh, which will "hijack" the entry point and will process the command line options, then feed the remaining options into seastar testing entry point. This is how it looks like when asking for help: Scylla tests additional options: --help Produces help message --x-log2-compaction_groups arg (=0) Controls static number of compaction groups per table per shard. For X groups, set the option to log (base 2) of X. Example: Value of 3 implies 8 groups. Running 1 test case... App options: -h [ --help ] show help message --help-seastar show help message about seastar options --help-loggers print a list of logger names and exit --random-seed arg Random number generator seed --fail-on-abandoned-failed-futures arg (=1) Fail the test if there are any abandoned failed futures Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
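The help text in a2c60b6cf5 states that the option takes the base-2 logarithm of the group count ("Value of 3 implies 8 groups"). A minimal Python sketch of that arithmetic follows; the function name `compaction_groups_from_log2` is hypothetical and is not taken from the Scylla sources.

```python
def compaction_groups_from_log2(x_log2: int) -> int:
    """Return the number of compaction groups for a given log2 value.

    Mirrors the relation described in the help text: the option holds
    log2(X) for X groups, so X = 2 ** x_log2.
    """
    if x_log2 < 0:
        raise ValueError("log2 value must be non-negative")
    return 1 << x_log2


# The default of 0 means a single group; 3 means 8 groups.
for value in (0, 1, 3):
    print(value, "->", compaction_groups_from_log2(value))
```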
Raphael S. Carvalho	8988795b08	replica: table: Add debug log for number of compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Raphael S. Carvalho	a7ddedb998	test: sstable_compaction_test: Fix indentation Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Raphael S. Carvalho	c455e43f49	test: sstable_compaction_test: Make it work with compaction groups Tests using replica::table::add_sstable_and_update_cache() cannot rely on the sstable being added to a single compaction group, if the test was forced to run with multiple groups. Additionally let's remove try_flush_memtable_to_sstable() which is restricted to a single group, allowing the entire test to now pass with multiple groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Raphael S. Carvalho	c25d8614a9	test: test_bloom_filter: Fix it with multiple compaction groups With many compaction groups, the data:filter size ratio becomes small with a small number of keys. Test is adjusted to run another check with more keys if efficiency is higher than expected, but not lower. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Raphael S. Carvalho	2d2460046b	test: memtable_test: Fix it with multiple compaction groups With compaction groups, automatic flushing may not pick the user table. Fix it by using explicit flush. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Botond Dénes	34cdcaffae	reader_concurrency_semaphore: un-bless permits when they become inactive When the memory consumption of the semaphore reaches the configured serialize threshold, all but the blessed permit is blocked from consuming any more memory. This ensures that past this limit, only one permit at a time can consume memory. Such a blessed permit can be registered inactive. Before this patch, it would still retain its blessed status when doing so. This could result in this permit being re-queued for admission if it was evicted in the meanwhile, potentially resulting in a complete deadlock of the semaphore: * admission queue permits cannot be admitted because there is no memory * admitter permits are all queued on memory, as none of them are blessed This patch strips the blessed status from the permit when it is registered as inactive. It also adds a unit test to verify this happens. Fixes: #12603 Closes #12694	2023-02-01 21:02:17 +02:00
Botond Dénes	693c22595a	sstables/sstable: validate_checksums(): force-check EOF EOF is only guaranteed to be set if one tried to read past the end of the file. So when checking for EOF, also try to read some more. This should force the EOF flag into a correct value. We can then check that the read yielded 0 bytes. This should ensure that `validate_checksums()` will not falsely declare the validation to have failed. Fixes: #11190 Closes #12696	2023-02-01 20:52:46 +02:00
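The idea in 693c22595a, attempting one more read and treating a 0-byte result as end of file, can be illustrated with a small Python analogy. This is only a sketch: Python streams have no EOF flag, the helper `at_eof` is hypothetical, and the actual fix lives in Scylla's C++ `validate_checksums()`.

```python
import io


def at_eof(stream: io.BufferedIOBase) -> bool:
    """Return True only if one more read yields 0 bytes.

    Note: when the stream is not at EOF, this consumes one byte.
    """
    extra = stream.read(1)
    return len(extra) == 0


data = io.BytesIO(b"abcd")
data.read(4)          # consumes all bytes, but no read has gone past the end yet
print(at_eof(data))   # the extra read returns no bytes
```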
Nadav Har'El	69517040f7	Merge 'alternator::streams: Sort tables in list_streams to ensure no duplicates' from Calle Wilund Fixes #12601 (maybe?) Sort the set of tables on ID. This should ensure we never generate duplicates in a paged listing here. Can obviously miss things if they are added between paged calls and end up with a "smaller" UUID/ARN, but that is to be expected. Closes #12614 * github.com:scylladb/scylladb: alternator::streams: Special case single table in list_streams alternator::streams: Only sort tables iff limit < # tables or ExclusiveStartStreamArn set alternator::streams: Set default list_streams limit to 100 as per spec alternator::streams: Sort tables in list_streams to ensure no duplicates	2023-02-01 19:47:16 +02:00
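Why sorting on ID avoids duplicates in a paged listing can be sketched as follows. This is a simplified Python illustration, not the Alternator implementation: `list_page` and its parameters are hypothetical names, and the default limit of 100 is the one mentioned in the merge above.

```python
from typing import List, Optional, Tuple


def list_page(ids: List[str], limit: int = 100,
              exclusive_start: Optional[str] = None) -> Tuple[List[str], Optional[str]]:
    """Return one page of IDs plus the ID to resume from (None when done).

    Because the IDs are sorted, resuming strictly after the last returned ID
    can never return an entry twice, even if the underlying set is iterated
    in a different order on every call.
    """
    ordered = sorted(ids)
    if exclusive_start is not None:
        ordered = [i for i in ordered if i > exclusive_start]
    page = ordered[:limit]
    last = page[-1] if len(ordered) > limit else None
    return page, last


# Walk all pages of a small, unordered set with a page size of 2.
tables = ["t3", "t1", "t5", "t2", "t4"]
seen, start = [], None
while True:
    page, start = list_page(tables, limit=2, exclusive_start=start)
    seen.extend(page)
    if start is None:
        break
print(seen)
```

As the commit message notes, an entry added between calls with an ID smaller than the resume point is missed; the sketch has the same property.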
Wojciech Mitros	86c61828e6	udt: disallow dropping a user type used in a user function Currently, nothing prevents us from dropping a user type used in a user function, even though doing so may make us unable to use the function correctly. This patch prevents this behavior by checking all function argument and return types when executing a drop type statement and preventing it from completing if the type is referenced by any of them. Closes #12680	2023-02-01 18:53:29 +02:00
Kefu Chai	53366db6c6	build: disable Seastar's io_uring backend again this partially reverts `49157370bc` according to the reports in #12173, at least two developers ran into test failures which are correlated with the latest Seastar change, which enables the io_uring backend by default. they are using linux kernel 6.0.12 and 6.1.7. it's also reported that reverting the commit eedca15f16c3b6eae3d3d8af9510624a93f5d186 in seastar helps. that very commit enables the io_uring by default. although we are not able to identify the exact root cause of the failures in #12173 at this moment, to rule out the potential problem of io_uring should help with further investigation. in this change, io_uring backend is disabled when building Seastar. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12689	2023-02-01 17:36:07 +02:00
Jan Ciolek	ed568f3f70	cql-pytest: test filtering using list with bind variable Add tests which test filtering using IN restriction with a list which contains a bind variable. There are other cql-pytest tests which test IN lists with a bind variable, but it looks like they don't do filtering. IN restrictions on primary key columns are handled in a special way to generate the right ranges. These tests hit a different code path as filtering uses `expr::evaluate()`. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-02-01 16:30:09 +01:00
Jan Ciolek	9eb6746a67	test/expr_test: test <int_value> IN (123, ?, 456) Add tests which test evaluating the IN restriction with a list which contains a bind variable. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-02-01 16:29:32 +01:00
Jan Ciolek	286599fe8b	cql3: expr: don't pass empty evaluation_inputs in is_one_of evaluation_inputs is a struct which contains data needed to evaluate expressions - values of columns, bind variables and other data. is_one_of() is a function used to evaluate IN restrictions. It checks whether the LHS is one of elements on the RHS list. Generally when evaluating expressions we get the evaluation_inputs{} as an argument and we should pass them along to any functions that evaluate subexpressions. is_one_of() got the inputs as an argument, but didn't pass them along to equal(), instead it creates new empty evaluation_inputs{} and gives that to equal(). At first I thought this was a bug - with missing information there could be a crash if equal() tried to evaluate an expression with a bind_variable. It turns out that in this particular case equal() won't use the evaluation_inputs{} at all. The LHS and RHS passed to it are just constant values, which were already evaluated to serialized bytes before calling evaluate(). It's still better to pass the inputs argument along if possible. If in the future equal() required these inputs for some reason, missing inputs could lead to an unexpected crash. I couldn't find any tests that would detect this case, so such a bug could stay undetected until an unhappy user finds it because their cluster crashed. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-02-01 16:20:24 +01:00
Avi Kivity	b4559a6992	Update seastar submodule * seastar 943c09f869...ef24279f03 (6): > Merge 'util/print_safe, reactor: use concept for type constraints and refactory ' from Kefu Chai > Right align the memory diagnostics > Merge 'Add an API for the metrics layer to manipulate metrics dynamically.' from Amnon Heiman > semaphore: assert no outstanding units when moved > build: do not populate package registry by default > build: stop detecting concepts support Closes #12695	2023-02-01 17:19:49 +02:00
Kamil Braun	40142a51d0	test: topology: wait for token ring/group 0 consistency after decommission There was a check for immediate consistency after a decommission operation has finished in one of the tests, but it turns out that also after decommission it might take some time for token ring to be updated on other nodes. Replace the check with a wait. Also do the wait in another test that performs a sequence of decommissions. We won't attempt to start another decommission until every node learns that the previously decommissioned node has left. Closes #12686	2023-02-01 16:49:22 +02:00
Raphael S. Carvalho	1b2140e416	compaction: Fix inefficiency when updating LCS backlog tracker LCS backlog tracker uses STCS tracker for L0. Turns out LCS tracker is calling STCS tracker's replace_sstables() with empty arguments even when higher levels (> 0) only had sstables replaced. This unnecessary call to STCS tracker will cause it to recompute the L0 backlog, yielding the same value as before. As LCS has a fragment size of 0.16G on higher levels, we may be updating the tracker multiple times during incremental compaction, which operates on SSTables on higher levels. Inefficiency is fixed by only updating the STCS tracker if any L0 sstable is being added or removed from the table. This may be fixing a quadratic behavior during boot or refresh, as new sstables are loaded one by one. Higher levels have a substantially higher number of sstables, therefore updating STCS tracker only when level 0 changes significantly reduces the number of times L0 backlog is recomputed. Refs #12499. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12676	2023-02-01 15:19:07 +02:00
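The guard described in 1b2140e416, forwarding to the L0 tracker only when level-0 sstables are actually added or removed, can be sketched in Python. All class and method names below are hypothetical stand-ins for illustration; the real trackers are C++ classes in Scylla's compaction code.

```python
from typing import List, NamedTuple


class SSTable(NamedTuple):
    name: str
    level: int


class L0Tracker:
    """Stand-in for the STCS tracker used for level 0; counts recomputations."""

    def __init__(self) -> None:
        self.recomputations = 0

    def replace_sstables(self, old: List[SSTable], new: List[SSTable]) -> None:
        self.recomputations += 1  # the real tracker would recompute the L0 backlog


class LeveledTracker:
    """Stand-in for the LCS tracker, which delegates level 0 to L0Tracker."""

    def __init__(self) -> None:
        self.l0 = L0Tracker()

    def replace_sstables(self, old: List[SSTable], new: List[SSTable]) -> None:
        l0_old = [s for s in old if s.level == 0]
        l0_new = [s for s in new if s.level == 0]
        # Skip the L0 tracker entirely when only higher levels changed.
        if l0_old or l0_new:
            self.l0.replace_sstables(l0_old, l0_new)


tracker = LeveledTracker()
tracker.replace_sstables([SSTable("a", 2)], [SSTable("b", 2)])  # higher level only
tracker.replace_sstables([], [SSTable("c", 0)])                 # touches level 0
print(tracker.l0.recomputations)
```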
Michael Hollander	5d1e40bc18	Added missing full stop to SimpleSnitch paragraph Closes #12692	2023-02-01 13:21:49 +02:00
Nadav Har'El	132af20057	Merge 'test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests' from Kamil Braun `ScyllaClusterManager` is used to run a sequence of test cases from a single test file. Between two consecutive tests, if the previous test left the cluster 'dirty', meaning the cluster cannot be reused, it would free up space in the pool (using `steal`), stop the cluster, then get a new cluster from the pool. Between the `steal` and the `get`, a concurrent test run (with its own instance of `ScyllaClusterManager`) would start, because there was free space in the pool. This resulted in undesirable behavior when we ran tests with `--repeat X` for a large `X`: we would start with e.g. 4 concurrent runs of a test file, because the pool size was 4. As soon as one of the runs freed up space in the pool, we would start another concurrent run. Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so on. We would have a large number of concurrent runs, even though the original 4 runs didn't finish yet. All of these concurrent runs would compete waiting on the pool, and waiting for space in the pool would take longer and longer (the duration is linear w.r.t number of concurrent competing runs). Tests would then time out because they would have to wait too long. Fix that by using the new `replace_dirty` function introduced to the pool. This function frees up space by returning a dirty cluster and then immediately takes it away to be used for a new cluster. Thanks to this, we will only have at most as many concurrent runs as the pool size. For example with --repeat 8 and pool size 4, we would run 4 concurrent runs and start the 5th run only when one of the original 4 runs finishes, then the 6th run when a second run finishes and so on.
The fix is preceded by a refactor that replaces `steal` with `put(is_dirty=True)` and a `destroy` function passed to the pool (now the pool is responsible for stopping the cluster and releasing its IPs). Fixes #11757 Closes #12549 * github.com:scylladb/scylladb: test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests test/pylib: pool: introduce `replace_dirty` test/pylib: pool: replace `steal` with `put(is_dirty=True)`	2023-02-01 12:37:39 +02:00
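The difference between `put(is_dirty=True)` followed by `get`, and a single `replace_dirty` call, can be sketched with a simplified synchronous pool. This is an illustration under stated assumptions only: the real pool in test/pylib is asynchronous and waits for space instead of raising, and the `build`/`destroy` callbacks and the `total` counter here are hypothetical.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Pool(Generic[T]):
    """Bounded pool: at most max_size objects exist at any time."""

    def __init__(self, max_size: int, build: Callable[[], T],
                 destroy: Callable[[T], None]) -> None:
        self.max_size = max_size
        self.build = build
        self.destroy = destroy
        self.total = 0            # objects handed out plus idle ones
        self.idle: List[T] = []

    def get(self) -> T:
        if self.idle:
            return self.idle.pop()
        if self.total >= self.max_size:
            raise RuntimeError("pool is full")  # the real pool would wait here
        self.total += 1
        return self.build()

    def put(self, obj: T, is_dirty: bool = False) -> None:
        if is_dirty:
            self.destroy(obj)
            self.total -= 1       # frees a slot that any other caller may take
        else:
            self.idle.append(obj)

    def replace_dirty(self, obj: T) -> T:
        # Destroy the dirty object and keep its slot for the caller, so no
        # other caller can grab the freed space in between.
        self.destroy(obj)
        return self.build()


destroyed: List[int] = []
counter = iter(range(100))
pool: Pool[int] = Pool(1, build=lambda: next(counter), destroy=destroyed.append)
first = pool.get()
second = pool.replace_dirty(first)
print(first, second, pool.total, destroyed)
```

With `replace_dirty`, `total` never drops below the number of active runs, which is what keeps the number of concurrent runs bounded by the pool size.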
Anna Stuchlik	b346778ae8	doc: add the missing sudo command	2023-02-01 10:43:39 +01:00
Nadav Har'El	681a066923	test/pylib: put UNIX-domain socket in /tmp The "cluster manager" used by the topology test suite uses a UNIX-domain socket to communicate between the cluster manager and the individual tests. The socket is currently located in the test directory but there is a problem: In Linux the length of the path used as a UNIX-domain socket address is limited to just a little over 100 bytes. In Jenkins runs, the test directory names are very long, and we sometimes go over this length limit and the result is that test.py fails to create this socket. In this patch we simply put the socket in /tmp instead of the test directory. We only need to make this change in one place - the cluster manager, as it already passes the socket path to the individual tests (using the "--manager-api" option). Tested by cloning Scylla in a very long directory name. A test like ./test.py --mode=dev test_concurrent_schema fails before this patch, and passes with it. Fixes #12622 Closes #12678	2023-02-01 12:37:35 +03:00
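The length limit mentioned in 681a066923 comes from the fixed-size `sun_path` field of `sockaddr_un`, which is 108 bytes on Linux including the terminating NUL. A small Python sketch of the check and of placing the socket under the system temporary directory follows; the helper names and the socket file name are hypothetical, not the ones test.py uses.

```python
import os
import tempfile

# Size of sockaddr_un.sun_path on Linux, including the terminating NUL byte.
SUN_PATH_MAX = 108


def fits_unix_socket(path: str) -> bool:
    """Return True if the path is short enough to be used as a UNIX socket address."""
    return len(os.fsencode(path)) < SUN_PATH_MAX


def manager_socket_path(name: str) -> str:
    """Place the socket in the system temporary directory (typically /tmp)."""
    return os.path.join(tempfile.gettempdir(), name)


long_dir = "/jenkins/workspace/" + "very-long-directory-name/" * 5
print(fits_unix_socket(os.path.join(long_dir, "manager.sock")))
print(manager_socket_path("manager.sock"))
```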
Botond Dénes	325246ab2a	Merge 'doc: fix the service name from "scylla-enterprise-server" to "scylla-server"' from Anna Stuchlik Related https://github.com/scylladb/scylladb/issues/12658. This issue fixes the bug in the upgrade guides for the released versions. Closes #12679 * github.com:scylladb/scylladb: doc: fix the service name in the upgrade guide for patch releases versions 2022 doc: fix the service name in the upgrade guide from 2021.1 to 2022.1	2023-02-01 12:37:35 +03:00
Anna Stuchlik	2be131da83	doc: fixes https://github.com/scylladb/scylladb/issues/12672 , fix the redirects to the Cloud docs Closes #12673	2023-02-01 12:37:35 +03:00
Botond Dénes	d8073edbb7	Merge 'cql3, locator: call fmt::format_to() explicitly and include used headers' from Kefu Chai these fixes address the FTBFS of scylla with GCC-13. Closes #12669 * github.com:scylladb/scylladb: cql3/stats: include the used header. cql3, locator: call fmt::format_to() explicitly	2023-02-01 12:37:35 +03:00
Pavel Emelyanov	d065f9f82e	sstables: The generation_type is not formattable If TOC writing hits TOC file conflict it tries to throw an exception with sstable generation in it. However, generation_type is not formattable at all, let alone the {:d} option. This bug generates an obscure 'fmt::v9::format_error (invalid type specifier)' error in unknown location making the debugging hard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12671	2023-02-01 12:37:35 +03:00
Kefu Chai	186ceea009	cql3/selection: construct string_view using char* not size before this change, we construct a sstring from a comma statement, which evaluates to the return value of `name.size()`, but what we expect is `sstring(const char*, size_t)`. in this change * instead of passing the size of the string_view, both its address and size are used * `std::string_view` is constructed instead of sstring, for better performance, as we don't need to perform a deep copy the issue is reported by GCC-13: ``` In file included from cql3/selection/selectable.cc:11: cql3/selection/field_selector.hh:83:60: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result] auto sname = sstring(reinterpret_cast<const char*>(name.begin(), name.size())); ^~~~~~~~~~ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12666	2023-02-01 12:37:35 +03:00
David Garcia	616bf26422	docs: add opensource flag Closes #12656	2023-02-01 12:37:35 +03:00
Anna Stuchlik	e81b586d6a	Merge branch 'scylladb:master' into anna-pinning-workaround	2023-02-01 10:36:44 +01:00
Anna Stuchlik	11a59bcc76	doc: fix the service name in the upgrade guide for patch releases versions 2022	2023-01-31 11:04:21 +01:00
Anna Stuchlik	71ae644d40	doc: fix the service name in the upgrade guide from 2021.1 to 2022.1	2023-01-31 10:46:44 +01:00
Kefu Chai	58b4dc5b9a	cql3/stats: include the used header. otherwise `uint64_t` won't be found when compiling with GCC-13. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-30 21:50:23 +08:00
Kefu Chai	ccc03dd1ec	cql3, locator: call fmt::format_to() explicitly since format_to() is defined in both the fmt and std namespaces, without specifying which one to use, we'd fail to build with the standard library which implements std::format_to(). yes, we are `using namespace std` somewhere. this change should address the FTBFS with GCC-13. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-30 21:50:11 +08:00
Warren Krewenki	8655a8be19	docs: Update suggested AWS instance types in benchmark tips The list of suggested instances had a misspelling of c5d, and didn't include the i4i instances recommended by https://www.scylladb.com/2022/05/09/scylladb-on-the-new-aws-ec2-i4i-instances-twice-the-throughput-lower-latency/ Closes #12664	2023-01-30 14:10:18 +02:00
Botond Dénes	c927eea1d5	Merge 'table: trim ranges for compaction group cleanup' from Benny Halevy This series contains the following changes for trimming the ranges passed to cleanup a compaction group to the compaction group owned token_range. table: compaction_group_for_token: use signed arithmetic Fixes #12595 table: make_compaction_groups: calculate compaction_group token ranges table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries Fixes #12594 Closes #12598 * github.com:scylladb/scylladb: table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries table: make_compaction_groups: calculate compaction_group token ranges dht: range_streamer: define logger as static	2023-01-30 13:11:28 +02:00
Anna Stuchlik	64cc4c8515	docs: fixes https://github.com/scylladb/scylladb/issues/12654 , update the links to the Download Center Closes #12655	2023-01-30 12:45:20 +02:00
Michał Chojnowski	fa7e904cd6	commitlog: fix total_size_on_disk accounting after segment file removal Currently, segment file removal first calls `f.remove_file()` and does `total_size_on_disk -= f.known_size()` later. However, `remove_file()` resets `known_size` to 0, so in effect the freed space is not accounted for. `total_size_on_disk` is not just a metric. It is also responsible for deciding whether a segment should be recycled -- it is recycled only if `total_size_on_disk - known_size < max_disk_size`. Therefore this bug has dire performance consequences: if `total_size_on_disk - known_size` ever exceeds `max_disk_size`, the recycling of commitlog segments will stop permanently, because `total_size_on_disk - known_size` will never go back below `max_disk_size` due to the accounting bug. All new segments from this point will be allocated from scratch. The bug was uncovered by a QA performance test. It isn't easy to trigger -- it took the test 7 hours of constant high load to step into it. However, the fact that the effect is permanent, and degrades the performance of the cluster silently, makes the bug potentially quite severe. The bug can be easily spotted with Prometheus as infinitely rising `commitlog_total_size_on_disk` on the affected shards. Fixes #12645 Closes #12646	2023-01-30 12:20:04 +02:00
Botond Dénes	71ad0dff2b	test/lib/sstable_utils: remove now unused token_generation_for_shard() and friends	2023-01-30 05:03:42 -05:00
Botond Dénes	a03c11234d	test/lib/simple_schema: remove now unused make_keys() and friends	2023-01-30 05:03:42 -05:00
Botond Dénes	4ad3ba52b0	test: migrate to tests::generate_partition_key[s]() Use the newly introduced key generation facilities, instead of the old inflexible alternatives and hand-rolled code. Most of the migrations are mechanical, but there are two tests that were tricky to migrate: * sstable_compaction_test.sstable_run_based_compaction_test * sstable_mutation_test.test_key_count_estimation These two tests seem to depend on generated keys all being of the same size. This makes some sense in the case of the key count estimation test, but makes no sense at all to me in the case of the sstable run test.	2023-01-30 05:03:42 -05:00
Botond Dénes	84c94881b3	test/lib/test_services: add table_for_tests::make_default_schema() Creating the default schema, used in the default constructor of table_for_tests. Allows for getting the default schema without creating an instance first.	2023-01-30 05:03:42 -05:00
Botond Dénes	61f28d3ab2	test/lib: add key_utils.hh Contains methods to generate partition and clustering keys. In the case of the former, one can specify the shard to generate keys for. We currently have some methods to generate these but they are not generic. Therefore the tests are littered by open-coded variants. The methods introduced here are completely generic: they can generate keys for any schema.	2023-01-30 05:03:42 -05:00
Anna Stuchlik	0294b426b9	doc: replace the redundant link with an alternative way to install a non-latest version	2023-01-30 10:01:17 +01:00
Botond Dénes	04ca710a95	test/lib/random_schema.hh: value_generator: add min_size_in_bytes Allow caller to specify the minimum size in bytes of the generated value. Only really works with string-like types (and collections of these). Also fixed max size enforcement for strings: before this patch, the provided max size was divided by wide string size, instead of the char width of the actual string type the value is generated for.	2023-01-30 01:11:31 -05:00
Avi Kivity	5d914adcef	Merge 'view: row_lock: lock_ck: find or construct row_lock under partition lock' from Benny Halevy Since we're potentially searching the row_lock in parallel to acquiring the read_lock on the partition, we're racing with row_locker::unlock that may erase the _row_locks entry for the same clustering key, since there is no lock to protect it up until the partition lock has been acquired and the lock_partition future is resolved. This change moves the code to search for or allocate the row lock _after_ the partition lock has been acquired to make sure we're synchronously starting the read/write lock function on it, without yielding, to prevent this use-after-free. This adds an allocation for copying the clustering key in advance that wasn't needed before if the lock for it was already found, but the view building is not on the hot path so we can tolerate that. This is required on top of `5007ded2c1` as seen in https://github.com/scylladb/scylladb/issues/12632 which is closely related to #12168 but demonstrates a different race causing use-after-free. Fixes #12632 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12639 * github.com:scylladb/scylladb: view: row_lock: lock_ck: try_emplace row_lock entry view: row_lock: lock_ck: find or construct row_lock under partition lock	2023-01-29 18:38:14 +02:00
Warren Krewenki	2b7a7e52f4	docs: Missing closing quote in example query Closes #12663	2023-01-29 11:50:11 +02:00
Tomasz Grabiec	c9c476afd7	test: mvcc: Extend some scenarios with exhaustive consistency checks on eviction	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	80de99cb1b	test: mvcc: Extract mvcc_container::allocate_in_region()	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	7bb975eb22	row_cache, lru: Introduce evict_shallow() Will be used by MVCC tests which don't want (can't) deal with the row_cache as the container but work with the partition_entry directly. Currently, rows_entry::on_evicted() assumes that it's embedded in row_cache and would segfault when trying to evict the containing partition entry which is not embedded in row_cache. The solution is to call evict_shallow() from mvcc_tests, which does not attempt to evict the containing partition_entry.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	f2832046e9	test: mvcc: Avoid copies of mutation under failure injection Speeds up the test a bit because we avoid the copy when converting to mutation_partition_v2 in apply().	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	b8980f68f0	test: mvcc: Add missing logalloc::reclaim_lock to test_apply_is_atomic	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	d02d668777	mutation_partition_v2: Avoid full scan when applying mutation to non-evictable For non-evictable snapshots all ranges are continuous so there is no need to apply the continuity flag to the previous interval if the source mutation has the interval marked as continuous. Without this, applying a single row mutation to a memtable would involve scanning existing version for the range before the row's key. This makes population quadratic. This is made worse by the fact that this scan will happen in the background if preempted, which exposes a scheduling problem. The mutation cleaner worker which merges versions in the background will not keep up with the incoming writes. This will lead to explosion of partition versions, which makes reads (e.g. memtable flush) very slow. The read will have to refresh the iterator heap, which has an iterator for each version, across every preemption point, because cleaning invalidates iterators. The same could happen before the v2 representation, but for much less typical workloads, e.g. applying lots of mutations with a single range tombstone covering existing rows. The problem was hit in index_with_paging_test in debug mode. It's less likely to happen in release mode where preemption is not triggered as often.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	bc35fa7696	Pass is_evictable to apply()	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	2b5e7a684b	tests: mutation_partition_v2: Introduce test_external_memory_usage_v2 mirroring the test for v1	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	81b1b2ee55	tests: mutation: Fix test_external_memory_usage() to not measure mutation object footprint The test measured copying of the mutation object, but verified the measurement against mutation_partition::external_memory_usage(). So anything allocated on the mutation object level would cause the test to (incorrectly) fail. Fix that by copying only the mutation_partition part. Currently not a problem, because the partition_key is stored in the in-line storage. Would become a problem once inline storage is reduced.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	f172336b32	tests: mutation_partition_v2: Add test for exception safety of mutation merging	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	919ff433d1	tests: Add tests for the mutation_partition_v2 model	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	cec9b2d114	mutation_partition_v2: Implement compact() For convenience, will be used in unit tests.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	4317999ca4	cache_tracker: Extract insert(mutation_partition_v2&)	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	c7f7377ea3	mvcc, mutation_partition: Document guarantees in case merging succeeds It's not obvious that invariants for partial merge do not hold for a completed merge. This is due to the fact that an empty source partition, which is always empty after merge, is always fully continuous.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	8ae78ffebd	mutation_partition_v2: Accept arbitrary preemption source in apply_monotonically() Will be useful in testing to exhaustively test preemption scenarios.	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	8883ac30cf	mutation_partition_v2: Simplify get_continuity()	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	d9e27abe87	row_cache: Distinguish dummy insertion site in trace log	2023-01-27 21:56:31 +01:00
Tomasz Grabiec	026f8cc1e7	db: Use mutation_partition_v2 in mvcc This patch switches memtable and cache to use mutation_partition_v2, and all affected algorithms accordingly. The memtable reader was changed to use the same cursor implementation which cache uses, for improved code reuse and reducing risk of bugs due to discrepancy of algorithms which deal with MVCC. Range tombstone eviction in cache has now fine granularity, like with rows. Fixes #2578 Fixes #3288 Fixes #10587	2023-01-27 21:56:28 +01:00
Tomasz Grabiec	ccf3a13648	range_tombstone_change_merger: Introduce peek() Returns the current tombstone without affecting state.	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	42f5a7189d	readers: Extract range_tombstone_change_merger	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	6b7473be53	mvcc: partition_snapshot_row_cursor: Handle non-evictable snapshots This is a prerequisite for using the cursor in memtable readers. Non-evictable snapshots are those which live in memtables. Unlike evictable snapshots, they don't have a dummy entry at position after all clustering rows. In evictable snapshots, lookup always finds an entry, not so with non-evictable snapshots. The cursor was not prepared for this case, this patch handles it.	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	091ad8f6ee	mvcc: partition_snapshot_row_cursor: Support digest calculation Prerequisite for using in memtable reader.	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	195b40315a	mutation_partition_v2: Store range tombstones together with rows This patch changes mutation_partition_v2 to store range tombstone information together with rows. This mainly affects the version merging algorithm, mutation_partition_v2::apply_monotonically(). Continuity setting no longer can drop dummy entry unconditionally since it may be a boundary of a range tombstone. Memtable/cache is not switched yet. Refs #10587 Refs #3288	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	7e6056b3cc	db: Introduce mutation_partition_v2 Intended to be used in memtable/cache, as opposed to the old mutation_partition which will be intended to be used as temporary object. The two will have different trade-offs regarding memory efficiency and algorithms. In this commit there is no change in logic, the class is mostly copied. Some methods which are not needed on the v2 model were removed from the interface. Logic changes will be introduced in later commits.	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	806f698272	doc: Introduce docs/dev/mvcc.md This extracts information which was there in row_cache.md, but is relevant to MVCC in general. It also makes adaptations and reflects the upcoming changes in this series related to switching to the new mutation_partition_v2 model: - continuity in evictable snapshots can now overlap. This is needed to represent range tombstone information, which is linked to continuity information. - description of range tombstone representation was added	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	27882ff19e	db: cache_tracker: Introduce insert() variant which positions before existing entry in the LRU	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	a574a1cc4e	db: Print range_tombstone bounds as position_in_partition It's the standard now which replaced bound_view. Will be consistent with how range tombstone bounds are represented in mutation_partition_v2 (as rows_entry::position()).	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	40719c600c	test: memtable_test: Relax test_segment_migration_during_flush Partition version merging can now insert sentinels, which may temporarily increase unspooled memory. It is no longer true that unspooled monotonically decreases, which the test verified. Relax it, and only verify that unspooled is smaller than real dirty.	2023-01-27 19:15:39 +01:00
Tomasz Grabiec	31bcc3b861	test: cache_flat_mutation_reader: Avoid timestamp clash api::new_timestamp() is not monotonic. In test_single_row_and_tombstone_not_cached_single_row_range1, we generate a deletion and an insertion in the deleted range. If they get the same timestamp, the inserted row will be covered. This will surface after cache starts to compact rows with range tombstones.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	25683449e4	test: cache_flat_mutation_reader_test: Use monotonic timestamps when inserting rows When inserting range tombstones, the test uses api::new_timestamp(), but when inserting rows, it uses a fixed timestamp of 1. This will be problematic when rows get compacted with range tombstone, all rows would get compacted away, which is not expected by the test. To fix this, let's use the same timestamp source as range tombstones. This way rows will get a later timestamp.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	71057412ed	test: mvcc: Fix sporadic failures due to compact_for_compaction() compact_for_compaction() will perform cell expiration based on gc_clock::now(), which introduces sporadic mismatches due to expiry status of a row marker. Drop this, we can rely on compaction done by is_equal_to_compacted()	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	f908713290	test: lib: random_mutation_generator: Produce partition tombstone less often This tombstone has a high chance of obliterating all data, which will make tests which involve partition version merging not very interesting. The result will be an empty partition with a tombstone. Reduce its frequency, so that in MVCC there is a significant chance of having live data in the combined entry where individual versions come from the generator.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	3bf8052be4	test: lib: random_utils: Introduce with_probability()	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	c386874e18	test: lib: Improve error message in has_same_continuity()	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	08f68c5f20	test: mvcc: mvcc_container: Avoid UB in tracker() getter when there is no tracker	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	5aa8cb56a8	test: mvcc: Insert entries in the tracker evictable snapshots must have all entries added to the tracker. Partition version merging assumes this. Before this was benign, but will start to trigger asserts in mutation_partition_v2.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	9d38997971	test: mvcc_test: Do not set dummy::no on non-clustering rows This will trigger an assert in apply_monotonically() later in the series, where this row would be merged with a dummy at the same position. This row must not be marked as non-dummy, there is an assumption that non-clustering positions are all dummies. There can't be two entries with the same position and a different dummy status.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	f79072638d	mutation_partition: Print full position in error report in append_clustered_row() std::prev(i) can be dummy.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	6a305666a4	db: mutation_cleaner: Extract make_region_space_guard() Will be used in more places.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	833e2a8d30	position_in_partition: Optimize equality check We can avoid key comparison if bound weights don't match.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	95b509afcd	mvcc: Fix version merging state resetting Upon entry to merge_partition_versions() we skip over versions which are not referenced in order to start merging from the oldest unreferenced version, which is good for performance. Later, we reallocate version merging state if we detected such a move, so that we don't reuse state allocated for a different version pair than before. This check was using version_no, the counter of skipped versions to detect this. But this only makes sense if each merge_partition_versions() uses the same version pointer as a base. In fact it doesn't, if we skip, we advance _version, so the skip is persisted in the snapshot. It's enough to discard the version merging state when we do that. This shouldn't have effect on existing code base, since there is currently no way to trigger the version skipping logic.	2023-01-27 19:15:38 +01:00
Tomasz Grabiec	1c4b5b0b6b	mutation_partition: apply_resume: Mark operator bool() as explicit	2023-01-27 19:15:38 +01:00
Anna Stuchlik	70480184ab	doc: add the link to the FAQ about pinning to the patch upgrade guides 2022 and 2022	2023-01-27 18:06:54 +01:00
Anna Stuchlik	31515f7604	doc: add a FAQ with a workaround to install a non-latest ScyllaDB version on Debian and Ubuntu	2023-01-27 17:49:00 +01:00
Botond Dénes	84a69b6adb	db/view/view_update_check: check_needs_view_update_path(): filter out non-member hosts We currently don't clean up the system_distributed.view_build_status table after removed nodes. This can cause a false-positive check for whether view update generation is needed for streaming. The proper fix is to clean up this table, but that will be more involved, and even when done, it might not be immediate. So until then and to be on the safe side, filter out entries belonging to unknown hosts from said table. Fixes: #11905 Refs: #11836 Closes #11860	2023-01-27 17:12:45 +03:00
Botond Dénes	e2c9cdb576	mutation_compactor: only pass consumed range-tombstone-change to validator Currently all consumed range tombstone changes are unconditionally forwarded to the validator. Even if they are shadowed by a higher level tombstone and/or purgeable. This can result in a situation where a range tombstone change was seen by the validator but not passed to the consumer. The validator expects the range tombstone change to be closed by end-of-partition but the end fragment won't come as the tombstone was dropped, resulting in a false-positive validation failure. Fix by only passing tombstones to the validator that are actually passed to the consumer too. Fixes: #12575 Closes #12578	2023-01-27 14:03:45 +01:00
Nadav Har'El	b99b83acdd	docs/alternator: fix links to open issues The docs/alternator/compatibility.md file links to various open issues on unimplemented features. One of the links was to an already-closed issue. Replace it by a link to an open issue that was missing. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12649	2023-01-27 14:29:57 +02:00
Pavel Emelyanov	1f9f819c8c	table: Remove unused column_family_directory() overload There's another one that accepts explicit basedir first argument and that's used by the rest of the code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12643	2023-01-27 14:17:41 +02:00
Nadav Har'El	f873884b50	test/alternator: unskip test which works on modern Scylla We had one test test_gsi.py::test_gsi_identical that didn't work on KA/LA sstables due to #6157, so it was skipped. Today, Scylla no longer supports writing these old sstable formats, so the test can never find itself running on these versions, so should pass. And indeed it does, and the "skip" marker can be removed. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12651	2023-01-27 14:10:07 +02:00
Botond Dénes	d358d4d9e9	Merge 'Configure sstable_test_env with tempdir' from Pavel Emelyanov Today's sstable_test_env starts with a default-configured db::config and, thus, sstables_manager. Test cases that run in this env always create a tempdir to store sstable files in on their own. Upcoming patches make sstable-manager and friends fully control the data-dir path in order to support object storage for sstables in a nice way, and this behavior of tests upsets this ongoing work. That said, this PR configures sstable_test_env with a tempdir and pins down the cases using it to stick to that directory, rather than to the custom one. Closes #12641 * github.com:scylladb/scylladb: test: Use tempdir from sstable_test_env test: Add tmpdir to sstable test env test: Keep db::config as unique pointer	2023-01-27 13:59:12 +02:00
Avi Kivity	df09bf2670	tools: toolchain: dbuild: pass NOFILE limit from host to container The leak sanitizer has a bug [1] where, if it detects a leak, it forks something, and before that, it closes all files (instead of using close_range like a good citizen). Docker tends to create containers with the NOFILE limit (number of open files) set to 1 billion. The resulting 1 billion close() system calls is incredibly slow. Work around that problem by passing the host NOFILE limit. [1] https://github.com/llvm/llvm-project/issues/59112 Closes #12638	2023-01-27 13:56:35 +02:00
Benny Halevy	d2893f93cb	view: row_lock: lock_ck: try_emplace row_lock entry Use same method as the two-level lock at the partition level. try_emplace will either use an existing entry, if found, or create a new entry otherwise. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-27 13:51:48 +02:00
Benny Halevy	4b5e324ecb	view: row_lock: lock_ck: find or construct row_lock under partition lock Since we're potentially searching the row_lock in parallel to acquiring the read_lock on the partition, we're racing with row_locker::unlock that may erase the _row_locks entry for the same clustering key, since there is no lock to protect it up until the partition lock has been acquired and the lock_partition future is resolved. This change moves the code to search for or allocate the row lock _after_ the partition lock has been acquired to make sure we're synchronously starting the read/write lock function on it, without yielding, to prevent this use-after-free. This adds an allocation for copying the clustering key in advance even if a row_lock entry already exists, that wasn't needed before. It only slows us down (a bit) when there is contention and the lock already existed when we want to go locking. In the fast path there is no contention and then the code already had to create the lock and copy the key. In any case, the penalty of copying the key once is tiny compared to the rest of the work that view updates are doing. This is required on top of `5007ded2c1` as seen in https://github.com/scylladb/scylladb/issues/12632 which is closely related to #12168 but demonstrates a different race causing use-after-free. Fixes #12632 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-27 13:51:46 +02:00
Kamil Braun	fa9cf81af2	test: topology: verify that group 0 and token ring are consistent After topology changes like removing a node, verify that the set of group 0 members and token ring members is the same. Modify `get_token_ring_host_ids` to only return NORMAL members. The previous version which used the `/storage_service/host_id` endpoint might have returned non-NORMAL members as well. Fixes: #12153 Closes #12619	2023-01-27 14:21:14 +03:00
Avi Kivity	f719de3357	Update seastar submodule * seastar d41af8b59...943c09f86 (20): > reactor: disable io_uring on older kernels if not enough lockable memory is available > demos/tcp_sctp_client_demo: use user-defined literal for sizes > core/units: add user-defined literal for IEC prefixes > core/units: include what we use > coroutine/exception: do not include core/coroutine.hh > seastar/coroutine: drop std-coroutine.hh > core/bitops.hh: add type constraits to templates > apps/iotune: s/condition == false/!condition/ > core/metrics_api: s/promehteus/prometheus/ > reactor: make io_uring the default backend if available > tests: connect_test: use 127.0.0.1 for connect refused test > reactor: use aio to implement reactor_backend_uring::read() > future: schedule: get_available_state_ref under SEASTAR_DEBUG > rpc: client_info: add retrieve_auxiliary_opt > Merge 'Make http requests with content-length header and generated body' from Pavel Emelyanov > Merge 'Ensure logger doesn't allocate' from Travis Downs > http, httpd: optimize header field assignment > sstring: operator<< std::unordered_map: delete stray space char > Dump memory diagnostics at error level on abort > Fix CLI help for memory diagnostics dump Closes #12650	2023-01-26 22:19:24 +02:00
Anna Stuchlik	6ef33f8aae	doc: reorganize the content on the Upgrade ScyllaDB page	2023-01-26 13:37:27 +01:00
Botond Dénes	d7ed92bb42	Merge 'Reduce the number of table::make_sstable() overloads' from Pavel Emelyanov There are several helpers to make an sstable for the table and two with most of the arguments are only used by tests. This PR leaves table with just one arg-less call thus making it easier to patch further. Closes #12636 * github.com:scylladb/scylladb: table: Shrink sstables making API tests: Use sstables manager to make sstables distributed_loader: Add helpers to make sstables for reshape/reshard	2023-01-26 14:25:21 +02:00
Anna Stuchlik	29536cb064	doc: improve the overview of the upgrade procedure (apply feedback)	2023-01-26 13:09:08 +01:00
Kamil Braun	5eadea301e	Merge 'pytest: start after ungraceful stop' from Alecco If a server is stopped suddenly (i.e. not graceful), schema tables might be in inconsistent state. Add a test case and enable Scylla configuration option (force_schema_commit_log) to handle this. Fixes #12218 Closes #12630 * github.com:scylladb/scylladb: pytest: test start after ungraceful stop test.py: enable force_schema_commit_log	2023-01-26 12:08:33 +01:00
Kamil Braun	3eabe04f5d	test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests `ScyllaClusterManager` is used to run a sequence of test cases from a single test file. Between two consecutive tests, if the previous test left the cluster 'dirty', meaning the cluster cannot be reused, it would put the old cluster to the pool with `is_dirty=True`, then get a new cluster from the pool. Between the `put` and the `get`, a concurrent test run (with its own instance of `ScyllaClusterManager`) would start, because there was free space in the pool. This resulted in undesirable behavior when we ran tests with `--repeat X` for a large `X`: we would start with e.g. 4 concurrent runs of a test file, because the pool size was 4. As soon as one of the runs freed up space in the pool, we would start another concurrent run. Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so on. We would have a large number of concurrent runs, even though the original 4 runs didn't finish yet. All of these concurrent runs would compete waiting on the pool, and waiting for space in the pool would take longer and longer (the duration is linear w.r.t number of concurrent competing runs). Tests would then time out because they would have to wait too long. Fix that by using the new `replace_dirty` function introduced to the pool. This function frees up space by returning a dirty cluster and then immediately takes it away to be used for a new cluster. Thanks to this, we will only have at most as many concurrent runs as the pool size. For example with --repeat 8 and pool size 4, we would run 4 concurrent runs and start the 5th run only when one of the original 4 runs finishes, then the 6th run when a second run finishes and so on. Fixes #11757	2023-01-26 11:58:00 +01:00
Kamil Braun	b5ef57ecc2	test/pylib: pool: introduce `replace_dirty` Used to atomically return a dirty object to the pool and then use the space freed by this object to get another object. Unlike `put(is_dirty=True)` followed by `get`, a concurrent waiter cannot take away our space from us. A piece of `get` was refactored to a private function `_build_and_get`, this piece is also used in `replace_dirty`.	2023-01-26 11:58:00 +01:00
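The difference between `put(is_dirty=True)` followed by `get` and an atomic `replace_dirty` can be sketched as below. This is a simplified illustration, not the `test/pylib` implementation: the `total` counter, the `asyncio.Condition` and the `demo_replace_dirty` helper are assumptions made for the example; only the names `get`, `put`, `replace_dirty`, `_build_and_get`, `build` and `destroy` come from the commit messages.

```python
import asyncio
from typing import Awaitable, Callable, Generic, List, TypeVar

T = TypeVar("T")


class Pool(Generic[T]):
    """Bounded async pool; `total` counts objects that exist or are being built."""

    def __init__(self, max_size: int,
                 build: Callable[[], Awaitable[T]],
                 destroy: Callable[[T], Awaitable[None]]) -> None:
        self.max_size = max_size
        self.build = build
        self.destroy = destroy
        self.free: List[T] = []
        self.total = 0
        self.cond = asyncio.Condition()

    async def get(self) -> T:
        async with self.cond:
            # Wait until there is a free object or room to build a new one.
            await self.cond.wait_for(
                lambda: bool(self.free) or self.total < self.max_size)
            if self.free:
                return self.free.pop()
            self.total += 1  # reserve the slot before building
        return await self._build_and_get()

    async def _build_and_get(self) -> T:
        try:
            return await self.build()
        except BaseException:
            # Building failed: give the reserved slot back.
            async with self.cond:
                self.total -= 1
                self.cond.notify()
            raise

    async def put(self, obj: T, is_dirty: bool) -> None:
        if is_dirty:
            await self.destroy(obj)
        async with self.cond:
            if is_dirty:
                self.total -= 1  # the slot becomes visible to any waiter
            else:
                self.free.append(obj)
            self.cond.notify()

    async def replace_dirty(self, obj: T) -> T:
        # `total` is never decremented, so no concurrent waiter can take the slot.
        await self.destroy(obj)
        return await self._build_and_get()


async def demo_replace_dirty() -> list:
    counter = 0
    destroyed: List[int] = []

    async def build() -> int:
        nonlocal counter
        counter += 1
        return counter

    async def destroy(obj: int) -> None:
        destroyed.append(obj)

    pool = Pool(1, build, destroy)
    first = await pool.get()
    waiter = asyncio.create_task(pool.get())  # blocks: the pool is full
    await asyncio.sleep(0)
    second = await pool.replace_dirty(first)  # keeps the slot for this caller
    still_waiting = not waiter.done()
    await pool.put(second, is_dirty=False)
    third = await waiter
    return [first, second, still_waiting, third, destroyed]
```

Running `asyncio.run(demo_replace_dirty())` shows that the concurrent waiter only proceeds after the replacement object is returned to the pool as clean.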
Kamil Braun	858803cc2c	test/pylib: pool: replace `steal` with `put(is_dirty=True)` The pool usage was kind of awkward previously: if the user of a pool decided that a previously borrowed object should no longer be used, it was their responsibility to destroy the object (releasing associated resources and so on) and then call `steal()` on the pool to free space for a new object. Change the interface. Now the `Pool` constructor takes a `destroy` function in addition to the `build` function. The user calls the function `put` to return both objects that are still usable and those that aren't. For the latter, they set `is_dirty=True`. The pool will 'destroy' the object with the provided function, which could mean e.g. releasing associated resources. For example, instead of: ``` if self.cluster.is_dirty: self.clusters.stop() self.clusters.release_ips() self.clusters.steal() else: self.clusters.put(self.cluster) ``` we can now use: ``` self.clusters.put(self.cluster, is_dirty=self.cluster.is_dirty) ``` (assuming that `self.clusters` is a pool constructed with a `destroy` function that stops the cluster and releases its IPs.) Also extend the interface of the context manager obtained by `instance()` - the user must now pass a flag `dirty_on_exception`. If the context manager exits due to an exception and that flag was `True`, the object will be considered dirty. The dirty flag can also be set manually on the context manager. For example: ``` async with (cm := pool.instance(dirty_on_exception=True)) as server: cm.dirty = await run_test(test, server) # It will also be considered dirty if run_test throws an exception ```	2023-01-26 11:58:00 +01:00
Pavel Emelyanov	dd307d8a42	test: Use tempdir from sstable_test_env The test cases in sstable_directory_test use a temporary directory that differs from the one sstables manager starts over. Fix that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 11:47:06 +03:00
Pavel Emelyanov	0c3799db71	test: Add tmpdir to sstable test env This adds the test/lib's tmpdir instance _and_ configures the data_file_directories with this path. This makes sure sstables manager and the rest of the test use the same directory for sstables. For now it doesn't change anything, but helps next patching. (A neat side effect of this change is that sstable_test_env is now configured the same way as cql_test_env does) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 11:47:06 +03:00
Pavel Emelyanov	9f4efd6b6f	table: Shrink sstables making API Currently there are four helpers; this patch reduces them to two, one of which becomes private to the table, thus making the API small and neat (and easy to patch further). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 10:47:39 +03:00
Pavel Emelyanov	fd559f3b81	tests: Use sstables manager to make sstables This test uses two many-args helpers from the table class to create sstables with desired parameters. The table API in question is not used by any other code but these few places, so it's better to open-code it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 10:47:39 +03:00
Pavel Emelyanov	bfddfb8927	distributed_loader: Add helpers to make sstables for reshape/reshard This kills two birds with one stone. First, it factors out (quite a lot of) common arguments that are passed to table.make_sstable(). Second, it makes the helpers call sstable manager with extended args making it possible to remove those wrappers from table class later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-26 10:47:39 +03:00
Botond Dénes	ba26770376	tools/schema_loader: data_dictionary_impl:try_find_table(): also check ks name Although the number of keyspaces should mostly be 1 here, and thus the chance of two tables from different keyspaces colliding is minuscule, it is not zero. Better be safe than sorry, so match the keyspace name too when looking up a table. Closes #12627	2023-01-25 22:04:07 +02:00
Raphael S. Carvalho	87ee547120	table: Fix quadratic behavior when inserting sstables into tracker on schema change Each time backlog tracker is informed about a new or old sstable, it will recompute the static part of backlog which complexity is proportional to the total number of sstables. On schema change, we're calling backlog_tracker::replace_sstables() for each existing sstable, therefore it produces O(N ^ 2) complexity. Fixes #12499. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12593	2023-01-25 21:57:33 +02:00
Botond Dénes	bdd4b25c61	scylla-gdb.py: scylla memory: remove 'sstable reads' from semaphore names This phrase is inaccurate and unnecessary. We know all lines in the printout are for reads and they are semaphores: no need to repeat this information on each line. Example: Read Concurrency Semaphores: read: 0/100, 0/ 41901096, queued: 0 streaming: 0/ 10, 0/ 41901096, queued: 0 system: 0/ 10, 0/ 41901096, queued: 0 Closes #12633	2023-01-25 21:55:27 +02:00
Nadav Har'El	f4f2d608d7	dbuild: fix path in example in README The dbuild README has an example how to enable ccache, and required modifying the PATH. Since recently, our docker image includes required commands (cxxbridge) in /usr/local/bin, so the build will fail if that directory isn't also in the path - so add it in the example. Also use the opportunity to fix the "/home/nyh" in one example to "$HOME". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12631	2023-01-25 21:54:44 +02:00
Pavel Emelyanov	9ccae1be18	test: Keep db::config as unique pointer The goal is to make it possible to make config with custom-initialized options in test_env::impl's constructor initializer list (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-25 19:38:47 +03:00
Kamil Braun	a0ff33e777	test/pylib: scylla_cluster: don't leak server if stopping it fails `ScyllaCluster.server_stop` had this piece of code: ``` server = self.running.pop(server_id) if gracefully: await server.stop_gracefully() else: await server.stop() self.stopped[server_id] = server ``` We observed `stop_gracefully()` failing due to a server hanging during shutdown. We then ended up in a state where neither `self.running` nor `self.stopped` had this server. Later, when releasing the cluster and its IPs, we would release that server's IP - but the server might have still been running (all servers in `self.running` are killed before releasing IPs, but this one wasn't in `self.running`). Fix this by popping the server from `self.running` only after `stop_gracefully`/`stop` finishes. Make an analogous fix in `server_start`: put `server` into `self.running` before we actually start it. If the start fails, the server will be considered "running" even though it isn't necessarily, but that is OK - if it isn't running, then trying to stop it later will simply do nothing; if it is actually running, we will kill it (which we should do) when clearing after the cluster; and we don't leak it. Closes #12613	2023-01-25 16:58:02 +02:00
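The ordering fix described above can be illustrated with a small sketch. `FakeServer`, `Cluster` and `demo_server_stop` are hypothetical stand-ins written for this example; only `server_stop`, `stop_gracefully`, `stop`, `self.running` and `self.stopped` are names taken from the commit message.

```python
import asyncio
from typing import Dict


class FakeServer:
    """Hypothetical stand-in for a server whose graceful stop may fail."""

    def __init__(self, hang: bool) -> None:
        self.hang = hang
        self.is_running = True

    async def stop_gracefully(self) -> None:
        if self.hang:
            raise RuntimeError("server hung during shutdown")
        self.is_running = False

    async def stop(self) -> None:
        self.is_running = False


class Cluster:
    def __init__(self) -> None:
        self.running: Dict[int, FakeServer] = {}
        self.stopped: Dict[int, FakeServer] = {}

    async def server_stop(self, server_id: int, gracefully: bool) -> None:
        server = self.running[server_id]  # look up, but do not pop yet
        if gracefully:
            await server.stop_gracefully()
        else:
            await server.stop()
        # Only reached if stopping succeeded, so a failed stop leaves the
        # server tracked in `running` and it can still be killed later.
        self.stopped[server_id] = self.running.pop(server_id)


async def demo_server_stop() -> tuple:
    cluster = Cluster()
    cluster.running[1] = FakeServer(hang=True)
    try:
        await cluster.server_stop(1, gracefully=True)
    except RuntimeError:
        pass
    tracked_after_failure = 1 in cluster.running
    await cluster.server_stop(1, gracefully=False)
    return tracked_after_failure, 1 in cluster.running, 1 in cluster.stopped
```

After the failed graceful stop the server is still in `running`, so a later forced stop (or cluster cleanup) can find it before its IP is released.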
Alejo Sanchez	878cb45c24	pytest: test start after ungraceful stop Test case for a start of a server after it was stopped suddenly (instead of gracefully). This could cause commitlog flush issues. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-25 14:49:27 +01:00
Alejo Sanchez	ccbd89f0cd	test.py: enable force_schema_commit_log To handle start after ungraceful stop, enable separate schema commit log from server start. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-25 14:49:27 +01:00
Kamil Braun	5c886e59de	Merge 'Enable Raft by default in new clusters' from Kamil Braun New clusters that use a fresh conf/scylla.yaml will have `consistent_cluster_management: true`, which will enable Raft, unless the user explicitly turns it off before booting the cluster. People using existing yaml files will continue without Raft, unless consistent_cluster_management is explicitly requested during/after upgrade. Also update the docs: cluster creation and node addition procedures. Fixes #12572. Closes #12585 * github.com:scylladb/scylladb: docs: mention `consistent_cluster_management` for creating cluster and adding node procedures conf: enable `consistent_cluster_management` by default	2023-01-25 14:09:38 +01:00
Benny Halevy	82011fc489	dht: incremental_owned_ranges_checker: belongs_to_current_node: mark as const Its _it member keeps state about the current range. Although it's modified by the method, this is an implementation detail that is irrelevant to the caller, hence mark the belongs_to_current_node method as const (and noexcept while at it). This allows the caller, cleanup_compaction, to use it from inside a const method, without having to mark its respective member as mutable too. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12634	2023-01-25 14:52:21 +02:00
Alexey Novikov	ce96b472d3	prevent populating cache with expired rows from sstables change row purge condition for compacting_reader to remove all expired rows to avoid read performance problems when there are many expired tombstones in row cache Refs #2252 Closes #12565	2023-01-25 12:59:40 +01:00
Kamil Braun	5bc7f0732e	Merge 'test.py: manual cluster pool handling for Python suite' from Alecco From reviews of https://github.com/scylladb/scylladb/pull/12569, avoid using `async with` and access the `Pool` of clusters with `get()`/`put()`. Closes #12612 * github.com:scylladb/scylladb: test.py: manual cluster handling for PythonSuite test.py: stop cluster if PythonSuite fails to start test.py: minor fix for failed PythonSuite test	2023-01-24 17:37:55 +01:00
Nadav Har'El	b28818db06	Merge 'Make regexes in types.cc static and remove unnecessary tolower transform' from Marcin Maliszkiewicz - makes all regexes static If making regex compilation static for uuid_type_impl and timeuuid_type_impl helps then it should also help for timestamp_type and simple_date_type. - remove unnecessary tolower transform in simple_date_type_impl::from_sstring Following function uses only decimal and '-' characters (see date_re). They are not affected by tolower call in any way. Additionally std::strtoll supports "0x" prefixes but also accepts upper case version "0X" so it's also not affected by tolower call. get_simple_date_time only casts strings to integer types using boost::lexical_cast so also not affected by tolower. Finally, serialize only uses str to include it in an exception text so tolower doesn't affect it in a positive way. It's even better that input is displayed to the user as it was, not converted to lower case. Closes #12621 * github.com:scylladb/scylladb: types: remove unnecessary tolower transform in simple_date_type_impl::from_sstring types: make all regexes static	2023-01-24 16:13:59 +02:00
Pavel Emelyanov	f6e8b64334	snitch: Use set_my_dc_and_rack() on all shards Most of snitch drivers set _my_dc and _my_rack with direct assignment thus skipping the sanity checks for dc/rack being empty. On other shards they call set_my_dc_and_rack() helper which warns the empty value and replaces it with some defaults. It's better to use the helper on all shards in order to have the same dc/rack values everywhere. refs: #12185 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12524	2023-01-24 14:17:06 +02:00
Nadav Har'El	55558e1bd7	test/alternator: check operation on invalid TableName Issue #12538 suggested that maybe Alternator shouldn't bother reporting an invalid table name in item operations like PutItem, and that it's enough to report that the table doesn't exist. But the test added in this patch shows that DynamoDB, like Alternator, reports the invalid table name in this case - not just that the table doesn't exist. That should make us think twice before acting on issue #12538. If we do what this issue recommended, this test will need to be fixed (e.g., to accept as correct both types of errors). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12608	2023-01-24 14:14:39 +02:00
Kefu Chai	4a0134a097	db: system_keyspace: take the reserved_memory into account before this change, we returned the total memory managed by Seastar in the "total" field in system.memory. but this value only reflects the total memory managed by Seastar's allocator. if `reserve_additional_memory` is set when starting app_template, Seastar's memory subsystem just reserves a chunk of memory of this specified size for system, and takes the remaining memory. since `f05d612da8`, we set this value to 50MB for wasmtime runtime. hence the test of `TestRuntimeInfoTable.test_default_content` in dtest fails. the test expects the size passed via the option of `--memory` to be identical to the value reported by system.memory's "total" field. after this change, the "total" field takes the reserved memory for wasm udf into account. the "total" field should reflect the total size of memory used by Scylla, no matter how we use a certain portion of the allocated memory. Fixes #12522 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12573	2023-01-24 14:07:44 +02:00
Anna Stuchlik	3cbe657b24	doc: fixes https://github.com/scylladb/scylla-docs/issues/3706 , v2 of https://github.com/scylladb/scylladb/pull/11638 , add a note about performance penalty in non-frozen connections vs frozen connections and UDT, add a link to the blog post about performance Closes #12583	2023-01-24 13:16:58 +02:00
Nadav Har'El	158be3604d	test/alternator: xfailing test for huge Limit in ListStreams DynamoDB Streams limits the "Limit" parameter of ListStreams to 100 - anything larger will result in an error. Scylla doesn't necessarily need to uphold the same limit, but we should uphold some limit, as not having any limit can result (in the theoretical case of a huge number of tables with streams enabled) in an unbounded response size. So here we add a test to check that a Limit of 100,000 is not allowed. It passes on DynamoDB (in fact, any number higher than 100 will be enough there) but fails on Alternator, so is marked "xfail". Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-01-24 12:38:18 +02:00
Nadav Har'El	3beafd8441	alternator/test: un-skip test of zero Limit in ListStreams We had a skipped test on how Alternator handles Limit=0 for ListStreams which should be reported as an error. We had to skip it because boto3 did us a "favor" of discovering this parameter error before ever sending it to the server. We discovered long ago how to avoid this client-side checking in boto3, but only used it for the "dynamodb" fixture and forgot to copy the same trick to the "dynamodbstreams" fixture - and in this patch we do, and can run this test successfully. While at it, also copy the extended timeout configuration we had in the dynamodb fixture to the dynamodbstreams fixture. There is no reason why it should be different. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2023-01-24 12:38:18 +02:00
Alejo Sanchez	f236d518c6	test.py: manual cluster handling for PythonSuite Instead of complex async with logic, use manual cluster pool handling. Revert the discard() logic in Pool from a recent commit. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-24 11:38:17 +01:00
Alejo Sanchez	a6059e4bb7	test.py: stop cluster if PythonSuite fails to start If cluster fails to start, stop it. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-24 11:36:49 +01:00
Alejo Sanchez	dec0c1d9f6	test.py: minor fix for failed PythonSuite test Even though test can't fail both before and after, make the logic explicit in case code changes in the future. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-24 11:36:49 +01:00
Kefu Chai	232c73a077	doc: add PREVIEW_HOST Make variable add Make variable named `PREVIEW_HOST` so it can be overridden like ``` make preview PREVIEW_HOST=$(hostname -I \| cut -d' ' -f 1) ``` it allows a developer to preview the document if the host building the document is not localhost. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12589	2023-01-24 12:27:33 +02:00
Botond Dénes	cfaec4428b	Merge 'Remove qctx from system_keyspace::increment_and_get_generation()' from Pavel Emelyanov It's a simple helper used during boot-time that can enjoy query-processor from sharded<system_keyspace> Closes #12587 * github.com:scylladb/scylladb: system_keyspace: De-static system_keyspace::increment_and_get_generation system_keyspace: Fix indentation after previous patch system_keyspace: Coroutinize system_keyspace::increment_and_get_generation	2023-01-24 12:17:12 +02:00
Marcin Maliszkiewicz	f4de64957b	types: remove unnecessary tolower transform in simple_date_type_impl::from_sstring Following function uses only decimal and '-' characters (see date_re). They are not affected by tolower call in any way. Additionally std::strtoll supports "0x" prefixes but also accepts upper case version "0X" so it's also not affected by tolower call. get_simple_date_time only casts strings to integer types using boost::lexical_cast so also not affected by tolower. Finally, serialize only uses str to include it in an exception text so tolower doesn't affect it in a positive way. It's even better that input is displayed to the user as it was, not converted to lower case.	2023-01-24 10:50:13 +01:00
Calle Wilund	a079c3dbbe	alternator::streams: Special case single table in list_streams Avoid iterating all tables (at least multiple times).	2023-01-24 09:14:33 +00:00
Calle Wilund	9412d8f259	alternator::streams: Only sort tables iff limit < # tables or ExclusiveStartStreamArn set Avoid sorts for request that will be answered immediately.	2023-01-24 08:48:20 +00:00
Avi Kivity	49157370bc	build: don't force-disable io_uring in Seastar The reasons for force-disabling are doubly wrong: we now use liburing from Fedora 37, which is sufficiently recent, and the auto-detection code will disable io_uring if a sufficiently recent version isn't present. Closes #12620	2023-01-24 10:32:00 +02:00
Calle Wilund	9886788a46	alternator::streams: Set default list_streams limit to 100 as per spec AWS docs says so.	2023-01-24 08:24:42 +00:00
Kamil Braun	54170749b8	service/raft: raft_group0: prevent double abort There was a small chance that we called `timeout_src.request_abort()` twice in the `with_timeout` function, first by timeout and then by shutdown. `abort_source` fails on an assertion in this case. Fix this. Fixes: #12512 Closes #12514	2023-01-23 21:32:21 +01:00
Marcin Maliszkiewicz	76c1d0e5d3	types: make all regexes static If making regex compilation static for uuid_type_impl and timeuuid_type_impl helps then it should also help for timestamp_type and simple_date_type.	2023-01-23 20:37:32 +01:00
Nadav Har'El	634c3d81f5	Merge 'doc: add the general upgrade policy' from Anna Stuchlik Fix https://github.com/scylladb/scylla-docs/issues/3968 This PR adds the information that an upgrade to each successive major version is required to upgrade from an old ScyllaDB version. Closes #12586 * github.com:scylladb/scylladb: docs: remove repetition doc: add the general upgrade policy to the uprage page	2023-01-23 18:34:59 +02:00
Benny Halevy	008ca37d28	sstable_directory: reindent reshard Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-23 17:30:05 +02:00
Benny Halevy	792bc58fce	sstable_directory: coroutinize reshard Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-23 17:29:49 +02:00
Nadav Har'El	ccc2c6b5dd	Merge 'test/pylib: scylla_cluster: improve server startup check' from Kamil Braun Don't use a range scan, which is very inefficient, to perform a query for checking CQL availability. Improve logging when waiting for server startup times out. Provide details about the failure: whether we managed to obtain the Host ID of the server and whether we managed to establish a CQL connection. Closes #12588 * github.com:scylladb/scylladb: test/pylib: scylla_cluster: better logging for timeout on server startup test/pylib: scylla_cluster: use less expensive query to check for CQL availability	2023-01-23 17:00:52 +02:00
Kamil Braun	8a1ea6c49f	test/pylib: scylla_cluster: better logging for timeout on server startup Waiting for server startup is a multi-step procedure: after we start the actual process, we will: - try to obtain the Host ID (by querying a REST API endpoint) - then try to connect a CQL session - then try to perform a CQL query The steps are repeated every .1 second until we reach a timeout (the Host ID step is skipped if we previously managed to obtain it). On timeout we'd only get a generic "failed to start server" message, it wouldn't say what we managed to do and what not. For example, on one of the failed jobs on Jenkins I observed this timeout error. Looking at the logs of the server, it turned out that the server printed the "initialization completed" message more than 2 minutes before the actual timeout happened. So for 2 minutes, the test framework either couldn't obtain the Host ID, or couldn't establish a CQL connection, or couldn't perform a CQL query, but I wasn't able to determine fully which one of these was the case. Improve the code by printing whether we managed to get the Host ID of the server and if so - whether we managed to connect to CQL.	2023-01-23 15:59:42 +01:00
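The multi-step startup check with a more informative timeout message might look roughly like the sketch below. It is a synchronous simplification written for illustration: `wait_for_startup` and its three callback parameters are hypothetical, and the real framework code is asynchronous and talks to a REST endpoint and a CQL driver rather than plain callables.

```python
import time
from typing import Callable, Optional


def wait_for_startup(get_host_id: Callable[[], Optional[str]],
                     connect_cql: Callable[[], bool],
                     run_query: Callable[[], bool],
                     timeout: float, interval: float = 0.1) -> None:
    """Poll until all three steps succeed, or raise with details on timeout."""
    host_id: Optional[str] = None
    connected = False
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if host_id is None:
            # Skipped on later iterations once the Host ID is known.
            host_id = get_host_id()
        if host_id is not None:
            connected = connect_cql()
            if connected and run_query():
                return
        time.sleep(interval)
    # Report how far the procedure got instead of a generic failure.
    if host_id is None:
        detail = "failed to obtain Host ID"
    elif not connected:
        detail = f"got Host ID {host_id} but failed to connect to CQL"
    else:
        detail = f"got Host ID {host_id} and connected to CQL, but the query failed"
    raise TimeoutError(f"failed to start server: {detail}")
```

With this shape, a timeout distinguishes between the three failure points that the generic "failed to start server" message used to hide.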
Kamil Braun	0e591606a5	test/pylib: scylla_cluster: use less expensive query to check for CQL availability The previous CQL query used a range scan which is very inefficient, even for local tables. Also add a comment explaining why we need this query.	2023-01-23 15:59:05 +01:00
Avi Kivity	3f887fa24b	Merge 'doc: remove duplicatiom of the ScyllaDB ports (table)' from Anna Stuchlik Fix https://github.com/scylladb/scylladb/issues/12605#event-8328930604 This PR removes the duplicated content (the file with the table was included twice) and reorganizes the content in the Networking section. Closes #12615 * github.com:scylladb/scylladb: doc: fix the broken link doc: replace Scylla with ScyllaDB doc: remove duplication in the Networking section (the table of ports used by ScyllaDB	2023-01-23 16:27:06 +02:00
Anna Stuchlik	30f3ee6138	doc: fix the broken link	2023-01-23 14:43:07 +01:00
Anna Stuchlik	1dd0fb8c2d	doc: replace Scylla with ScyllaDB	2023-01-23 14:40:36 +01:00
Anna Stuchlik	d881b3c498	doc: remove duplication in the Networking section (the table of ports used by ScyllaDB	2023-01-23 14:39:01 +01:00
Calle Wilund	da8adb4d26	alterator::streams: Sort tables in list_streams to ensure no duplicates Fixes #12601 (maybe?) Sort the set of tables on ID. This should ensure we never generate duplicates in a paged listing here. Can obviously miss things if they are added between paged calls and end up with a "smaller" UUID/ARN, but that is to be expected.	2023-01-23 11:41:40 +00:00
Benny Halevy	1123565eb0	table: perform_cleanup_compaction: trim owned ranges on compaction_group boundaries To cleanup tokens in sstables that are not owned by the compaction group. This may happen in the future after a compaction group split if copying / linking the sstables in the original compaction_group to the split compaction_groups. Fixes #12594 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-22 22:54:26 +02:00
Benny Halevy	95a8e0b21d	table: make_compaction_groups: calculate compaction_group token ranges Add dht::split_token_range_msb that returns a token_range_vector with ranges split using a given number of most-significant bits. When creating the table's compaction groups, use dht::split_token_range_msb to calculate the token_range owned by each compaction_group. Refs #12594 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-22 22:54:26 +02:00
Benny Halevy	912b56ebcf	dht: range_streamer: define logger as static dht::logger can't be global in this case, as it's too generic, but should be static to range_streamer. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-22 22:54:26 +02:00
Nadav Har'El	54f174a1f4	Merge 'test.py: handle broken clusters for Python suite' from Alecco If the after test check fails (is_after_test_ok is False), discard the cluster and raise exception so context manager (pool) does not recycle it. Ignore exception re-raised by the context manager. Fixes #12360 Closes #12569 * github.com:scylladb/scylladb: test.py: handle broken clusters for Python suite test.py: Pool discard method	2023-01-22 19:58:12 +02:00
Benny Halevy	8009585e7d	table: compaction_group_for_token: use signed arithmetic Add and use dht::compaction_group_of that computes the compaction_group index by unbiasing the token, similar to dht::shard_of. This way, all tokens in `_compaction_groups[i]` are ordered before `_compaction_groups[j]` iff i < j. Fixes #12595 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12599	2023-01-22 11:27:07 +02:00
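The unbiasing idea can be sketched in Python as follows, assuming tokens are signed 64-bit integers and the number of compaction groups is a power of two (`2**msb_bits`). The function bodies are illustrative and are not the C++ implementation; `raw_bits_group_of` is a hypothetical contrast case showing what happens when the token's raw bits are read as unsigned, and is not a claim about what the earlier code did.

```python
def unbias(token: int) -> int:
    """Map a signed 64-bit token onto 0..2**64-1, preserving order."""
    assert -(2**63) <= token < 2**63
    return token + 2**63


def compaction_group_of(msb_bits: int, token: int) -> int:
    """Index of the group owning `token`, given 2**msb_bits groups."""
    if msb_bits == 0:
        return 0
    # Take the most significant bits of the unbiased token.
    return unbias(token) >> (64 - msb_bits)


def raw_bits_group_of(msb_bits: int, token: int) -> int:
    """Contrast case: read the two's-complement bits as unsigned."""
    if msb_bits == 0:
        return 0
    return (token & (2**64 - 1)) >> (64 - msb_bits)


tokens = [-(2**63), -1, 0, 2**63 - 1]  # sorted in token order
ordered = [compaction_group_of(2, t) for t in tokens]
unordered = [raw_bits_group_of(2, t) for t in tokens]
```

Here `ordered` is non-decreasing, so every token in group `i` sorts before every token in group `j` when `i < j`, while `unordered` places negative tokens in the upper groups.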
Pavel Emelyanov	be2ad2fe99	system_keyspace: De-static system_keyspace::increment_and_get_generation It's only called on cluster-join from storage_service which has the local system_keyspace reference and it's already started by that time. This allows removing few more occurrences of global qctx. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-20 17:24:22 +03:00
Pavel Emelyanov	4c4f8aa3e1	system_keyspace: Fix indentation after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-20 17:24:22 +03:00
Pavel Emelyanov	b0edc07339	system_keyspace: Coroutinize system_keyspace::increment_and_get_generation Just unroll the fn().then({ fn2().then().then(); }); chain. Indentation is deliberately left broken. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-01-20 17:24:10 +03:00
Botond Dénes	ebc100f74f	types: is_tuple(): handle reverse types Currently reverse types match the default case (false), even though they might be wrapping a tuple type. One user-visible effect of this is that a schema, which has a reversed<frozen<UDT>> clustering key component, will have this component incorrectly represented in the schema cql dump: the UDT will lose the frozen attribute. When attempting to recreate this schema based on the dump, it will fail as only frozen UDTs are allowed in primary key components. Fixes: #12576 Closes #12579	2023-01-20 15:50:58 +02:00
Anna Stuchlik	0a91578875	docs: remove repetition	2023-01-20 14:45:59 +01:00
Anna Stuchlik	2c357a7007	doc: add the general upgrade policy to the uprage page	2023-01-20 14:43:26 +01:00
Botond Dénes	7f9b39009c	reader_concurrency_semaphore_test: leak test: relax iteration limit This test creates random dummy reads and simulates a query with them. The test works in terms of iteration (tick), advancing each simulating read in each iteration. To prevent infinite runtime an iteration limit of 100 was added to detect a non-converging test and kill it. This limit proved too strict however and in this patch we bump it to 1000 to prevent some unlucky seed making this test fail, as seen recently in CI. Closes #12580	2023-01-20 15:39:13 +02:00
Kamil Braun	050614f34d	docs: mention `consistent_cluster_management` for creating cluster and adding node procedures	2023-01-20 13:29:25 +01:00
Kamil Braun	b0313e670b	conf: enable `consistent_cluster_management` by default Raft will be turned on by default in new clusters. Fixes #12572	2023-01-20 13:29:06 +01:00
Botond Dénes	0d64f327e1	Merge 'gdb: Introduce 'scylla range-tombstones' command' from Tomasz Grabiec Prints and validates range tombstones in a given container. Currently supported containers: - mutation_partition Example: ``` (gdb) scylla range-tombstones $mp { start: ['a', 'b'], kind: bound_kind::excl_start, end: ['a', 'b'], kind: bound_kind::incl_end, t: {timestamp = 1672546889091665, deletion_time = {__d = {__r = 1672546889}}} } { start: ['a', 'b'], kind: bound_kind::excl_start, end: ['a', 'c'] kind: bound_kind::incl_end, t: {timestamp = 1673731764010123, deletion_time = {__d = {__r = 1673731764}}} } ``` Closes #12571 * github.com:scylladb/scylladb: gdb: Introduce 'scylla range-tombstones' gdb: Introduce 'scylla set-schema' gdb: Extract purse_bytes() in managed_bytes_printer	2023-01-20 11:21:34 +02:00
Nadav Har'El	3d78dbd9f2	test/cql-pytest: regression tests for null lookup in local SI We noticed that old branches of Scylla had problems with looking up a null value in a local secondary index - hanging or crashing. This patch includes tests to reproduce these bugs. The tests pass on current master - apparently this bug has already been fixed, but we didn't have a regression test for it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12570	2023-01-19 23:58:33 +02:00
Alejo Sanchez	51e84508ee	test.py: handle broken clusters for Python suite If the after test check fails (!is_after_test_ok), discard the cluster and raise exception so context manager (pool) does not recycle it. Ignore Pool exception re-raised by the context manager. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-19 21:43:50 +01:00
Alejo Sanchez	c886a05b37	test.py: Pool discard method Add a context manager discard() method to tell it to discard the object. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-01-19 21:43:45 +01:00
Avi Kivity	b4d91d87db	Merge 'build: fix build problems in Nix development environment' from Piotr Grabowski This PR fixes three problems that prevented/could prevent a successful build in ScyllaDB's Nix development environment. The first commit adds a missing `abseil-cpp` dependency to Nix devenv, as this dependency is now required after `8635d2442`. The second commit bumps the version of Lua from 5.3 to 5.4, as after `9dd5107919` a 4-argument version of `lua_resume` (only available in Lua 5.4) is used in the ScyllaDB codebase. The third commit explicitly adds `rustc` to Nix devenv dependencies. This places `rustc` from nixpkgs on the `PATH`, preventing `cargo` from executing `rustc` installed globally on the system (see the commit message for additional reasoning). After those changes, ScyllaDB can be successfully built in both `nix-shell .` and `nix develop .` environments. Closes #12568 * github.com:scylladb/scylladb: build: explicitly add rustc to Nix devenv build: bump Lua version (5.3 -> 5.4) in Nix devenv build: add abseil-cpp dependency to Nix devenv	2023-01-19 21:52:37 +02:00
Tomasz Grabiec	95547162c0	gdb: Introduce 'scylla range-tombstones' Prints and validates range tombstones in a given container. Currently supported containers: - mutation_partition Example: (gdb) scylla range-tombstones $mp { start: ['a', 'b'], kind: bound_kind::excl_start, end: ['a', 'b'], kind: bound_kind::incl_end, t: {timestamp = 1672546889091665, deletion_time = {__d = {__r = 1672546889}}} } { start: ['a', 'b'], kind: bound_kind::excl_start, end: ['a', 'c'] kind: bound_kind::incl_end, t: {timestamp = 1673731764010123, deletion_time = {__d = {__r = 1673731764}}} }	2023-01-19 19:58:13 +01:00
Tomasz Grabiec	f759b35596	gdb: Introduce 'scylla set-schema' Sets the current schema to be used by schema-aware commands. Setting the schema allows some commands and printers to interpret schema-dependent objects and present them in a more friendly form. Some commands require schema to work, for example to sort keys, and will fail otherwise.	2023-01-19 19:58:13 +01:00
Tomasz Grabiec	797bc7915d	gdb: Extract purse_bytes() in managed_bytes_printer	2023-01-19 19:58:13 +01:00
Kamil Braun	2f84e820fd	test/pylib: scylla_cluster: return error details from test framework endpoints If an endpoint handler throws an exception, the details of the exception are not returned to the client. Normally this is desirable so that information is not leaked, but in this test framework we do want to return the details to the client so it can log a useful error message. Do it by wrapping every handler into a catch clause that returns the exception message. Also modify a bit how HTTPErrors are rendered so it's easier to discern the actual body of the error from other details (such as the params used to make the request etc.) Before: ``` E test.pylib.rest_client.HTTPError: HTTP error 500: 500 Internal Server Error E E Server got itself in trouble, params None, json None, uri http+unix://api/cluster/before-test/test_stuff ``` After: ``` E test.pylib.rest_client.HTTPError: HTTP error 500, uri: http+unix://api/cluster/before-test/test_stuff, params: None, json: None, body: E Failed to start server at host 127.155.129.1. E Check the log files: E /home/kbraun/dev/scylladb/testlog/test.py.dev.log E /home/kbraun/dev/scylladb/testlog/dev/scylla-1.log ``` Closes #12563	2023-01-19 17:47:13 +02:00
Kamil Braun	3ed3966f13	test/pylib: scylla_cluster: release cluster IPs when stopping ScyllaClusterManager When we obtained a new cluster for a test case after the previous test case left a dirty cluster, we would release the old cluster's used IP addresses (`_before_test` function). However, we would not release the last cluster's IP after the last test case. We would run out of IPs with sufficiently many test files or `--repeat` runs. Fix this. Also reorder the operations a bit: stop the cluster (and release its IPs) before freeing up space in the cluster pool (i.e. call `self.cluster.stop()` before `self.clusters.steal()`). This reduces concurrency a bit - fewer Scyllas running at the same time, which is good (the pool size gives a limit on the desired max number of concurrently running clusters). Killing a cluster is quick so it won't make a significant difference for the next guy waiting on the pool. Closes #12564	2023-01-19 17:46:46 +02:00
Piotr Grabowski	4068efa173	build: explicitly add rustc to Nix devenv Before this patch, "cargo" was the only Rust toolchain dependency in Nix development environment. Due to the way "cargo" tool is packaged in Nix, "cargo" would first try to use "rustc" from PATH (for example some version already installed globally on OS). If it didn't find any, it would fallback to "rustc" from nixpkgs. There are issues with such approach: - "rustc" installed globally on the system could be old. - the goal of having a Nix development environment is that such environment is separate from the programs installed globally on the system and the versions of all tools are pinned (via flake.lock). Fix this problem by adding rustc to nativeBuildInputs in default.nix. After this patch, "rustc" from nixpkgs is present on the PATH (potentially overriding "rustc" already installed on the system), so "cargo" can correctly use it. You can validate this behavior experimentally by adding a fake failing rustc before entering the Nix development environment: mkdir fakerustc echo '#!/bin/bash' >> fakerustc/rustc echo 'exit 1' >> fakerustc/rustc chmod +x fakerustc/rustc export PATH=$(pwd)/fakerustc:$PATH nix-shell .	2023-01-19 15:53:49 +01:00
Piotr Grabowski	1b8a6b160e	build: bump Lua version (5.3 -> 5.4) in Nix devenv A recent commit (`9dd5107919`) started using a 4-argument version of lua_resume, which is only available in Lua 5.4. This caused build problems when trying to build Scylla in Nix development environment: tools/lua_sstable_consumer.cc:1292:19: error: no matching function for call to 'lua_resume' ret = lua_resume(l, nullptr, nargs, &nresults); ^~~~~~~~~~ /nix/store/wiz3xb19x2pv7j3hf29rbafm4s5zp2kx-lua-5.3.6/include/lua.h:290:15: note: candidate function not viable: requires 3 arguments, but 4 were provided LUA_API int (lua_resume) (lua_State *L, lua_State *from, int narg); ^ 1 error generated. Fix the problem by bumping the version of Lua from 5.3 to 5.4 in default.nix. Since "lua54Packages.lua" was added to nixpkgs fairly recently (NixOS/nixpkgs#207862), flake.lock is updated to get the newest version of nixpkgs (updated using "nix flake update" command).	2023-01-19 15:53:49 +01:00
Marcin Maliszkiewicz	7230841431	alternator: unify json streaming heuristic Main assumption here is that if is_big is good enough for GetBatchItems operation it should work well also for Scan, Query and GetRecords. And it's easier to maintain more unified code. Additionally 'future<> print' documentation used for streaming suggests that there is quite big overhead so since it seems the only motivation for streaming was to reduce contiguous allocation size below some threshold we should not stream when this threshold is not exceeded. Closes #12164	2023-01-19 16:40:43 +02:00
Anna Stuchlik	20f7848661	docs: add a missing redirection for the Cqlsh page This PR is not related to any reported issue in the repo. I've just discovered a broken link in the university caused by a missing redirection. Closes #12567	2023-01-19 16:37:58 +02:00
Piotr Grabowski	fbc042ff02	build: add abseil-cpp dependency to Nix devenv After `8635d2442` commit, the abseil submodule was removed in favor of using pre-built abseil distribution. Installation of abseil-cpp was added to install-dependencies.sh and dbuild image, but no change was made to the Nix development environment, which resulted in error while executing ./configure.py (while in Nix devenv): Package absl_raw_hash_set was not found in the pkg-config search path. Perhaps you should add the directory containing `absl_raw_hash_set.pc' to the PKG_CONFIG_PATH environment variable No package 'absl_raw_hash_set' found Fix the issue by adding "abseil-cpp" to buildInputs in default.nix.	2023-01-19 15:03:55 +01:00
Nadav Har'El	18be50582d	test/cql-pytest: add tests for behavior of unset values Recently, commit `0b418fa` made the checking for "unset" values more centralized and more robust, but as the tests added in this patch show, the situation is good (and in particular, that #10358 is solved). The tests in this patch check that the behavior of "unset" values in the CQL v4 protocol matches Cassandra's behavior and its documentation, and how it compares to our wishes of how we want unset values to behave. One of these tests fail on Cassandra (we consider this a Cassandra bug). One test fails on Scylla because it doesn't yet support arithmetic expressions (Refs #2693). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12534	2023-01-19 15:48:07 +02:00
Nadav Har'El	9433108158	Merge 'Allow transient list values to contain NULLs' from Avi Kivity The CQL protocol and specification call for lists with NULLs in some places. For example, the statement: ```cql UPDATE tab SET x = 3 IF y IN (1, 2, NULL) WHERE pk = 4 ``` has a list `(1, 2, NULL)` that contains NULL. Although the syntax is tuple-like, the value is a list; consider the same statement as a prepared statement: ```cql UPDATE tab SET x = :x IF y IN :y_values WHERE pk = :pk ``` `:y_values` must have a list type, since the number of elements is unknown. Currently, this is done with special paths inside LWT that bypass normal evaluation, but if we want to unify those paths, we must allow NULLs in lists (except in storage). This series does that. Closes #12411 * github.com:scylladb/scylladb: test: materialized view: add test exercising synthetic empty-type columns cql3: expr: relax evaluate_list() to allow NULL elements types: allow lists with NULL test: relax NULL check test predicate cql3, types: validate listlike collections (sets, lists) for storage types: make empty type deserialize to non-null value	2023-01-19 15:15:16 +02:00
Botond Dénes	d661d03057	Merge 'main, test: integrate perf tools into scylla' from Kefu Chai The following tests are integrated into the scylla executable: - perf_fast_forward - perf_row_cache_update - perf_simple_query - perf_sstable before this change ```console $ size build/release/scylla text data bss dec hex filename 82284664 288960 335897 82909521 4f11951 build/release/scylla $ ls -l build/release/scylla -rwxrwxr-x 1 kefu kefu 1719672112 Jan 19 17:51 build/release/scylla ``` after this change ```console $ size build/release/scylla text data bss dec hex filename 84349449 289424 345257 84984130 510c142 build/release/scylla $ ls -l build/release/scylla -rwxrwxr-x 1 kefu kefu 1774204800 Jan 19 17:52 build/release/scylla ``` Fixes #12484 Closes #12558 * github.com:scylladb/scylladb: main: move perf_sstable into scylla main: move perf_row_cache_update into scylla test: perf_row_cache_update: add static specifier to local functions main: move perf_fast_forward into scylla main: move perf_simple_query into scylla test: extract debug::the_database out main: shift the args when checking exec_name main: extract lookup_main_func() out	2023-01-19 15:01:30 +02:00
Kamil Braun	147dd73996	test/pylib: scylla_cluster: mark cluster as dirty if it fails to boot If a cluster fails to boot, it saves the exception in `self.start_exception` variable; the exception will be rethrown when a test tries to start using this cluster. As explained in `before_test`: ``` def before_test(self, name) -> None: """Check that the cluster is ready for a test. If there was a start error, throw it here - the server is running when it's added to the pool, which can't be attributed to any specific test, throwing it here would stop a specific test.""" ``` It's arguable whether we should blame some random test for a failure that it didn't cause, but nevertheless, there's a problem here: the `start_exception` will be rethrown and the test will fail, but then the cluster will be simply returned to the pool and the next test will attempt to use it... and so on. Prevent this by marking the cluster as dirty the first time we rethrow the exception. Closes #12560	2023-01-19 14:26:57 +02:00
Marcin Maliszkiewicz	4c33791f96	alternator: eliminate regexes from the hot path This decreases the whole alternator::get_table cpu time by 78% (from 2.8 us to 0.6 us on my cpu). In perf_simple_query it decreases allocs/op by 1.6% (by removing 4 allocations) and increases median tps by 3.4%. Raw results from running: ./build/release/test/perf/perf_simple_query_g --smp 1 \ --alternator forbid --default-log-level error \ --random-seed=1235000092 --duration=180 --write Before the patch: median 46903.65 tps (197.2 allocs/op, 12.1 tasks/op, 170886 insns/op, 0 errors) median absolute deviation: 210.15 maximum: 47354.59 minimum: 42535.63 After the patch: median 48484.76 tps (194.1 allocs/op, 12.1 tasks/op, 168512 insns/op, 0 errors) median absolute deviation: 317.32 maximum: 49247.69 minimum: 44656.38 Closes #12445	2023-01-19 13:23:24 +02:00
Avi Kivity	9029b8dead	test: disable commitlog O_DSYNC, preallocation Commitlog O_DSYNC is intended to make Raft and schema writes durable in the face of power loss. To make O_DSYNC performant, we preallocate the commitlog segments, so that the commitlog writes only change file data and not file metadata (which would require the filesystem to commit its own log). However, in tests, this causes each ScyllaDB instance to write 384MB of commitlog segments. This overloads the disks and slows everything down. Fix this by disabling O_DSYNC (and therefore preallocation) during the tests. They can't survive power loss, and run with --unsafe-bypass-fsync anyway. Closes #12542	2023-01-19 11:14:05 +01:00
Kefu Chai	7f5bb19d1f	main: move perf_sstable into scylla * configure.py: - include `test/perf/perf_sstable` and its dependencies in scylla_perfs * test/perf/perf_sstable.cc: change `main()` to `perf::scylla_sstable_main()` * test/perf/entry_point.hh: add `perf::scylla_sstable_main()` before this change, we have a tool at `test/perf/perf_sstable` for running performance tests by exercising sstable related operations. after this change, the `test/perf/perf_sstable` is integrated into `scylla` as a subcommand. so we can run `scylla perf-sstable [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:52 +08:00
Kefu Chai	240f2c6f00	main: move perf_row_cache_update into scylla * configure.py: - include `test/perf/perf_row_cache_update.cc` in scylla_perfs * main.cc: - dispatch "perf-row-cache-update" subcommand to `perf::scylla_row_cache_update_main` * test/perf/perf_fast_forward.cc: change `main()` to `perf::scylla_row_cache_update_main()` * test/perf/entry_point.hh: add `perf::scylla_row_cache_update_main()` before this change, we have a tool at `test/perf/perf_row_cache_update` for running performance tests by updating row cache. after this change, the `test/perf/perf_row_cache_update` is integrated into `scylla` as a subcommand. so we can run `scylla perf-row-cache-update [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:46 +08:00
Kefu Chai	4e390b9a05	test: perf_row_cache_update: add static specifier to local functions now that these functions are only used by the same compiling unit, they don't need external linkage. so let's hide them using `static`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:46 +08:00
Kefu Chai	228ccdc1c7	main: move perf_fast_forward into scylla * configure.py: - include `test/perf/perf_simple_query.cc` in scylla_perfs * main.cc: - dispatch "perf-fast-forward" subcommand to `perf::scylla_fast_forward_main` * test/perf/perf_fast_forward.cc: change `main()` to `perf::scylla_simple_query_main()` * test/perf/entry_point.hh: add `perf::scylla_simple_query_main()` before this change, we have a tool at `test/perf/perf_fast_forward` for running performance tests by fast forwarding the reader. after this change, the `test/perf/perf_fast_forward` is integrated into `scylla` as a subcommand. so we can run `scylla perf-fast-forward [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:40 +08:00
Kefu Chai	09de031cab	main: move perf_simple_query into scylla * configure.py: - include scylla_perfs in scylla - move 'test/lib/debug.cc' down scylla_perfs, as the latter uses `debug::the_database` - link `scylla` against seastar_testing_libs also. because we use the helpers in `test/lib/random_utils.hh` for generating random numbers / sequences in `perf_simple_query.cc`, and `random_utils.hh` references `seastar::testing::local_random_engine` as a local RNG. but `seastar::testing::local_random_engine` is included in `libseastar_testing.a` or `libseastar_perf_testing.a`. since we already have the rules for linking against `libseastar_testing.a`, let's just reuse them, and link `scylla` against this new dependency. * main.cc: - dispatch "perf-simple-query" subcommand to `perf::scylla_simple_query_main` * test/perf/perf_simple_query.cc: change `main()` to `perf::scylla_simple_query_main()` * test/perf/entry_point.hh: define the main function entries so `main.cc` can find them. it's quite like how we collect the entries in `tools/entry_point.hh` before this change, we have a tool at `test/perf/perf_simple_query` for running performance tests by sending simple queries to a single-node cluster. after this change, the `test/perf/perf_simple_query` is integrated into `scylla` as a subcommand. so we can run `scylla perf-simple-query [options, ...]` to perform the same tests previously driven by the tool. Fixes #12484 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:30 +08:00
Kefu Chai	c65692a13a	test: extract debug::the_database out we want to integrate some perf test into scylla executable, so we can run them on a regular basis. but `test/lib/cql_test_env.cc` shares `debug::the_database` with `main.cc`, so we cannot just compile them into a single binary without changing them. before this change, both `test/lib/cql_test_env.cc` and `main.cc` define `debug::the_database`. after this change, `debug::the_database` is extracted into `debug.cc`, so it compiles into a separate compilation unit. and scylla and tests using seastar testing framework are linked against `debug.cc` via `scylla_core` respectively. this paves the road to integrating scylla with the tests linking against `test/lib/cql_test_env.cc`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-19 17:42:23 +08:00
Nadav Har'El	0ff0c80496	test/cql-pytest: un-xfail tests for UNSET values Commit `0b418fa` improved the error detection of unset values in inappropriate CQL statements, and some of the unit tests translated from Cassandra started to pass, so this patch removes their "xfail" mark. In a couple of places Scylla's error message is worded differently from Cassandra, so the test was modified to look for a shorter string common to both implementations. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12553	2023-01-19 07:47:08 +02:00
Kefu Chai	6a3b19b53d	test/perf: replace "std::cout <<" with fmt::print() for better readability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12559	2023-01-19 07:45:13 +02:00
Avi Kivity	aab5954cfb	Merge 'reader_concurrency_semaphore: add more layers of defense against OOM' from Botond Dénes The reader concurrency semaphore has no mechanism to limit the memory consumption of already admitted reads. Once the collective memory consumption of all the admitted reads is above the limit, all it can do is to not admit any more. Sometimes this is not enough and the memory consumption of the already admitted reads balloons to the point of OOMing the node. This pull-request offers a solution to this: it introduces two more layers of defense above this: a soft and a hard limit. Both are multipliers applied on the semaphore's normal memory limit. When the soft limit threshold is surpassed, all readers but one are blocked via a new blocking `request_memory()` call which is used by the `tracking_file_impl`. The reader to be allowed to proceed is chosen at random, it is the first reader which happens to request memory after the limit is surpassed. This is both very simple and should avoid situations where the algorithm choosing the reader to be allowed to proceed chooses a reader which will then always time out. When the hard limit threshold is surpassed, `reader_concurrency_semaphore::consume()` starts throwing `std::bad_alloc`. This again will result in eliminating whichever reader was unlucky enough to request memory at the right moment. With this, the semaphore is now effectively enforcing an upper bound for memory consumption, defined by the hard limit. 
Refs: https://github.com/scylladb/scylladb/issues/11927 Closes #11955 * github.com:scylladb/scylladb: test: reader_concurrency_semaphore_test: add tests for semaphore memory limits reader_permit: expose operator<<(reader_permit::state) reader_permit: add id() accessor reader_concurrency_semaphore: add foreach_permit() reader_concurrency_semaphore: document the new memory limits reader_concurrency_semaphore: add OOM killer reader_concurrency_semaphore: make consume() and signal() private test: stop using reader_concurrency_semaphore::{consume,signal}() directly reader_concurrency_semaphore: move consume() out-of-line reader_permit: consume(): make it exception-safe reader_permit: resource_units::reset(): only call consume() if needed reader_concurrency_semaphore: tracked_file_impl: use request_memory() reader_concurrency_semaphore: add request_memory() reader_concurrency_semaphore: wrap wait list reader_concurrency_semaphore: add {serialize,kill}_limit_multiplier parameters test/boost/reader_concurrency_semaphore_test: dummy_file_impl: don't use hardcoded buffer size reader_permit: add make_new_tracked_temporary_buffer() reader_permit: add get_state() accessor reader_permit: resource_units: add constructor for already consumed res reader_permit: resource_units: remove noexcept qualifier from constructor db/config: introduce reader_concurrency_semaphore_{serialize,kill}_limit_multiplier scylla-gdb.py: scylla-memory: extract semaphore stats formatting code scylla-gdb.py: fix spelling of "graphviz"	2023-01-18 17:02:55 +02:00
Avi Kivity	9a54cb5deb	Merge 'cql3/expr: make it possible to prepare binary_operator' from Jan Ciołek `prepare_expression` takes an unprepared CQL expression straight from the parser output and prepares it. Preparation consists of various type checks that are needed to ensure that the expression is correct and to reason about it. While `prepare_expression` supports a number of different types of expressions, until now it was impossible to prepare a `binary_operator`. Eventually we would like to be able to prepare all kinds of expressions, so this PR adds the missing support for `binary_operator`. Closes #12550 * github.com:scylladb/scylladb: expr_test: test preparing binary_operator with NULL RHS expr_test: test preparing IS NOT NULL binary_operator expr_test: test preparing binary_operator with LIKE expr_test: test preparing binary_operator with CONTAINS KEY expr_test: test preparing binary_operator with CONTAINS expr_test: test preparing binary_operator with IN expr_test: test preparing binary_operator with =, !=, <, <=, >, >= expr_test: use make_*_untyped function in existing tests expr_test_utils: add utilities to create untyped_constant expr_test_utils: add make_float_* and make_double_* cql3: expr: make it possible to prepare binary_operator using prepare_expression cql3/expr: check that RHS of IS NOT NULL is a null value when preparing binary operators cql3: expr: pass non-empty keyspace name in prepare_binary_operator cql3: expr: take reference to schema in prepare_binary_operator	2023-01-18 16:55:18 +02:00
Jenkins Promoter	75a3dd2fc8	release: prepare for 5.3.0-dev	2023-01-18 16:22:41 +02:00
Kefu Chai	965443d6be	main: shift the args when checking exec_name instead of introducing yet another variable for tracking the status, update the args right away. for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-18 22:22:10 +08:00
Kefu Chai	835cd9bfc9	main: extract lookup_main_func() out refactor main() to extract lookup_main_func() out, so we find the main_func in a table instead of using a lengthy if-then-else clause. when the length of the list of candidates of dispatch grows, the code would be less structured. so in this change, the code looking up for the main_func is extracted into a dedicated function for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-01-18 22:22:10 +08:00
Avi Kivity	71bbd7475c	Update seastar submodule * seastar 8889cbc198...d41af8b592 (14): > Merge 'Perf stall detector related improvements' from Travis Downs Ref #8828, #7882, #11582 (may help make progress) > build: pass HEAPPROF definition to src/core/reactor.cc too > Limit memory address space per core to 64GB when hwloc is not available > build: revert use pkg_search_module(.. IMPORTED_TARGET ..) changes > Fix missing newlines in seastar-addr2line > Use an integral type for uniform_int_distribution > Merge 'tls_test: use a dedicated https server for testing' from Kefu Chai > build: use ${CMAKE_BINARY_DIR} when running 'cmake --build ..' > build: do not set c-ares_FOUND with PARENT_SCOPE > reactor: drop unused member function declaration > sstring: refactor to_sstring() using fmt::format_to() > http: delay input stream close until responses sent > build: enable non-library targets using default option value > Merge 'sstring: specialize uninitialize_string() and use resize_and_overwrite if available' from Kefu Chai Closes #12509	2023-01-18 15:50:57 +02:00
Jan Ciolek	ae0e955b90	expr_test: test preparing binary_operator with NULL RHS Make sure that preparing binary_operator works properly when the RHS is NULL. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:46 +01:00
Jan Ciolek	65b8a09409	expr_test: test preparing IS NOT NULL binary_operator Add unit test which check that preparing binary_operators which represent IS NOT NULL works as expected Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:46 +01:00
Jan Ciolek	5b3e6769f1	expr_test: test preparing binary_operator with LIKE Add a unit test which checks that preparing binary_operators with the LIKE operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	e876496f7f	expr_test: test preparing binary_operator with CONTAINS KEY Add unit test which check that preparing binary_operators with the CONTAINS KEY operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	c6d2e1a03e	expr_test: test preparing binary_operator with CONTAINS Add unit test which check that preparing binary_operators with the CONTAINS operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	6b147ecaea	expr_test: test preparing binary_operator with IN Add unit test which check that preparing binary_operators with the IN operation works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:45 +01:00
Jan Ciolek	669d791250	expr_test: test preparing binary_operator with =, !=, <, <=, >, >= Add unit test which check that preparing binary_operators with basic comparison operations works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	60803d12a9	expr_test: use make_*_untyped function in existing tests Use the newly introduced convenience methods that create untyped_constant in existing tests. This will make the code more readable by removing visual clutter that came with the previous overly verbose code. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	819390f9fe	expr_test_utils: add utilities to create untyped_constant expression tests often need to create instances of untyped_constant. Creating them by hand is tedious because the required code is overly verbose. Having convenience functions for it speeds up test writing. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	362bf7f534	expr_test_utils: add make_float_* and make_double_* Add utilities to create float and double values in tests. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:44 +01:00
Jan Ciolek	da3c07955a	cql3: expr: make it possible to prepare binary_operator using prepare_expression prepare_expression didn't allow preparing binary_operators, so this is now implemented. If prepare_binary_operator is unable to infer the types it will fail with an exception instead of returning std::nullopt, but we can live with that for now. Preparing binary_operators inside the WHERE clause is currently more complicated than just calling prepare_binary_operator. Preparation of the WHERE clause is done inside statement_restrictions constructor. It's done by iterating over all binary_operators, validating them and then preparing. The validation contains additional checks with custom error messages. Preparation has to be done after validation, because otherwise the error messages will change and some tests will start failing. Because of that we can't just call prepare_expression on the WHERE clause yet. It's still useful to have the ability to prepare binary_operators using prepare_expression. In cases where we know that the WHERE clause is valid, we can just call prepare_expression and be done with it. Once grammar is fully relaxed the artificial constraints checked by the validation code will be removed and it will be possible to prepare the whole WHERE clause using just prepare_expression. prepare_expression does a bit more than prepare_binary_operator. In case where both sides of the binary_operator are known it will evaluate the whole binary_operator to a constant value. Query analysis code is NOT ready to encounter constant boolean values inside the WHERE clause, so for the WHERE we still use prepare_binary_operator which doesn't evaluate the binary_operator to a constant value. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:43 +01:00
Jan Ciolek	5f8b1a1a60	cql3/expr: check that RHS of IS NOT NULL is a null value when preparing binary operators When preparing a binary operator we first prepare the LHS, which gives us information about its type and allows to infer the desired type of RHS. Then the RHS is prepared with the expectation that it is compatible with the inferred type. This is enough for all types of operations apart from IS NOT NULL. For IS NOT we should also check that the RHS value is actually null. It's not enough to check that RHS is of right type. Before this change preparing `int_col IS NOT 123` would end in success, which is wrong. The missing check doesn't cause any real problems, it's impossible for the user to produce such input because the parser will reject it. Still it's better to have the check because in the future the grammar might get more relaxed and the parser could become more generic, making it possible to write such things. It would be better to introduce unary_operators, but that's a bigger change. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:43 +01:00
Jan Ciolek	703e9f21ff	cql3: expr: pass non-empty keyspace name in prepare_binary_operator For some reason we passed an empty keyspace name to prepare_expression when preparing the LHS of a binary operator. This doesn't look correct. We have keyspace name available from the schema_ptr so let's use that. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:43 +01:00
Jan Ciolek	9a0c5789a2	cql3: expr: take reference to schema in prepare_binary_operator prepare_binary_operator takes a schema_ptr, but it would be useful to take a reference to schema instead. Every schema_ptr can be easily converted to a reference so there is no loss of functionality. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2023-01-18 12:04:40 +01:00
Nadav Har'El	48e2d6a541	Merge 'utils: throw error on malformed input in base64 decode' from Marcin Maliszkiewicz Several cases were fixed in these patches, all are related to processing of malformed base64 data. Main purpose was to bring alternator implementation closer to what DynamoDB does. We now: - Throw error when padding is missing during base64 decoding - Throw error when base64 data is malformed - In alternator when invalid base64 data is fetched from DB (as opposed to being part of user's request) we now exclude such row during filtering Additionally some small code quality improvements: - avoid unnecessary type conversions in calls to rjson:from_strings functions - avoid some copy constructions in calls to rjson:from_strings functions Fixes https://github.com/scylladb/scylladb/issues/6487 Closes #11944 * github.com:scylladb/scylladb: alternator: evaluate expressions as false for stored malformed binary data rjson: avoid copy constructors in from_string calls when possible alternator: remove unused parameters from describe_items func utils: throw error on malformed input in base64 decode utils: throw error on missing padding in base64 decode	2023-01-18 12:40:57 +02:00
Avi Kivity	561f4ca057	test: materialized view: add test exercising synthetic empty-type columns Materialized views inject synthetic empty-type columns in some conditions. Since we just touched empty-type serialization/deserialization, add a test to exercise it and make sure it still works.	2023-01-18 10:38:24 +02:00
Avi Kivity	04925a7b29	cql3: expr: relax evaluate_list() to allow NULL elements Tests are similarly relaxed. A test is added in lwt_test to show that insertion of a list with NULL is still rejected, though we allow NULLs in IF conditions. One test is changed from a list of longs to a list of ints, to prevent churn in the test helper library.	2023-01-18 10:38:24 +02:00
Avi Kivity	390a0ca47b	types: allow lists with NULL Allow transient lists that contain NULL throughout the evaluation machinery. This makes it possible to evaluate things like `IF col IN (1, 2, NULL)` without hacks, once LWT conditions are converted to expressions. A few tests are relaxed to accommodate the new behavior: - cql_query_test's test_null_and_unset_in_collections is relaxed to allow `WHERE col IN ?`, with the variable bound to a list containing NULL; now it's explicitly allowed - expr_test's evaluate_bind_variable_validates_no_null_in_list was checking generic lists for NULLs, and was similarly relaxed (and renamed) - expr_test's evaluate_bind_variable_validates_null_in_lists_recursively was similarly relaxed to allow NULLs.	2023-01-18 10:38:24 +02:00
Avi Kivity	00145f9ada	test: relax NULL check test predicate When we start allowing NULL in lists in some contexts, the exact location where an error is raised (when it's disallowed) will change. To prepare for that, relax the exception check to just ensure the word NULL is there, without caring about the exact wording.	2023-01-18 10:38:24 +02:00
Avi Kivity	5f8540ecfa	cql3, types: validate listlike collections (sets, lists) for storage Lists allow NULL in some contexts (bind variables for LWT "IN ?" conditions), but not in most others. Currently, the implementation just disallows NULLs in list values, and the cases where it is allowed are hacked around. To reduce the special cases, we'll allow lists to have NULLs, and just restrict them for storage. This is similar to how scalar values can be NULL, but not when they are part of a partition key. To prepare for the transition, identify the locations where lists (and sets, which share the same storage) are stored as frozen values and add a NULL check there. Non-frozen lists already have the check. Since sets share the same format as lists, apply the same to them. No actual checks are done yet, since NULLs are impossible. This is just a stub.	2023-01-18 10:38:24 +02:00
Avi Kivity	da4abccf89	types: make empty type deserialize to non-null value The empty type is used internally to implement CQL sets on top of multi-cell maps. The map's key (an atomic cell) represents the set value, and the map's value is discarded. Since it's unneeded we use an internal "empty" type. Currently, it is deserialized into a `data_value` object representing a NULL. Since it's discarded, it really doesn't matter. However, with the impending change to allow NULLs in lists, it does matter: 1. the coordinator sets the 'collections_as_maps' flag for LWT requests since it wants list indexes (this affects sets too). 2. the replica responds by serializing a set as a map. 3. since we start allowing NULL collection values, we now serialize those NULLs as NULLs. 4. the coordinator deserializes the map, and complains about NULL values, since those are not supported. The solution is simple: deserialize the empty value as a non-NULL object. We create an empty empty_type_representation and add the scaffolding needed. Serialization and deserialization are already coded; they were just never called for NULL values (which were serialized with size 0, in collections, rather than size -1, luckily). A unit test is added.	2023-01-18 10:38:24 +02:00
Tomasz Grabiec	563998b69a	Merge 'raft: improve group 0 reconfiguration failure handling' from Kamil Braun Make it so that failures in `removenode`/`decommission` don't lead to reduced availability, and any leftovers in group 0 can be removed by `removenode`: - In `removenode`, make the node a non-voter before removing it from the token ring. This removes the possibility of having a group 0 voting member which doesn't correspond to a token ring member. We can still be left with a non-voter, but that doesn't reduce the availability of group 0. - As above but for `decommission`. - Make it possible to remove group 0 members that don't correspond to token ring members from group 0 using `removenode`. - Add an API to query the current group 0 configuration. Fixes #11723. Closes #12502 * github.com:scylladb/scylladb: test: test_topology: test for removing garbage group 0 members test/pylib: move some utility functions to util.py db: system_keyspace: add a virtual table with raft configuration db: system_keyspace: improve system.raft_snapshot_config schema service: storage_service: better error handling in `decommission` service: storage_service: fix indentation in removenode service: storage_service: make `removenode` work for group 0 members which are not token ring members service/raft: raft_group0: perform read_barrier in wait_for_raft service: storage_service: make leaving node a non-voter before removing it from group 0 in decommission/removenode test: test_raft_upgrade: remove test_raft_upgrade_with_node_remove service/raft: raft_group0: link to Raft docs where appropriate service/raft: raft_group0: more logging service/raft: raft_group0: separate function for checking and waiting for Raft	2023-01-17 21:23:15 +01:00
Kamil Braun	d134c458e5	test/pylib: increase timeout when waiting for cluster before test Increase the timeout from default 5 minutes to 10 minutes. Sent as a workaround for #12546 to unblock next promotions. Closes #12547	2023-01-17 21:03:09 +02:00
Kamil Braun	4f1c317bdc	test: test_raft_upgrade: stop servers gracefully in test_recovery_after_majority_loss This test is frequently failing due to a timeout when we try to restart one of the nodes. The shutdown procedure apparently hangs when we try to stop the `hints_manager` service, e.g.: ``` INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Stopped INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Stopped INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Asked to stop INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Stopped INFO 2023-01-13 03:22:56,997 [shard 0] hints_manager - Stopped ``` observe the 5 minute delay at the end. There is a known issue about `hints_manager` stop hanging: #8079. Now, for some reason, this is the only test case that is hitting this issue. We don't completely understand why. There is one significant difference between this test case and others: this is the only test case which kills 2 (out of 3) servers in the cluster and then tries to gracefully shutdown the last server. There's a hypothesis that the last server gets stuck trying to send hints to the killed servers. We weren't able to prove/falsify it yet. But if it's true, then this patch will: - unblock next promotions, - give us some important information when we see that the issue stops appearing. In the patch we shutdown all servers gracefully instead of killing them, like we do in the other test cases. Closes #12548	2023-01-17 20:51:09 +02:00
Pavel Emelyanov	4f415413d2	raft: Fix non-existing state_machine::apply_entry in docs The docs mention that method, but it doesn't exist. Instead, the state_machine interface defines plain .apply() one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12541	2023-01-17 12:53:05 +01:00
Kamil Braun	5545547d07	test: test_topology: test for removing garbage group 0 members Verify that `removenode` can remove group 0 members which are not token ring members.	2023-01-17 12:28:00 +01:00
Kamil Braun	c959ec455a	test/pylib: move some utility functions to util.py They were used in test_raft_upgrade, but we want to use them in other test files too.	2023-01-17 12:28:00 +01:00
Kamil Braun	a483915c62	db: system_keyspace: add a virtual table with raft configuration Add a new virtual table `system.raft_state` that shows the currently operating Raft configuration for each present group. The schema is the same as `system.raft_snapshot_config` (the latter shows the config from the last snapshot). In the future we plan to add more columns to this table, showing more information (like the current leader and term), hence the generic name. Adding the table requires some plumbing of `sharded<raft_group_registry>&` through function parameters to make it accessible from `register_virtual_tables`, but it's mostly straightforward. Also added some APIs to `raft_group_registry` to list all groups and find a given group (returning `nullptr` if one isn't found, not throwing an exception).	2023-01-17 12:28:00 +01:00
Kamil Braun	2bfe85ce9b	db: system_keyspace: improve system.raft_snapshot_config schema Remove the `ip_addr` column which was not used. IP addresses are not part of Raft configuration now and they can change dynamically. Swap the `server_id` and `disposition` columns in the clustering key, so when querying the configuration, we first obtain all servers with the current disposition and then all servers with the previous disposition (note that a server may appear both in current and previous).	2023-01-17 12:28:00 +01:00
Kamil Braun	c3ed82e5fb	service: storage_service: better error handling in `decommission` Improve the error handling in `decommission` in case `leave_group0` fails, informing the user what they should do (i.e. call `removenode` to get rid of the group 0 member), and allowing decommission to finish; it does not make sense to let the node continue to run after it leaves the token ring. (And I'm guessing it's also not safe. Or maybe impossible.)	2023-01-17 12:28:00 +01:00
Kamil Braun	beb0eee007	service: storage_service: fix indentation in removenode	2023-01-17 12:28:00 +01:00
Kamil Braun	aba33dd352	service: storage_service: make `removenode` work for group 0 members which are not token ring members Due to failures we might end up in a situation where we have a group 0 member which is not a token ring member: a decommission/removenode which failed after leaving/removing a node from the token ring but before leaving / removing a node from group 0. There was no way to get rid of such a group 0 member. A node that left the token ring must not be allowed to run further (or it can cause data loss, data resurrection and maybe other fun stuff), so we can't run decommission a second time (even if we tried, it would just say that "we're not a member of the token ring" and abort). And `removenode` would also not work, because it proceeds only if the node requested to be removed is a member of the token ring. We modify `removenode` so it can run in this situation and remove the group 0 member. The parts of `removenode` related to token ring modification are now conditioned on whether the node was a member of the token ring. The final `remove_from_group0` step is in its own branch. Some minor refactors were necessary. Some log messages were also modified so it's easier to understand which messages correspond to the "token movement" part of the procedure. The `make_nonvoter` step happens only if token ring removal happens, otherwise we can skip directly to `remove_from_group0`. We also move `remove_from_group0` outside the "try...catch", fixing #11723. The "node ops" part of the procedure is related strictly to token ring movement, so it makes sense for `remove_from_group0` to happen outside. Indentation is broken in this commit for easier reviewability, fixed in the following commit. Fixes: #11723	2023-01-17 12:28:00 +01:00
Kamil Braun	ec2cd29e42	service/raft: raft_group0: perform read_barrier in wait_for_raft Right now wait_for_raft is called before performing group 0 configuration changes. We want to also call it before checking for membership, for that it's desirable to have the most recent information, hence call read_barrier. In the existing use cases it's not strictly necessary, but it doesn't hurt.	2023-01-17 12:28:00 +01:00
Kamil Braun	db734cd74f	service: storage_service: make leaving node a non-voter before removing it from group 0 in decommission/removenode removenode currently works roughly like this: 1. stream/repair data so it ends up on new replica sets (calculated without the node we want to remove) 2. remove the node from the token ring 3. remove the node from group 0 configuration. If the procedure fails after step 2 but before step 3 finishes, we're in trouble: the cluster is left with an additional voting group 0 member, which reduces group 0's availability, and there is no way to remove this member because `removenode` no longer considers it to be part of the cluster (it consults the token ring to decide). Improve this failure scenario by including a new step at the beginning: make the node a non-voter in group 0 configuration. Then, even if we fail after removing the node from the token ring but before removing it from group 0, we'll only be left with a non-voter which doesn't reduce availability. We make a similar change for `decommission`: between `unbootstrap()` (which streams data) and `leave_ring()` (which removes our tokens from the ring), become a non-voter. The difference here is that we don't become a non-voter at the beginning, but only after streaming/repair. In `removenode` it's desirable to make the node a non-voter as soon as possible because it's already dead. In decommission it may be desirable for us to remain a voter if we fail during streaming because we're still alive and functional in that case. In a later commit we'll also make it possible to retry `removenode` to remove a node that is only a group 0 member and not a token ring member.	2023-01-17 12:28:00 +01:00
Kamil Braun	1eee349a17	test: test_raft_upgrade: remove test_raft_upgrade_with_node_remove The test would create a scenario where one node was down while the others started the Raft upgrade procedure. The procedure would get stuck, but it was possible to `removenode` the downed node using one of the alive nodes, which would unblock the Raft upgrade procedure. This worked because: 1. the upgrade procedure starts by ensuring that all peers can be contacted, 2. `removenode` starts by removing the node from the token ring. After removing the node from the token ring, the upgrade procedure becomes able to contact all peers (the peers set no longer contains the down node). At the end, after removing the node from the token ring, `removenode` would actually get stuck for a while, waiting for the upgrade procedure to finish before removing the peer from group 0. After the upgrade procedure finished, `removenode` would also finish. (so: first the upgrade procedure waited for removenode, then removenode waited for the upgrade procedure). We want to modify the `removenode` procedure and include a new step before removing the node from the token ring: making the node a non-voter. The purpose is to improve the possible failure scenarios. Previously, if the `removenode` procedure failed after removing the node from the token ring but before removing it from group 0, the cluster would contain a 'garbage' group 0 member which is a voter - reducing group 0's availability. If the node is made a non-voter first, then this failure will not be as big of a problem, because the leftover group 0 member will be a non-voter. However, to correctly perform group 0 operations including making someone a nonvoter, we must first wait for the Raft upgrade procedure to finish (or at least wait until everyone joins group 0). 
Therefore by including this 'make the node a non-voter' step at the beginning of `removenode`, we make it impossible to remove a token ring member in the middle of the upgrade procedure, on which the test case relied. The test case would get stuck waiting for the `removenode` operation to finish, which would never finish because it would wait for the upgrade procedure to finish, which would not finish because of the dead peer. We remove the test case; it was "lucky" to pass in the first place. We have a dedicated mechanism for handling dead peers during Raft upgrade procedure: the manual Raft group 0 RECOVERY procedure. There are other test cases in this file which are using that procedure.	2023-01-17 12:28:00 +01:00
Kamil Braun	4f0801406e	service/raft: raft_group0: link to Raft docs where appropriate Resolve some TODOs.	2023-01-17 12:28:00 +01:00
Kamil Braun	2befbaa341	service/raft: raft_group0: more logging Make the logs in leave_group0 consistent with logs in remove_from_group0.	2023-01-17 12:28:00 +01:00
Kamil Braun	77dc1c4c70	service/raft: raft_group0: separate function for checking and waiting for Raft leave_group0 and remove_from_group0 functions both start with the following steps: - if Raft is disabled or in RECOVERY mode, print a simple log message and abort - if Raft cluster feature flag is not yet enabled, print a complex log message and abort - wait for Raft upgrade procedure to finish - then perform the actual group 0 reconfiguration. Refactor these preparation steps to a separate function, `wait_for_raft`. This reduces code duplication; the function will also be used in more operations later (becoming a nonvoter or turning another server into a nonvoter). We also change the API so that the preparation function is called from outside by the caller before they call the reconfiguration function. This is because in later commits, some of the call sites (mainly `removenode`) will want to check explicitly whether Raft is enabled and wait for Raft's availability, then perform a sequence of steps related to group 0 configuration depending on the result. Also add a private function `raft_upgrade_complete()` which we use to assert that Raft is ready to be used.	2023-01-17 12:27:58 +01:00
Wojciech Mitros	5f45b32bfa	forward_service: prevent heap use-after-free of forward_aggregates Currently, we create `forward_aggregates` inside a function that returns the result of a future lambda that captures these aggregates by reference. As a result, the aggregates may be destructed before the lambda finishes, resulting in a heap use-after-free. To prolong the lifetime of these aggregates, we cannot use a move capture, because the lambda is wrapped in a with_thread_if_needed() call on these aggregates. Instead, we fix this by wrapping the entire return statement in a do_with(). Fixes #12528 Closes #12533	2023-01-17 13:25:57 +02:00
Botond Dénes	8ea128cc27	test: reader_concurrency_semaphore_test: add tests for semaphore memory limits	2023-01-17 05:27:04 -05:00
Botond Dénes	ec1c615029	reader_permit: expose operator<<(reader_permit::state)	2023-01-17 05:27:04 -05:00
Botond Dénes	78583b84f1	reader_permit: add id() accessor Effectively returns the address of the underlying permit impl as an `uintptr_t`. This can be used to determine the identity of the permit.	2023-01-17 05:27:04 -05:00
Botond Dénes	7f8469db27	reader_concurrency_semaphore: add foreach_permit() Allows iterating over all permits.	2023-01-17 05:27:04 -05:00
Botond Dénes	4c70b58993	reader_concurrency_semaphore: document the new memory limits	2023-01-17 05:27:04 -05:00
Botond Dénes	edb32cb171	reader_concurrency_semaphore: add OOM killer When the collective memory consumption of all readers goes above $kill_limit_multiplier * $memory_limit, consume() will throw std::bad_alloc(), instantly unwinding the read that is unlucky enough to have requested the last bytes of memory. This should help in situations where there are some problematic partitions, either because of large cells or because they are scattered in too many sstables. Currently nothing prevents such reads from bringing down the entire node via OOM.	2023-01-17 05:27:04 -05:00
Botond Dénes	81e2a2be7d	reader_concurrency_semaphore: make consume() and signal() private Using this API is quite dangerous as any mistakes can lead to leaking resources from the semaphore. Also, soon we will tie this API closer to permits, so they won't be as generic. Make them private so we don't have to worry about correct usage. All external users are patched away already.	2023-01-17 05:27:04 -05:00
Botond Dénes	ab18e7b178	test: stop using reader_concurrency_semaphore::{consume,signal}() directly These methods will soon be retired (made private) so migrate away from them. Consume memory through a permit instead. It is also safer this way: all memory consumed through the permit is guaranteed to be released when the permit is destroyed at the latest.	2023-01-17 05:27:04 -05:00
Botond Dénes	8f9e8aafdf	reader_concurrency_semaphore: move consume() out-of-line It's about to get a little bit more complex.	2023-01-17 05:27:04 -05:00
Botond Dénes	e4ef28284b	reader_permit: consume(): make it exception-safe reader_concurrency_semaphore::consume() will soon throw.	2023-01-17 05:27:04 -05:00
Botond Dénes	029269af42	reader_permit: resource_units::reset(): only call consume() if needed reset() is called from the destructor, with null resources. Calling consume() can be avoided in this case and in fact it is required as consume() is soon going to throw in some cases.	2023-01-17 05:27:04 -05:00
Botond Dénes	dd9a0a16e6	reader_concurrency_semaphore: tracked_file_impl: use request_memory() Use the recently added `request_memory()` to acquire the memory units for the I/O. This allows blocking all but one reader when memory consumption grows too high.	2023-01-17 05:27:04 -05:00
Botond Dénes	9ed5d861be	reader_concurrency_semaphore: add request_memory() A possibly blocking request for more memory. If the collective memory consumption of all reads goes above $serialize_limit_multiplier * $memory_limit this request will block for all but one reader (the first requester). Until this situation is resolved, that is, for as long as memory consumption stays above this limit, only this one reader is allowed to make progress. This should help rein in the memory consumption of reads in situations where it used to balloon without constraints.	2023-01-17 05:27:04 -05:00
Gleb Natapov' via ScyllaDB development	15ebd59071	lwt: upgrade stored mutations to the latest schema during prepare Currently they are upgraded during learn on a replica. There are two problems with this. First, the column mapping may not exist on a replica if it missed this particular schema (because it was down, for instance) and the mapping history is not part of the schema. In this case "Failed to look up column mapping for schema version" will be thrown. Second, the lwt request coordinator may not have the schema for the mutation either (because it was already freed from the registry), and when a replica tries to retrieve the schema from the coordinator the retrieval will fail, causing the whole request to fail with "Schema version XXXX not found". Both of those problems can be fixed by upgrading stored mutations during prepare on the node where they are stored. To upgrade a mutation its column mapping is needed, and it is guaranteed to be present at the node the mutation is stored at, since having the corresponding schema available is a prerequisite for storing it. After that the mutation is processed using the latest schema, which will be available on all nodes. Fixes #10770 Message-Id: <Y7/ifraPJghCWTsq@scylladb.com>	2023-01-17 11:14:46 +01:00
Raphael S. Carvalho	f2f839b9cc	compaction: LCS: don't reshape all levels if only a single breaks disjointness LCS reshape is compacting all levels if a single one breaks disjointness. That's unnecessary work because rewriting that single level is enough to restore disjointness. If multiple levels break disjointness, they'll each be reshaped in its own iteration, so reducing operation time for each step and disk space requirement, as input files can be released incrementally. Incremental compaction is not applied to reshape yet, so we need to avoid "major compaction", to avoid the space overhead. But space overhead is not the only problem, the inefficiency, when deciding what to reshape when overlapping is detected, motivated this patch. Fixes #12495. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12496	2023-01-17 09:55:15 +02:00
Michał Chojnowski	9e17564c70	types: add some missing explicit instantiations Some functions defined by a template in types.cc are used in other translation units (via `cql3/untyped_result_set.hh`), but aren't explicitly instantiated. Therefore their linking can fail, depending on inlining decisions. (I experienced this when playing with compiler options). Fix that. Closes #12539	2023-01-17 10:46:01 +02:00
Nadav Har'El	5bf94ae220	cql: allow disabling of USING TIMESTAMP sanity checking As requested by issue #5619, commit `2150c0f7a2` added a sanity check for USING TIMESTAMP - the number specified in the timestamp must not be more than 3 days into the future (when viewed as a number of microseconds since the epoch). This sanity checking helps avoid some annoying client-side bugs and mis-configurations, but some users genuinely want to use arbitrary or futuristic-looking timestamps and are hindered by this sanity check (which Cassandra doesn't have, by the way). So in this patch we add a new configuration option, restrict_future_timestamp. If set to "true", futuristic timestamps (more than 3 days into the future) are forbidden. The "true" setting is the default (as has been the case since #5619). Setting this option to "false" will allow using any 64-bit integer as a timestamp, as is allowed in Cassandra (and was allowed in Scylla prior to #5619). The error message in the case where a futuristic timestamp is rejected now mentions the configuration parameter that can be used to disable this check (this, and the option's name "restrict_*", is similar to other so-called "safe mode" options). This patch also includes a test, which works in Scylla and Cassandra, with either setting of restrict_future_timestamp, checking the right thing in all these cases (the futuristic timestamp can either be written and read, or can't be written). I used this test to manually verify that the new option works, defaults to "true", and when set to "false" Scylla behaves like Cassandra. Fixes #12527 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12537	2023-01-16 23:18:56 +02:00
Kefu Chai	114f30016a	main: use std::shift_left() to consume tool name for better readability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12536	2023-01-16 21:01:34 +02:00
Nadav Har'El	feef3f9dda	test/cql-pytest: test more than one restriction on same clustering column Cassandra refuses a request with more than one relation to the same clustering column, for example DELETE FROM tbl WHERE p = ? and c = ? AND c > ? complains that c cannot be restricted by more than one relation if it includes an Equal But it produces different error messages for different operators and even order. Currently, Scylla doesn't consider such requests an error. Whether or not we should be compatible with Cassandra here is discussed in issue #12472. But as long as we do accept these queries, we should be sure we do the right thing: "WHERE c = 1 AND c > 2" should match nothing, "WHERE c = 1 AND c > 0" should match the matches of c = 1, and so on. This patch adds a test to verify that these requests indeed yield correct results. The test is scylla_only because, as explained above, Cassandra doesn't support these requests at all. Refs #12472 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12498	2023-01-16 20:41:16 +02:00
Kefu Chai	86b451d45c	SCYLLA-VERSION-GEN: remove unnecessary bashism remove unnecessary bashism, so that this script can be interpreted by a POSIX shell. /bin/sh is specified in the shebang line. on debian derivatives, /bin/sh is dash, which is POSIX compliant. but this script is written in the bash dialect. before this change, we could run into following build failure when building the tree on Debian: [7/904] ./SCYLLA-VERSION-GEN ./SCYLLA-VERSION-GEN: 37: [[: not found after this change, the build is able to proceed. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12530	2023-01-16 20:34:01 +02:00
Avi Kivity	0b418fa7cf	cql3, transport, tests: remove "unset" from value type system The CQL binary protocol introduced "unset" values in version 4 of the protocol. Unset values can be bound to variables, which cause certain CQL fragments to be skipped. For example, the fragment `SET a = :var` will not change the value of `a` if `:var` is bound to an unset value. Unsets, however, are very limited in where they can appear. They can only appear at the top-level of an expression, and any computation done with them is invalid. For example, `SET list_column = [3, :var]` is invalid if `:var` is bound to unset. This causes the code to be littered with checks for unset, and there are plenty of tests dedicated to catching unsets. However, a simpler way is possible - prevent the infiltration of unsets at the point of entry (when evaluating a bind variable expression), and introduce guards to check for the few cases where unsets are allowed. This is what this long patch does. It performs the following: (general) 1. unset is removed from the possible values of cql3::raw_value and cql3::raw_value_view. (external->cql3) 2. query_options is fortified with a vector of booleans, unset_bind_variable_vector, where each boolean corresponds to a bind variable index and is true when it is unset. 3. To avoid churn, two compatibility structs are introduced: cql3::raw_value{,_view}_vector_with_unset, which can be constructed from a std::vector<raw_value{,_view/}>, which is what most callers have. They can also be constructed with explicit unset vectors, for the few cases they are needed. (cql3->variables) 4. query_options::get_value_at() now throws if the requested bind variable is unset. This replaces all the throwing checks in expression evaluation and statement execution, which are removed. 5. A new query_options::is_unset() is added for the users that can tolerate unset; though it is not used directly. 6. A new cql3::unset_operation_guard class guards against unsets. 
It accepts an expression, and can be queried whether an unset is present. Two conditions are checked: the expression must be a singleton bind variable, and at runtime it must be bound to an unset value. 7. The modification_statement operations are split into two, via two new subclasses of cql3::operation. cql3::operation_no_unset_support ignores unsets completely. cql3::operation_skip_if_unset checks if an operand is unset (luckily all operations have at most one operand that tolerates unset) and applies unset_operation_guard to it. 8. The various sites that accept expressions or operations are modified to check for should_skip_operation(). These are the loops around operations in update_statement and delete_statement, and the checks for unset in attributes (LIMIT and PER PARTITION LIMIT) (tests) 9. Many unset tests are removed. It's now impossible to enter an unset value into the expression evaluation machinery (there's just no unset value), so it's impossible to test for it. 10. Other unset tests now have to be invoked via bind variables, since there's no way to create an unset cql3::expr::constant. 11. Many tests have their exception message match strings relaxed. Since unsets are now checked very early, we don't know the context where they happen. It would be possible to reintroduce it (by adding a format string parameter to cql3::unset_operation_guard), but it seems not to be worth the effort. Usage of unsets is rare, and it is explicit (at least with the Python driver, an unset cannot be introduced by omission). I tried as an alternative to wrap cql3::raw_value{,_view} (that doesn't recognize unsets) with cql3::maybe_unset_value (that does), but that caused huge amounts of churn, so I abandoned that in favor of the current approach. Closes #12517	2023-01-16 21:10:56 +02:00
Marcin Maliszkiewicz	6f055ca5f9	alternator: evaluate expressions as false for stored malformed binary data We'll try to distinguish the case when data comes from the storage rather than from a user request. Such an attribute can be used in expressions, and when it can't be decoded it should make the expression evaluate as false, simply excluding the row during a filter query or scan. Note that this change focuses on the binary type; for other types we may have some inconsistencies in the implementation.	2023-01-16 15:15:27 +01:00
Marcin Maliszkiewicz	bcbaccc143	rjson: avoid copy constructors in from_string calls when possible This function anyway copies the value so no need to do extra copy.	2023-01-16 15:15:26 +01:00
Kamil Braun	7510144fba	Merge 'Add replace-node-first-boot option' from Benny Halevy Allow replacing a node given its Host ID rather than its ip address. This series adds a replace_node_first_boot option to db/config and makes use of it in storage_service. The new option takes priority over the legacy replace_address* options. When the latter are used, a deprecation warning is printed. Documentation updated respectively. And a cql unit_test is added. Ref #12277 Closes #12316 * github.com:scylladb/scylladb: docs: document the new replace_node_first_boot option dist/docker: support --replace-node-first-boot db: config: describe replace_address* options as deprecated test: test_topology: test replace using host_id test: pylib: ServerInfo: add host_id storage_service: get rid of get_replace_address storage_service: is_replacing: rely directly on config options storage_service: pass replacement_info to run_replace_ops storage_service: pass replacement_info to booststrap storage_service: join_token_ring: reuse replacement_info.address storage_service: replacement_info: add replace address init: do not allow cfg.replace_node_first_boot of seed node db: config: add replace_node_first_boot option	2023-01-16 15:08:31 +01:00
Marcin Maliszkiewicz	668fffb6c5	alternator: remove unused parameters from describe_items func	2023-01-16 14:36:23 +01:00
Marcin Maliszkiewicz	86dc1bfdb1	utils: throw error on malformed input in base64 decode We already fixed the case of missing padding but there is also more generic one where input for decode function contains non base64 characters. This is mostly done for alternator purpose, it should discard the request containing such data and return 400 http error. Additionally some harmless integer overflow during integer casting was fixed here. This was attempted to be fixed by `2d33a3f` but since we also implicitly cast to uint8_t the problem persisted.	2023-01-16 14:36:23 +01:00
Marcin Maliszkiewicz	f53c0fd0fc	utils: throw error on missing padding in base64 decode This is done to make alternator behavior more on par with dynamodb. Decode function is used there when processing user requests containing binary item values. We will now discard improperly formed user input with 400 http error. It also makes it more consistent as some of our other base64 functions may have assumed padding is present. The patch should not break other usages of base64 functions as the only one is in db/hints where the code already throws std::runtime_error. Fixes #6487	2023-01-16 14:36:23 +01:00
Michał Sala	bbbe12af43	forward_service: fix timeout support in parallel aggregates `forward_request` verb carried information about timeouts using `lowres_clock::time_point` (that came from local steady clock `seastar::lowres_clock`). The time point was produced on one node and later compared against another node's `lowres_clock`. That behavior was wrong (`lowres_clock::time_point`s produced with different `lowres_clock`s cannot be compared) and could lead to delayed or premature timeout. To fix this issue, `lowres_clock::time_point` was replaced with `lowres_system_clock::time_point` in `forward_request` verb. Representation to which both time point types serialize is the same (64-bit integer denoting the count of elapsed nanoseconds), so it was possible to do an in-place switch of those types using logic suggested by @avikivity: - using steady_clock is just broken, so we aren't taking anything from users by breaking it further - once all nodes are upgraded, it magically starts to work Closes #12529	2023-01-16 12:08:13 +02:00
Botond Dénes	3d9ab1d9eb	Merge 'Get recursive tasks' statuses with task manager api call' from Aleksandra Martyniuk The PR adds an api call allowing to get the statuses of a given task and all its descendants. The parent-child tree is traversed in BFS order and the list of statuses is returned to user. Closes #12317 * github.com:scylladb/scylladb: test: add test checking recursive task status api: get task statuses recursively api: change retrieve_status signature	2023-01-16 11:44:50 +02:00
Botond Dénes	969beebe5f	reader_concurrency_semaphore: wrap wait list The wait list will become two lists soon. To keep callers simple (as if there was still one list) we wrap it with a wrapper which abstracts this away.	2023-01-16 02:05:27 -05:00
Botond Dénes	8658cfc066	reader_concurrency_semaphore: add {serialize,kill}_limit_multiplier parameters Propagate the recently added reader_concurrency_semaphore_{serialize,kill}_limit_multiplier config items to the semaphore. Not used yet.	2023-01-16 02:05:27 -05:00
Botond Dénes	24d4b484f2	test/boost/reader_concurrency_semaphore_test: dummy_file_impl: don't use hardoced buffer size In `dma_read_bulk()`, use the `range_size` passed as parameter and have the callers pass meaningful sizes. We got away with callers passing 0 and using a hard-coded size internally because the tracking file wrapper used the size of the returned buffer as the basis for memory tracking. This will soon not be the case and instead the passed-in size will be used, so this has to be fixed.	2023-01-16 02:05:27 -05:00
Botond Dénes	8b0afc28d4	reader_permit: add make_new_tracked_temporary_buffer() A separate method for callers of make_tracked_temporary_buffer() who are creating new empty tracked buffers of a certain size. make_tracked_temporary_buffer() is about to be changed to be more targeted at callers who call it with pre-consumed memory units.	2023-01-16 02:05:27 -05:00
Botond Dénes	397266f420	reader_permit: add get_state() accessor	2023-01-16 02:05:27 -05:00
Botond Dénes	87e2bf90b9	reader_permit: resource_units: add constructor for already consumed res	2023-01-16 02:05:27 -05:00
Botond Dénes	d2cfc25494	reader_permit: resource_units: remove noexcept qualifier from constructor It won't be noexcept soon. Also make it exception safe.	2023-01-16 02:05:27 -05:00
Botond Dénes	7eb093899a	db/config: introduce reader_concurrency_semaphore_{serialize,kill}_limit_multiplier Will be propagated to reader concurrency semaphores. Not wired in yet.	2023-01-16 02:05:27 -05:00
Botond Dénes	a019dbaa34	scylla-gdb.py: scylla-memory: extract semaphore stats formatting code So it can be shared for the 3 semaphores, instead of repeating the same open-coded method for each of them.	2023-01-16 02:05:27 -05:00
Botond Dénes	15d6d34cfa	scylla-gdb.py: fix spelling of "graphviz"	2023-01-16 02:05:27 -05:00
Tzach Livyatan	073f0f00c6	Add Scylla Summit 2023 in the top banner Closes #12519	2023-01-16 08:05:20 +02:00
Avi Kivity	5a07641b95	Update python3 submodule (license file fix) * tools/python3 548e860...279b6c1 (1): > create-relocatable-package: s/pyhton3-libs/python3-libs/	2023-01-15 17:59:27 +02:00
Benny Halevy	de3142e540	docs: document the new replace_node_first_boot option And mention that replacing a node using the legacy replace_addr* options is deprecated. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:41:44 +02:00
Benny Halevy	d4f1563369	dist/docker: support --replace-node-first-boot And mention that replace_address_first_boot is deprecated Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:36:09 +02:00
Benny Halevy	1577aa8098	db: config: describe replace_address* options as deprecated The replace_address options are still supported But mention in their description that they are now deprecated and the user should use replace_node_first_boot instead. While at it fix a typo in ignore_dead_nodes_for_replace Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:36:09 +02:00
Benny Halevy	90faeedb77	test: test_topology: test replace using host_id Add test cases exercising the --replace-node-first-boot option by replacing nodes using their host_id rather than ip address. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:36:09 +02:00
Benny Halevy	7d0d9e28f1	test: pylib: ServerInfo: add host_id Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:36:07 +02:00
Benny Halevy	db2b76beb5	storage_service: get rid of get_replace_address It is unused now. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:34:29 +02:00
Benny Halevy	17f70e4619	storage_service: is_replacing: rely directly on config options Rather than on get_replace_address, before we remove the latter. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:34:29 +02:00
Benny Halevy	7282d58d11	storage_service: pass replacement_info to run_replace_ops So it won't need to call get_replace_address. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:34:09 +02:00
Benny Halevy	08598e4f64	storage_service: pass replacement_info to booststrap So it won't need to call get_replace_address. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:30:48 +02:00
Benny Halevy	b863f7a75f	storage_service: join_token_ring: reuse replacement_info.address Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:30:48 +02:00
Benny Halevy	add2f209b8	storage_service: replacement_info: add replace address Populate replacement_info.address in prepare_replacement_info as a first step towards getting rid of get_replace_address(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:30:48 +02:00
Benny Halevy	75c8a5addc	init: do not allow cfg.replace_node_first_boot of seed node Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:30:48 +02:00
Benny Halevy	32e79185d4	db: config: add replace_node_first_boot option For replacing a node given its (now unique) Host ID. The existing options for replace_address* will be deprecated in the following patches and eventually we will stop supporting them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-13 18:30:48 +02:00
Tomasz Grabiec	abc43f97c9	Merge 'Simplify some Raft tables' from Kamil Braun Rename `system.raft_config` to `system.raft_snapshot_config` to make it clearer what the table stores. Remove the `my_server_id` partition key column from `system.raft_snapshot_config` and a corresponding column from `system.raft_snapshots` which would store the Raft server ID of the local node. It's unnecessary, all servers running on a given node in different groups will use the same ID - the Raft ID of the node which is equal to its Host ID. There will be no multiple servers running in a single Raft group on the same node. Closes #12513 * github.com:scylladb/scylladb: db: system_keyspace: remove (my_)server_id column from RAFT_SNAPSHOTS and RAFT_SNAPSHOT_CONFIG db: system_keyspace: rename 'raft_config' to 'raft_snapshot_config'	2023-01-13 00:23:21 +01:00
Botond Dénes	4e41e7531c	docs/dev/debugging.md: recommend open-coredump.sh for opening coredumps Leave the guide for manual opening in though, the script might not work in all cases. Also update the version example, we changed what development versions look like. Closes #12511	2023-01-12 19:30:59 +02:00
Botond Dénes	ab8171ffd5	open-coredump.sh: handle dev versions Like: 5.2.0~dev, which really means master. Don't try to checkout branch-5.2 in this case, it doesn't exist yet, checkout master instead. Closes #12510	2023-01-12 19:28:58 +02:00
Kamil Braun	be390285b6	db: system_keyspace: remove (my_)server_id column from RAFT_SNAPSHOTS and RAFT_SNAPSHOT_CONFIG A single node will run a single Raft server in any given Raft group, so this column is not necessary.	2023-01-12 16:48:50 +01:00
Kamil Braun	bed555d1e5	db: system_keyspace: rename 'raft_config' to 'raft_snapshot_config' Make it clear that the table stores the snapshot configuration, which is not necessarily the currently operating configuration (the last one appended to the log). In the future we plan to have a separate virtual table for showing the currently operating configuration, perhaps we will call it `system.raft_config`.	2023-01-12 16:21:26 +01:00
Botond Dénes	f87e3993ef	Merge 'configure.py: a bunch of clean-up changes' from Michał Chojnowski The planned integration of cross-module optimizations in scylladb/scylladb-enterprise requires several changes to `configure.py`. To minimize the divergence between the `configure.py`s of both repositories, this series upstreams some of these changes to scylladb/scylladb. The changes mostly remove dead code and fix some traps for the unaware. Closes #12431 * github.com:scylladb/scylladb: configure.py: prevent deduplication of seastar compile options configure.py: rename clang_inline_threshold() configure.py: rework the seastar_cflags variable configure.py: hoist the pkg_config() call for seastar-testing.pc configure.py: unify the libs variable for tests and non-tests configure.py: fix indentation configure.py: remove a stale code path for .a artifacts	2023-01-12 16:40:02 +02:00
Wojciech Mitros	082bfea187	rust: use depfile and Cargo.lock to avoid building rust when unnecessary Currently, we call cargo build every time we build scylla, even when no rust files have been changed. This is avoided by adding a depfile to the ninja rule for the rust library. The rust file is generated by default during cargo build, but it uses the full paths of all dependencies that it includes, and we use relative paths. This is fixed by specifying CARGO_BUILD_DEP_INFO_BASEDIR='.', which makes it so the current path is subtracted from all generated paths. Instead of using 'always' when specifying when to run the cargo build, a dependency on Cargo.lock is added additionally to the depfile. As a result, the rust files are recompiled not only when the source files included in the depfile are modified, but also when some rust dependency is updated. Cargo may put an old cached file as a result of the build even when the Cargo.lock was recently updated. Because of that, the build result may be older than the Cargo.lock file even if the build was just performed. This may cause ninja to rebuild the file every following time. To avoid this, we 'touch' the build result, so that its last modification time is up to date. Because the dependency on Cargo.lock was added, the new command for the build does not modify it. Instead, the developer must update it when modifying the dependencies - the docs are updated to reflect that. Closes #12489 Fixes #12508	2023-01-12 14:44:11 +02:00
Kefu Chai	77baea2add	docs/architecture: fix typo of SyllaDB s/SyllaDB/ScyllaDB/ Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #12505	2023-01-12 12:25:53 +02:00
Michał Chojnowski	1ff4abef4a	configure.py: prevent deduplication of seastar compile options In its infinite wisdom, CMake deduplicates the options passed to `target_compile_options`, making it impossible to pass options which require duplication, such as -mllvm. Passing e.g. `-mllvm;-pgso=false;-mllvm;-inline-threshold=2500` invokes the compiler with `-mllvm -pgso=false -inline-threshold=2500`, breaking the options. As a workaround, CMake added the `SHELL:` syntax, which makes it possible to pass the list of options not as a CMake list, but as a shell-quoted string. Let's use it, so we can pass multiple -mllvm options.	2023-01-12 11:24:10 +01:00
Michał Chojnowski	85facefe45	configure.py: rename clang_inline_threshold() There's a global variable (the CLI argument) with the same name. Rename one of the two to avoid accidental mixups.	2023-01-12 11:24:10 +01:00
Michał Chojnowski	d9de78f6d3	configure.py: rework the seastar_cflags variable The name of this variable is misleading. What it really does is pass flags to static libraries compiled by us, not just to seastar. We will need this capability to implement cross-artifact optimizations in our build. We will also need to pass linker flags, and we will need to vary those flags depending on the build mode. This patch splits the seastar_cflags variable into per-mode lib_cflags and lib_ldflags variables. It shouldn't change the resulting build.ninja for now, but will be needed by later planned patches.	2023-01-12 11:24:10 +01:00
Michał Chojnowski	ee462a9d3c	configure.py: hoist the pkg_config() call for seastar-testing.pc Put the pkg_config() for seastar-testing.pc in the same area as the call for seastar.pc, outside of the loop. This is a cosmetic change aimed at making following commits cleaner.	2023-01-12 11:24:10 +01:00
Michał Chojnowski	c9aeeeae11	configure.py: unify the libs variable for tests and non-tests This is a cosmetic change aimed at making the following commits in the same area cleaner.	2023-01-12 11:24:09 +01:00
Michał Chojnowski	10ac881ef1	configure.py: fix indentation Fix indentation after the preceding commit.	2023-01-12 11:23:32 +01:00
Michał Chojnowski	be419adaf8	configure.py: remove a stale code path for .a artifacts Scylla hasn't had `.a` artifacts for a long time (since the Urchin days, I believe), and the piece of code responsible for them is stale and untested. Remove it.	2023-01-12 11:22:49 +01:00
Botond Dénes	8a86f8d4ef	gdbinit: add ignore clause for SIG35 Another real-time event often raised in scylla, making debugging a live process annoying. Closes #12507	2023-01-12 12:13:04 +02:00
Avi Kivity	7a8a442c1e	transport: drop some dead code around v1 and v2 protocols In `424dbf43f` ("transport: drop cql protocol versions 1 and 2"), we dropped support for protocols 1 and 2, but some code remains that checks for those versions. It is now dead code, so remove it. Closes #12497	2023-01-12 12:52:19 +02:00
Avi Kivity	4de2524a42	build: update toolchain for scylla-driver package Pull updated scylla-driver package, fixing an IP change related bug [1]. [1] https://github.com/scylladb/python-driver/issues/198 Closes #12501	2023-01-11 22:16:35 +02:00
Nadav Har'El	7192283172	Merge 'doc: add the upgrade guide for ScyllaDB 5.1 to ScyllaDB Enterprise 2022.2' from Anna Stuchlik Fix https://github.com/scylladb/scylladb/issues/12315 This PR adds the upgrade guide from ScyllaDB 5.1 to ScyllaDB Enterprise 2022.2. Instead of adding separate guides per platform, I've merged the information to create one platform-agnostic guide, similar to what we did for [OSS->OSS](https://docs.scylladb.com/stable/upgrade/upgrade-opensource/upgrade-guide-from-5.0-to-5.1/) and [Enterprise->Enterprise ](https://github.com/scylladb/scylladb/pull/12339)guides. Closes #12450 * github.com:scylladb/scylladb: doc: add the new upgrade guide to the toctree and fix its name docs: add the upgrade guide from ScyllaDB 5.1 to ScyllaDB Enterprise 2022.2	2023-01-11 21:01:34 +02:00
Avi Kivity	cb2cb8a606	utils: small_vector: mark throw_out_of_range() const It can be called from the const version of small_vector::at. Closes #12493	2023-01-11 20:58:53 +02:00
Nadav Har'El	04d6402780	docs: cql-extensions.md: explain our NULL handling Our handling of NULLs in expressions is different from Cassandra's, and more uniform. For example, the filter "WHERE x = NULL" is an error in Cassandra, but supported in Scylla. Let's explain how and why. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12494	2023-01-11 20:56:50 +02:00
Wojciech Mitros	95031074a5	configure: fix the order of rust header generation Currently, no rule enforces that the cxx.h rust header is generated before compiling the .cc files generated from rust. This patch adds this dependency. Closes #12492	2023-01-11 16:55:53 +02:00
Botond Dénes	210738c9ce	Merge 'test.py: improve logging' from Kamil Braun Make it easy to see which clusters are operated on by which tests in which build modes and so on. Add some additional logs. These improvements would have saved me a lot of debugging time if I had them last week and we would have https://github.com/scylladb/scylladb/pull/12482 much faster. Closes #12483 * github.com:scylladb/scylladb: test.py: harmonize topology logs with test.py format test/pylib: additional logging during cluster setup test/pylib: prefix cluster/manager logs with the current test name test/pylib: pool: pass args and *kwargs to the build function from get() test.py: include mode in ScyllaClusterManager logs	2023-01-11 16:32:56 +02:00
Aleksandra Martyniuk	fcb3f76e78	test: add test checking recursive task status Rest api test checking whether task manager api returns recursive tasks' statuses properly in BFS order.	2023-01-11 12:34:17 +01:00
Aleksandra Martyniuk	6b79c92cb7	api: get task statuses recursively Sometimes to debug some task manager module, we may want to inspect the whole tree of descendants of some task. To make it easier, an api call getting a list of statuses of the requested task and all its descendants in BFS order is added.	2023-01-11 12:34:06 +01:00
Konstantin Osipov	f3440240ee	test.py: harmonize topology logs with test.py format We need millisecond resolution in the log to be able to correlate test log with test.py log and scylla logs. Harmonize the log format for tests which actively manage scylla servers.	2023-01-11 10:09:42 +01:00
Kamil Braun	79712185d5	test/pylib: additional logging during cluster setup This would have saved me a lot of debugging time.	2023-01-11 10:09:42 +01:00
Kamil Braun	4f7e5ee963	test/pylib: prefix cluster/manager logs with the current test name The log file produced by test.py combines logs coming from multiple concurrent test runs. Each test has its own log file as well, but this "global" log file is useful when debugging problems with topology tests, since many events related to managing clusters are stored there. Make the logs easier to read by including information about the test case that's currently performing operations such as adding new servers to clusters and so on. This includes the mode, test run name and the name of the test case. We do this by using custom `Logger` objects (instead of calling `logging.info` etc. which uses the root logger) with `LoggerAdapter`s that include the prefixes. A bit of boilerplate 'plumbing' through function parameters is required but it's mostly straightforward. This doesn't apply to all events, e.g. boost test cases which don't setup a "real" Scylla cluster. These events don't have additional prefixes. Example: ``` 17:41:43.531 INFO> [dev/topology.test_topology.1] Cluster ScyllaCluster(name: 7a414ffc-903c-11ed-bafb-f4d108a9e4a3, running: ScyllaServer(1, 127.40.246.1, 29c4ec73-8912-45ca-ae19-8bfda701a6b5), ScyllaServer(4, 127.40.246.4, 75ae2afe-ff9b-4760-9e19-cd0ed8d052e7), ScyllaServer(7, 127.40.246.7, 67a27df4-be63-4b4c-a70c-aeac0506304f), stopped: ) adding server... 17:41:43.531 INFO> [dev/topology.test_topology.1] installing Scylla server in /home/kbraun/dev/scylladb/testlog/dev/scylla-10... 17:41:43.603 INFO> [dev/topology.test_topology.1] starting server at host 127.40.246.10 in scylla-10... 17:41:43.614 INFO> [dev/topology.test_topology.2] Cluster ScyllaCluster(name: 7a497fce-903c-11ed-bafb-f4d108a9e4a3, running: ScyllaServer(2, 127.40.246.2, f59d3b1d-efbb-4657-b6d5-3fa9e9ef786e), ScyllaServer(5, 127.40.246.5, 9da16633-ce53-4d32-8687-e6b4d27e71eb), ScyllaServer(9, 127.40.246.9, e60c69cd-212d-413b-8678-dfd476d7faf5), stopped: ) adding server... 
17:41:43.614 INFO> [dev/topology.test_topology.2] installing Scylla server in /home/kbraun/dev/scylladb/testlog/dev/scylla-11... 17:41:43.670 INFO> [dev/topology.test_topology.2] starting server at host 127.40.246.11 in scylla-11... ```	2023-01-11 10:09:39 +01:00
Avi Kivity	de0c31b3b6	cql3: query_options: simplify batch query_options constructor The batch constructor uses an unnecessarily complicated template, where in fact it only needs to accept vector<vector<raw_value \| raw_value_view>>. Simplify the constructor to allow exactly that. Delete some confusing comments around it. Closes #12488	2023-01-11 07:54:54 +02:00
Kamil Braun	2bda0f9830	test/pylib: pool: pass args and *kwargs to the build function from get() This will be used to specify a custom logger when building new clusters before starting tests, allowing to easily pinpoint which tests are waiting for clusters to be built and what's happening to these particular clusters.	2023-01-10 17:41:54 +01:00
Kamil Braun	ff2c030bf9	test.py: include mode in ScyllaClusterManager logs The logs often mention the test run and the current test case in a given run, such as `test_topology.1` and `test_topology.1::test_add_server_add_column`. However, if we run test.py in multiple modes, the different modes might be running the same test case and the logs become confusing. To disambiguate, prefix the test run/case names with the mode name. Example: ``` Leasing Scylla cluster ScyllaCluster(name: 7a414ffc-903c-11ed-bafb-f4d108a9e4a3, running: ScyllaServer(1, 127.40.246.1, 29c4ec73-8912-45ca-ae19-8bfda701a6b5), ScyllaServer(4, 127.40.246.4, 75ae2afe-ff9b-4760-9e19-cd0ed8d052e7), ScyllaServer(7, 127.40.246.7, 67a27df4-be63-4b4c-a70c-aeac0506304f), stopped: ) for test dev/topology.test_topology.1::test_add_server_add_column ```	2023-01-10 17:41:54 +01:00
Wojciech Mitros	e558c7d988	functions: initialize aggregates on scylla start Currently, UDAs can't be reused if Scylla has been restarted since they have been created. This is caused by the missing initialization of saved UDAs that should have inserted them to the cql3::functions::functions::_declared map, that should store all (user-)created functions and aggregates. This patch adds the missing implementation in a way that's analogous to the method of inserting UDF to the _declared map. Fixes #11309	2023-01-10 17:44:18 +02:00
Wojciech Mitros	d1b809754c	database: wrap lambda coroutines used as arguments in coroutine::lambda Using lambda coroutines as arguments can lead to a use-after-free. Currently, the way these lambdas were used in do_parse_schema_tables did not lead to such a problem, but it's better to be safe and wrap them in coroutine::lambda(), so that they can't lead to this problem as long as we ensure that the lambda finishes in the do_parse_schema_tables() statement (for example using co_await). Closes #12487	2023-01-10 17:24:52 +02:00
Nadav Har'El	0edb090c67	test/cql-pytest: add simple tests for SELECT DISTINCT This patch adds a few simple functional tests for the SELECT DISTINCT feature, and how it interacts with other features, especially GROUP BY. 2 of the 5 new tests are marked xfail, and reproduce one old and one newly-discovered issue: Refs #5361: LIMIT doesn't work when using GROUP BY (the test here uses LIMIT and GROUP BY together with SELECT DISTINCT, so the LIMIT isn't honored). Refs #12479: SELECT DISTINCT doesn't refuse GROUP BY with clustering column. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12480	2023-01-10 13:29:26 +02:00
Michał Radwański	dcab289656	boost/mvcc_test: use failure_injecting_allocation_strategy where it is meant to In test_apply_is_atomic, a basic form of exception testing is used. There is failure_injecting_allocation_strategy, which however is not used for any allocation, since for some reason, `with_allocator(r.allocator()` is used instead of `with_allocator(alloc`. Fix that. Closes #12354	2023-01-10 12:01:36 +01:00
Tomasz Grabiec	ebcd736343	cache: Fix undefined behavior when populating with non-full keys Regression introduced in `23e4c8315`. view_and_holder position_in_partiton::after_key() triggers undefined behavior when the key was not full because the holder is moved, which invalidates the view. Fixes #12367 Closes #12447	2023-01-10 12:51:54 +02:00
Jan Ciolek	8d7e35caef	cql3: expr: remove reference to temporary in get_rhs_receiver The function underlying_type() returns a data_type by value, but the code assigned it to a reference. At first I was sure this is an error (assigning temporary value to a reference), but it turns out that this is most likely correct due to C++ lifetime extension rules. I think it's better to avoid such unintuitive tricks. Assigning to value makes it clearer that the code is correct and there are no dangling references. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com> Closes #12485	2023-01-10 09:42:49 +02:00
Raphael "Raph" Carvalho	407c7fdaf2	docs: Fix command to create a symbolic link to relocatable pkg dir Closes #12481	2023-01-10 07:09:14 +02:00
Kamil Braun	822410c49b	test/pylib: scylla_cluster: release IPs when cluster is no longer needed With sufficiently many test cases we would eventually run out of IP addresses, because IPs (which are leased from a global host registry) would only be released at the end of an entire test suite. In fact we already hit this during next promotions, causing much pain indeed. Release IPs when a cluster, after being marked dirty, is stopped and thrown away. Closes #12482	2023-01-10 06:59:41 +02:00
Avi Kivity	e71e1dc964	Merge 'tools/scylla-sstable: add lua scripting support' from Botond Dénes Introduce a new "script" operation, which loads a script from the specified path, then feeds the mutation fragment stream to it. The script can then extract, process and present information from the sstable as it wishes. For now only Lua scripts are supported for the simple reason that Lua is easy to write bindings for, it is simple and lightweight and more importantly we already have Lua included in the Scylla binary as it is used as the implementation language for UDF/UDA. We might consider WASM support in the future, but for now we don't have any language support in WASM available. Example: ```lua function new_stats(key) return { partition_key = key, total = 0, partition = 0, static_row = 0, clustering_row = 0, range_tombstone_change = 0, }; end total_stats = new_stats(nil); function inc_stat(stats, field) stats[field] = stats[field] + 1; stats.total = stats.total + 1; total_stats[field] = total_stats[field] + 1; total_stats.total = total_stats.total + 1; end function on_new_sstable(sst) max_partition_stats = new_stats(nil); if sst then current_sst_filename = sst.filename; else current_sst_filename = nil; end end function consume_partition_start(ps) current_partition_stats = new_stats(ps.key); inc_stat(current_partition_stats, "partition"); end function consume_static_row(sr) inc_stat(current_partition_stats, "static_row"); end function consume_clustering_row(cr) inc_stat(current_partition_stats, "clustering_row"); end function consume_range_tombstone_change(crt) inc_stat(current_partition_stats, "range_tombstone_change"); end function consume_partition_end() if current_partition_stats.total > max_partition_stats.total then max_partition_stats = current_partition_stats; end end function on_end_of_sstable() if current_sst_filename then print(string.format("Stats for sstable %s:", current_sst_filename)); else print("Stats for stream:"); end print(string.format("\t%d 
fragments in %d partitions - %d static rows, %d clustering rows and %d range tombstone changes", total_stats.total, total_stats.partition, total_stats.static_row, total_stats.clustering_row, total_stats.range_tombstone_change)); print(string.format("\tPartition with max number of fragments (%d): %s - %d static rows, %d clustering rows and %d range tombstone changes", max_partition_stats.total, max_partition_stats.partition_key, max_partition_stats.static_row, max_partition_stats.clustering_row, max_partition_stats.range_tombstone_change)); end ``` Running this script will yield the following: ``` $ scylla sstable script --script-file fragment-stats.lua --system-schema system_schema.columns /var/lib/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/me-1-big-Data.db Stats for sstable /var/lib/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f//me-1-big-Data.db: 397 fragments in 7 partitions - 0 static rows, 362 clustering rows and 28 range tombstone changes Partition with max number of fragments (180): system - 0 static rows, 179 clustering rows and 0 range tombstone changes ``` Fixes: https://github.com/scylladb/scylladb/issues/9679 Closes #11649 * github.com:scylladb/scylladb: tools/scylla-sstable: consume_reader(): improve pause heuristincs test/cql-pytest/test_tools.py: add test for scylla-sstable script tools: add scylla-sstable-scripts directory tools/scylla-sstable: remove custom operation tools/scylla-sstable: add script operation tools/sstable: introduce the Lua sstable consumer dht/i_partitioner.hh: ring_position_ext: add weight() accessor lang/lua: export Scylla <-> lua type conversion methods lang/lua: use correct lib name for string lib lang/lua: fix type in aligned_used_data (meant to be user_data) lang/lua: use lua_State* in Scylla type <-> Lua type conversions tools/sstable_consumer: more consistent method naming tools/scylla-sstable: extract sstable_consumer interface into own header tools/json_writer: add accessor to 
underlying writer tools/scylla-sstable: fix indentation tools/scylla-sstable: export mutation_fragment_json_writer declaration tools/scylla-sstable: mutation_fragment_json_writer un-implement sstable_consumer tools/scylla-sstable: extract json writing logic from json_dumper tools/scylla-sstable: extract json_writer into its own header tools/scylla-sstable: use json_writer::DataKey() to write all keys tools/scylla-types: fix use-after-free on main lambda captures	2023-01-09 20:54:42 +02:00
Raphael S. Carvalho	05ffb024bb	replica: Kill table::calculate_shard_from_sstable_generation() Inferring shard from generation is long gone. We still use it in some scripts, but that's no longer needed in Scylla, when loading the SSTables, and it also conflicts with ongoing work of UUID-based generations. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12476	2023-01-09 20:17:57 +02:00
Takuya ASADA	548c9e36a1	main: add tcp_timestamps sanity check Check net.ipv4.tcp_timestamps, show warning message when it's not set to 1. Fixes #12144 Closes #12199	2023-01-09 19:08:21 +02:00
Nadav Har'El	d6e6820f33	Merge 'Drop support for cql binary protocols versions 1 and 2' from Avi Kivity The CQL binary protocol version 3 was introduced in 2014. All Scylla versions support it, as do Cassandra versions 2.1 and newer. Versions 1 and 2 have 16-bit collection sizes, while protocol 3 and newer use 32-bit collection sizes. Unfortunately, we implemented support for multiple serialization formats very intrusively, by pushing the format everywhere. This avoids the need to re-serialize (sometimes) but is quite obnoxious. It's also likely to be broken, since it's almost untested and it's too easy to write cql_serialization_format::internal() instead of propagating the client specified value. Since protocols 1 and 2 have been obsolete for 9 years, just drop them. It's easy to verify that they are no longer in use on a running system by examining the `system.clients` table before upgrade. Fixes #10607 Closes #12432 * github.com:scylladb/scylladb: treewide: drop cql_serialization_format cql: modification_statement: drop protocol check for LWT transport: drop cql protocol versions 1 and 2	2023-01-09 18:52:41 +02:00
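The size difference described in this merge can be sketched in a few lines. This is only an illustration in Python, not Scylla's serializer: the helper name `serialize_list` is hypothetical, and it assumes that the element count and each element length are written as big-endian unsigned 16-bit values in protocol versions 1 and 2 and as signed 32-bit values from version 3 on.

```python
import struct


def serialize_list(elements, protocol_version):
    """Serialize a list of byte strings (hypothetical helper, for illustration)."""
    # Assumption: v1/v2 use 16-bit sizes, v3 and newer use 32-bit sizes.
    fmt = ">H" if protocol_version < 3 else ">i"
    out = [struct.pack(fmt, len(elements))]   # element count
    for element in elements:
        out.append(struct.pack(fmt, len(element)))  # per-element length
        out.append(element)
    return b"".join(out)


old = serialize_list([b"ab"], protocol_version=2)
new = serialize_list([b"ab"], protocol_version=3)
print(old.hex(), new.hex())
```

Supporting both layouts means every serialization path has to know which format the client asked for, which is the intrusiveness the merge removes.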
Botond Dénes	bd42da6e69	tools/scylla-sstable: consume_reader(): improve pause heuristincs The consume loop had some heuristics in place to determine whether, after pausing, the consumer wishes to skip just the partition or the remaining content of the sstable. These heuristics were flawed, so replace them with a non-heuristic method: track the last consumed fragment and look at it to determine what should be done.	2023-01-09 09:46:57 -05:00
Botond Dénes	1d222220e0	test/cql-pytest/test_tools.py: add test for scylla-sstable script To test the script operation, we use some of the example scripts from the example directory. Namely, dump.lua and slice.lua. These two scripts together have a very good coverage of the entire script API. Testing their functionality therefore also provides a good coverage of the lua bindings. A further advantage is that since both scripts dump output in identical format to that of the data-dump operation, it is trivial to do a comparison against this already tested operation. A targeted test is written for the sstable skip functionality of the consumer API.	2023-01-09 09:46:57 -05:00
Botond Dénes	ace42202df	tools: add scylla-sstable-scripts directory To be the home of example scripts for scylla-sstable. For now only a README.md is added describing the directory's purpose and with links to useful resources. One example script is added in this patch, more will come later.	2023-01-09 09:46:57 -05:00
Botond Dénes	7b40463f29	tools/scylla-sstable: remove custom operation We now have a script operation, the custom operation (poor man's script operation) has no reason to exist anymore.	2023-01-09 09:46:57 -05:00
Botond Dénes	e5071fdeab	tools/scylla-sstable: add script operation Loads the script from the specified path, then feeds the mutation fragment stream to it. For now only Lua scripts are supported for the simple reason that Lua is easy to write bindings for, it is simple and lightweight and more importantly we already have Lua included in the Scylla binary as it is used as the implementation language for UDF/UDA. We might consider WASM support in the future, but for now we don't have any language support in WASM available.	2023-01-09 09:46:57 -05:00
Botond Dénes	9dd5107919	tools/sstable: introduce the Lua sstable consumer The Lua sstable consumer loads a script from the specified path then feeds the mutation fragment stream to the script via the sstable_consumer methods, each method of which the script is allowed to define, effectively overloading the virtual method in Lua. This allows for very wide and flexible customization opportunities for what to extract from sstables and how to process and present them, without the need to recompile the scylla-sstable tool.	2023-01-09 09:46:57 -05:00
Botond Dénes	50b155e706	dht/i_partitioner.hh: ring_position_ext: add weight() accessor	2023-01-09 09:46:57 -05:00
Botond Dénes	8699fe5001	lang/lua: export Scylla <-> lua type conversion methods Currently hidden in lang/lua.cc, declare these in a header so others can use it.	2023-01-09 09:46:57 -05:00
Botond Dénes	e9a52837cf	lang/lua: use correct lib name for string lib AFAIK the mistake had no real consequence, but still it is nicer to have it correct.	2023-01-09 09:46:57 -05:00
Botond Dénes	76663d7774	lang/lua: fix type in aligned_used_data (meant to be user_data)	2023-01-09 09:46:57 -05:00
Botond Dénes	943fc3b6f3	lang/lua: use lua_State* in Scylla type <-> Lua type conversions Instead of the lua_slice_state which is local to this file. We want to reuse the Scylla type <-> Lua type conversion functions but for that they have to use the more generic lua_State*. No functionality or convenience is lost with the switch, the code didn't make use of the other fields bundled in lua_slice_state.	2023-01-09 09:46:57 -05:00
Botond Dénes	8045751867	tools/sstable_consumer: more consistent method naming Use `consume_` consistently across the entire interface, instead of having some methods with `on_` and others with `consume_` prefixes.	2023-01-09 09:46:57 -05:00
Botond Dénes	8e117501ac	tools/scylla-sstable: extract sstable_consumer interface into own header So it can be used in code outside scylla-sstable.cc. This source file is quite large already, and as we have yet another large chunk of code to add, we want to add it in a separate file.	2023-01-09 09:46:57 -05:00
Botond Dénes	9b1c486051	tools/json_writer: add accessor to underlying writer	2023-01-09 09:46:57 -05:00
Botond Dénes	cfb5afbe9b	tools/scylla-sstable: fix indentation Left broken by previous patches.	2023-01-09 09:46:57 -05:00
Botond Dénes	d42b0bb5d5	tools/scylla-sstable: export mutation_fragment_json_writer declaration To json_writer.hh. Method definitions are left in scylla-sstable.cc. Indentation is left broken, will be fixed by the next patch.	2023-01-09 09:46:57 -05:00
Botond Dénes	517135e155	tools/scylla-sstable: mutation_fragment_json_writer un-implement sstable_consumer There is no point in the former implementing said interface. For one it is a futurized interface, which is not needed for something writing to the stdout. Rename the methods to follow the naming convention of rjson writers more closely.	2023-01-09 09:46:57 -05:00
Botond Dénes	0ee1c6ca57	tools/scylla-sstable: extract json writing logic from json_dumper We want to split this class into two parts: one with the actual logic converting mutation fragments to json, and a wrapper over this one, which implements the sstable_consumer interface. As a first step we extract the class as is (no changes) and just forward all calls from the now-empty wrapper to it.	2023-01-09 09:46:57 -05:00
Botond Dénes	55ef0ed421	tools/scylla-sstable: extract json_writer into its own header Other source files will want to use it soon.	2023-01-09 09:46:57 -05:00
Botond Dénes	8623818a8d	tools/scylla-sstable: use json_writer::DataKey() to write all keys This method was renamed from its previous name of PartitionKey. Since in json partition keys and clustering keys look alike, with the only difference being that the former may also have a token, it makes sense to have a single method to write them (with an optional token parameter). This was the case at some point, json_dumper::write_key() taking this role. However at a later point, json_writer::PartitionKey() was introduced and now the code uses both. Standardize on the latter and give it a more generic name.	2023-01-09 09:46:57 -05:00
Botond Dénes	602fca0a12	tools/scylla-types: fix use-after-free on main lambda captures The main lambda of scylla-types, the one passed to app_template::run(), was recently made a coroutine. app_template::run() however doesn't keep this lambda alive and hence after the first suspension point, accessing the lambda's captures triggers use-after-free. The simple fix is to convert the coroutine into a continuation chain.	2023-01-09 09:46:57 -05:00
Tomasz Grabiec	f97268d8f2	row_cache: Fix violation of the "oldest version are evicted first" when evicting last dummy Consider the following MVCC state of a partition: v2: ==== <7> [entry2] ==== <9> ===== <last dummy> v1: ================================ <last dummy> [entry1] Where === means a continuous range and --- means a discontinuous range. After two LRU items are evicted (entry1 and entry2), we will end up with: v2: ---------------------- <9> ===== <last dummy> v1: ================================ <last dummy> [entry1] This will cause readers to incorrectly think there are no rows before entry <9>, because the range is continuous in v1, and continuity of a snapshot is a union of continuous intervals in all versions. The cursor will see the interval before <9> as continuous and the reader will produce no rows. This is only temporary, because current MVCC merging rules are such that the flag on the latest entry wins, so we'll end up with this once v1 is no longer needed: v2: ---------------------- <9> ===== <last dummy> ...and the reader will go to sstables to fetch the evicted rows before entry <9>, as expected. The bug is in rows_entry::on_evicted(), which treats the last dummy entry in a special way, and doesn't evict it, and doesn't clear the continuity by omission. The situation is not easy to trigger because it requires certain eviction pattern concurrent with multiple reads of the same partition in different versions, so across memtable flushes. Closes #12452	2023-01-09 16:10:52 +02:00
Avi Kivity	1bb1855757	Merge 'replica/database: fix read related metrics' from Botond Dénes Sstable read related metrics have been broken for a long time now. First, the introduction of inactive reads (https://github.com/scylladb/scylladb/issues/1865) diluted this metric, as it now also contained inactive reads (contrary to the metric's name). Then, after moving the semaphore in front of the cache (`3d816b7c1`) this metric became completely broken as this metric now contains all kinds of reads: disk, in-memory and inactive ones too. This series aims to remedy this: * `scylla_database_active_reads` is fixed to only include active reads. * `scylla_database_active_reads_memory_consumption` is renamed to `scylla_database_reads_memory_consumption` and its description is brought up-to-date. * `scylla_database_disk_reads` is added to track current reads that have gone to disk. * `scylla_database_sstables_read` is added to track the number of sstables currently being read. Fixes: https://github.com/scylladb/scylladb/issues/10065 Closes #12437 * github.com:scylladb/scylladb: replica/database: add disk_reads and sstables_read metrics sstables: wire in the reader_permit's sstable read count tracking reader_concurrency_semaphore: add disk_reads and sstables_read stats replica/database: fix active_reads_memory_consumption_metric replica/database: fix active_reads metric	2023-01-09 12:18:49 +02:00
Pavel Emelyanov	e20738cd7d	azure_snitch: Handle empty zone returned from IMDS Azure metadata API may return empty zone sometimes. If that happens shard-0 gets empty string as its rack, but propagates UNKNOWN_RACK to other shards. Empty zones response should be handled regardless. refs: #12185 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12274	2023-01-09 11:57:45 +02:00
Nadav Har'El	2d845b6244	test/cql-pytest: a test for more than one equality in WHERE Cassandra refuses a request with more than one equality relation to the same column, for example DELETE FROM tbl WHERE partitionKey = ? AND partitionKey = ? It complains that partitionkey cannot be restricted by more than one relation if it includes an Equal Currently, Scylla doesn't consider such requests an error. Whether or not we should be compatible with Cassandra here is discussed in issue #12472. But as long as we do accept this query, we should be sure we do the right thing: "WHERE p = 1 AND p = 2" should match nothing (not the first, or last, value being tested..), and "WHERE p = 1 AND p = 1" should match the matches of p = 1. This patch adds a test to verify that these requests indeed yield correct results. The test is scylla_only because, as explained above, Cassandra doesn't support this feature at all. Refs #12472 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12473	2023-01-09 11:56:39 +02:00
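The expected semantics in this commit message amount to treating the equalities as a conjunction. A minimal Python model of that expectation follows; it is not the actual cql-pytest test, and the `select` helper and the in-memory rows are hypothetical.

```python
def select(rows, column, equalities):
    """Return the rows for which every `column = value` relation holds."""
    # A row matches only if it satisfies all equalities at the same time.
    return [row for row in rows if all(row[column] == v for v in equalities)]


rows = [{"p": 1, "v": "a"}, {"p": 2, "v": "b"}]

print(select(rows, "p", [1, 2]))  # conflicting equalities: nothing matches
print(select(rows, "p", [1, 1]))  # repeated equality: same as p = 1
```

Under this model, picking the first or the last value would be wrong for `p = 1 AND p = 2`, which is exactly what the test guards against.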
Anna Stuchlik	b61515c871	doc: replace Scylla with ScyllaDB on the menu tree and major links; related: https://github.com/scylladb/scylla-docs/issues/3962 Closes #12456	2023-01-09 08:39:50 +02:00
Avi Kivity	42575340ba	Update seastar submodule * seastar ca586cfb8d...8889cbc198 (14): > http: request_parser: fix grammar ambiguity in field_content Fixes #12468 > sstring: use fold expression to simply copy_str_to() > sstring: use fold expression to simply str_len() > metrics: capture by move in make_function() > metrics: replace homebrew is_callable<> with is_invocable_v<> > reactor: use std::move() to avoid copy. > reactor: remove redundant semicolon. > reactor: use mutable to make std::move() work. > build: install liburing explicitly on ArchLinux. > reactor: use a for loop for submitting ios > metrics: add spaces around '=' > parallel utils: align concept with implementation > reactor: s/resize(0)/clear()/ > reactor: fix a typo in comment Closes #12469	2023-01-08 18:56:00 +02:00
Alejo Sanchez	d632e1aa7a	test/pytest: add missing import, remove unused import Add missed import time and remove unused name import. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12446	2023-01-08 17:38:46 +02:00
Avi Kivity	5ffe4fee6d	Merge 'Remove legacy half reverse' from Michał Radwański This commit removes consume_in_reverse::legacy_half_reverse, an option once used to indicate that the given key ranges are sorted descending, based on the clustering key of the start of the range, and that the range tombstones inside partition would be sorted (descending, as all the mutation fragments would) according to their end (but range tombstone would still be stored according to their start bound). As it turns out, mutation::consume, when called with legacy_half_reverse option produces invalid fragment stream, one where all the row tombstone changes come after all the clustering rows. This was not an issue, since when constructing results from the query, Scylla would not pass the tombstones to the client, but instead compact data beforehand. In this commit, the consume_in_reverse::legacy_half_reverse is removed, along with all the uses. As for the swap out in mutation_partition.cc in query_mutation and to_data_query_result: The downstream was not prepared to deal with legacy_half_reverse. mutation::consume contains ``` if (reverse == consume_in_reverse::yes) { while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::yes>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) { co_await yield(); } } else { while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::no>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) { co_await yield(); } } ``` So why did it work at all? to_data_query_result deals with a single slice. The used consumer (compact_for_query_v2) compacts-away the range tombstone changes, and thus the only difference between the consume_in_reverse::no and consume_in_reverse::yes was that one was ordered increasing wrt. ckeys and the second one was ordered decreasing. This property is maintained if we swap out for the consume_in_reverse::yes format. 
Refs: #12353 Closes #12453 * github.com:scylladb/scylladb: mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse mutation_partition_view: treat query::partition_slice::option::reversed in to_data_query_result as consume_in_reverse::yes mutation: move consume_in_reverse def to mutation_consumer.hh	2023-01-08 15:42:00 +02:00
Botond Dénes	c4688563e3	sstables: track decompressed buffers Convert decompressed temporary buffers into tracked buffers just before returning them to the upper layer. This ensures these buffers are known to the reader concurrency semaphore and it has an accurate view of the actual memory consumption of reads. Fixes: #12448 Closes #12454	2023-01-08 15:34:28 +02:00
Kamil Braun	b77df84543	test: test_topology: make test_nodes_with_different_smp less hacky The test would use a trick to start a separate Scylla cluster from the one provided originally by the test framework. This is not supported by the test framework and may cause unexpected problems. Change the test to perform regular node operations. Instead of starting a fresh cluster of 3 nodes, we join the first of these nodes to the original framework-provided cluster, then decommission the original nodes, then bootstrap the other 2 fresh nodes. Also add some logging to the test. Refs: #12438, #12442 Closes #12457	2023-01-08 15:33:17 +02:00
Avi Kivity	02c9968e73	Merge 'Add WASM UDF implementation in Rust' from Wojciech Mitros This series adds the implementation and usage of rust wasmtime bindings. The WASM UDFs introduced by this patch are interruptible and use memory allocated using the seastar allocator. This series includes #11102 (the first two commits) because #11102 required disabling wasm UDFs completely. This patch disables them in the middle of the series, and enables them again at the end. After this patch, `libwasmtime.a` can be removed from the toolchain. This patch also removes the workaround for https://github.com/scylladb/scylladb/issues/9387 but it hasn't been tested with ARM yet - if the ARM test causes issues I'll revert this part of the change. Closes #11351 * github.com:scylladb/scylladb: build: remove references to unused c bindings of wasmtime test: assert that WASM allocations can fail without crashing wasm: limit memory allocated using mmap wasm: add configuration options for instance cache and udf execution test: check that wasmtime functions yield wasm: use the new rust bindings of wasmtime rust: add Wasmtime bindings rust: add build profiles more aligned with ninja modes rust: adjust build according to cxxbridge's recommendations tools: toolchain: dbuild: prepare for sharing cargo cache	2023-01-08 15:31:09 +02:00
Nadav Har'El	f5cda3cfc3	test/cql-pytest: add more tests for "timestamp" column type In issue #3668, a discussion spanning several years theorized that several things are wrong with the "timestamp" type. This patch begins by adding several tests that demonstrate that Scylla is in fact behaving correctly, and mostly identically to Cassandra except one esoteric error handling case. However, after eliminating the red herrings, we are left with the real issue that prompted opening #3668, which is a duplicate of issues #2693 and #2694, and this patch also adds a reproducer for that. The issue is that Cassandra 4 added support for arithmetic expressions on values, and durations can be added to or subtracted from timestamps, for example: '2011-02-03 04:05:12.345+0000' - 1d is a valid timestamp - and we don't currently support this syntax. So the new test - which passes on Cassandra 4 and fails on Scylla (or Cassandra 3) - is marked xfail. Refs #2693 Refs #2694 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12436	2023-01-08 15:00:49 +02:00
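For readers unfamiliar with the arithmetic in question, the example from this message can be reproduced with Python's standard library. This is only a sketch of the expected value, assuming the `1d` duration is treated as exactly 24 hours in UTC; it says nothing about how Cassandra or Scylla evaluate the expression internally.

```python
from datetime import datetime, timedelta, timezone

# '2011-02-03 04:05:12.345+0000' from the commit message.
ts = datetime(2011, 2, 3, 4, 5, 12, 345000, tzinfo=timezone.utc)

# Assumption: the CQL duration `1d` corresponds to timedelta(days=1) here.
result = ts - timedelta(days=1)

print(result.isoformat(timespec="milliseconds"))
```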
Michał Chojnowski	08b3a9c786	configure: don't reduce parsers' optimization level to 1 in release The line modified in this patch was supposed to increase the optimization levels of parsers in debug mode to 1, because they were too slow otherwise. But as a side effect, it also reduced the optimization level in release mode to 1. This is not a problem for the CQL frontend, because statement preparation is not performance-sensitive, but it is a serious performance problem for Alternator, where it lies in the hot path. Fix this by only applying the -O1 to debug modes. Fixes #12463 Closes #12460	2023-01-06 18:04:36 +02:00
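The bug and its fix can be modeled as follows. This is a hedged sketch, not the code in configure.py: the mode table, the optimization levels in it, and both function names are hypothetical and chosen only to show the difference between forcing level 1 everywhere and raising it only for debug modes.

```python
# Hypothetical per-mode optimization levels; the real values live in configure.py.
MODE_OPT_LEVEL = {"debug": 0, "release": 3}
DEBUG_MODES = {"debug"}


def buggy_parser_opt_level(mode):
    """Old behavior as described: parsers always get level 1."""
    return 1


def fixed_parser_opt_level(mode):
    """Raise parsers to at least level 1 in debug modes only."""
    level = MODE_OPT_LEVEL[mode]
    if mode in DEBUG_MODES:
        return max(level, 1)
    return level  # release keeps its own, higher level


for mode in MODE_OPT_LEVEL:
    print(mode, buggy_parser_opt_level(mode), fixed_parser_opt_level(mode))
```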
Wojciech Mitros	903c4874d0	build: remove references to unused c bindings of wasmtime Before the changes introducing the new wasmtime bindings we relied on a downloaded static library libwasmtime.a. Now that the bindings are introduced, we do not rely on it anymore, so all references to it can be removed.	2023-01-06 14:07:29 +01:00
Wojciech Mitros	996a942e05	test: assert that WASM allocations can fail without crashing The main source of big allocations in the WASM UDF implementation is the WASM Linear Memory. We do not want Scylla to crash even if a memory allocation for the WASM Memory fails, so we assert that an exception is thrown instead. The wasmtime runtime does not actually fail on an allocation failure (assuming the memory allocator does not abort and returns nullptr instead - which our seastar allocator does). What happens then depends on the failed allocation handling of the code that was compiled to WASM. If the original code threw an exception or aborted, the resulting WASM code will trap. To make sure that we can handle the trap, we need to allow wasmtime to handle SIGILL signals, because that is what is used to carry information about WASM traps. The new test uses a special WASM Memory allocator that fails after n allocations, and the allocations include both memory growth instructions in WASM, as well as growing memory manually using the wasmtime API. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:07:29 +01:00
Wojciech Mitros	f05d612da8	wasm: limit memory allocated using mmap The wasmtime runtime allocates memory for the executable code of the WASM programs using mmap and not the seastar allocator. As a result, the memory that Scylla actually uses becomes not only the memory preallocated for the seastar allocator but the sum of that and the memory allocated for executable codes by the WASM runtime. To keep limiting the memory used by Scylla, we measure how much memory the WASM programs use and if they use too much, compiled WASM UDFs (modules) that are currently not in use are evicted to make room. To evict a module it is required to evict all instances of this module (the underlying implementation of modules and instances uses shared pointers to the executable code). For this reason, we add reference counts to modules. Each instance using a module is a reference. When an instance is destroyed, a reference is removed. If all references to a module are removed, the executable code for this module is deallocated. The eviction of a module is actually achieved by eviction of all its references. When we want to free memory for a new module we repeatedly evict instances from the wasm_instance_cache using its LRU strategy until some module loses all its instances. This process may not succeed if the instances currently in use (so not in the cache) use too much memory - in this case the query also fails. Otherwise the new module is added to the tracking system. This strategy may evict some instances unnecessarily, but evicting modules should not happen frequently, and any more efficient solution requires an even bigger intervention into the code.	2023-01-06 14:07:29 +01:00
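The eviction strategy described in this commit can be sketched as a small Python model. It is an illustration only, not the Scylla implementation: the class `WasmModuleTracker`, its method names, and the use of `MemoryError` to signal a failed query are all hypothetical.

```python
from collections import OrderedDict


class WasmModuleTracker:
    """Hypothetical tracker of memory used by compiled modules."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self.sizes = {}            # module -> size of its executable code
        self.refs = {}             # module -> number of live instances
        self.idle = OrderedDict()  # cached (idle) instance -> module, LRU first

    def add_instance(self, instance, module, idle=True):
        # Every instance is one reference to its module.
        self.refs[module] += 1
        if idle:
            self.idle[instance] = module

    def _drop_instance_ref(self, module):
        self.refs[module] -= 1
        if self.refs[module] == 0:
            # Last instance gone: the module's code is deallocated.
            self.used -= self.sizes.pop(module)
            del self.refs[module]

    def track_module(self, module, size):
        # Evict idle instances in LRU order until the new module fits.
        while self.used + size > self.limit:
            if not self.idle:
                raise MemoryError("instances in use hold too much memory")
            _, victim_module = self.idle.popitem(last=False)
            self._drop_instance_ref(victim_module)
        self.sizes[module] = size
        self.refs[module] = 0
        self.used += size
```

As the message notes, this approach may evict instances of modules that end up surviving, trading some wasted work for a simpler tracking scheme.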
Wojciech Mitros	b8d28a95bf	wasm: add configuration options for instance cache and udf execution Different users may require different limits for their UDFs. This patch allows them to configure the size of their cache of wasm, the maximum size of individual instances stored in the cache, the time after which the instances are evicted, the fuel that all wasm UDFs are allowed to consume before yielding (for the control of latency), the fuel that wasm UDFs are allowed to consume in total (to allow performing longer computations in the UDF without detecting an infinite loop) and the hard limit of the size of UDFs that are executed (to avoid large allocations).	2023-01-06 14:07:27 +01:00
Wojciech Mitros	3214f5c2db	test: check that wasmtime functions yield The new implementation for WASM UDFs allows executing the UDFs in pieces. This commit adds a test asserting that the UDF is in fact divided and that each of the execution segments takes no longer than 1ms.	2023-01-06 14:05:53 +01:00
Wojciech Mitros	3146807192	wasm: use the new rust bindings of wasmtime This patch replaces all dependencies on the wasmtime C++ bindings with our new ones. The wasmtime.hh and wasm_engine.hh files are deleted. The libwasmtime.a library is no longer required by configure.py. The SCYLLA_ENABLE_WASMTIME macro is removed and wasm udfs are now compiled by default on all architectures. In terms of implementation, most of code using wasmtime was moved to the Rust source files. The remaining code uses names from the new bindings (which are mostly unchanged). Most of wasmtime objects are now stored as a rust::Box<>, to make it compatible with rust lifetime requirements. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:05:53 +01:00
Wojciech Mitros	50b24cf036	rust: add Wasmtime bindings The C++ bindings provided by wasmtime are lacking a crucial capability: asynchronous execution of the wasm functions. This forces us to stop the execution of the function after a short time to prevent increasing the latency. Fortunately, this feature is implemented in the native language of Wasmtime - Rust. Support for Rust was recently added to scylla, so we can implement the async bindings ourselves, which is done in this patch. The bindings expose all the objects necessary for creating and calling wasm functions. The majority of code implemented in Rust is a translation of code that was previously present in C++. Types exported from Rust are currently required to be defined by the same crate that contains the bridge using them, so wasmtime types can't be exported directly. Instead, for each class that was supposed to be exported, a wrapper type is created, where its first member is the wasmtime class. Note that the members are not visible from C++ anyway, the difference only applies to Rust code. Aside from wasmtime types and methods, two additional types are exported with some associated methods. - The first one is ValVec, which is a wrapper for a rust Vec of wasmtime Vals. The underlying vector is required by wasmtime methods for calling wasm functions. By having it exported we avoid multiple conversions from a Val wrapper to a wasmtime Val, as would be required if we exported a rust Vec of Val wrappers (the rust Vec itself does not require wrappers if the type it contains is already wrapped) - The second one is Fut. This class represents a computation that may or may not be ready. We're currently using it to control the execution of wasm functions from C++. This class exposes one method: resume(), which returns a bool that signals whether the computation is finished or not. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2023-01-06 14:05:53 +01:00
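The `resume()` contract described for Fut can be illustrated with a Python generator. This is a loose analogy, not the Rust binding: the `Fut` class below, the `sum_to` workload, and the convention that `resume()` returns True once the computation has finished are assumptions made for the sketch.

```python
class Fut:
    """Hypothetical stand-in for the exported Fut wrapper."""

    def __init__(self, steps):
        self._steps = steps  # a generator that yields between slices of work
        self.result = None

    def resume(self):
        """Run one slice; return True once the computation has finished."""
        try:
            next(self._steps)
            return False
        except StopIteration as done:
            self.result = done.value
            return True


def sum_to(n, slice_size):
    """Sum 1..n, handing control back every `slice_size` iterations."""
    total = 0
    for i in range(1, n + 1):
        total += i
        if i % slice_size == 0:
            yield  # give control back to the caller
    return total


fut = Fut(sum_to(10, slice_size=4))
resumes = 1
while not fut.resume():
    resumes += 1  # a real caller would yield to its scheduler here

print(fut.result, resumes)
```

The caller decides what happens between two `resume()` calls, which is how a long-running function can be interleaved with other work.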
Wojciech Mitros	33c97de25c	rust: add build profiles more aligned with ninja modes A cargo profile is created for each of the build modes: dev, debug, sanitize, release and coverage. The names of cargo profiles are prefixed by "rust-" because cargo does not allow separate "dev" and "debug" profiles. The main difference between the profiles is their optimization levels, which correlate to the levels used in configure.py. The debug info is stripped only in the dev mode, and only this mode uses "incremental" compilation to speed it up.	2023-01-06 14:05:53 +01:00
Wojciech Mitros	4d7858e66d	rust: adjust build according to cxxbridge's recommendations Currently, the rust build system in Scylla creates a separate static library for each included rust package. This could cause duplicate symbol issues when linking against multiple libraries compiled from rust. This issue is fixed in this patch by creating a single static library to link against, which combines all rust packages implemented in Scylla. The Cargo.lock for the combined build is now tracked, so that all users of the same scylla version also use the same versions of imported rust modules. Additionally, the rust package implementation and usage docs are modified to be compatible with the build changes. This patch also adds a new header file 'rust/cxx.hh' that contains definitions of additional rust types available in c++.	2023-01-06 14:05:53 +01:00
Avi Kivity	eeaa475de9	tools: toolchain: dbuild: prepare for sharing cargo cache Rust's cargo caches downloaded sources in ~/.cargo. However dbuild won't provide access to this directory since it's outside the source directory. Prepare for sharing the cargo cache between the host and the dbuild environment by: - Creating the cache if it doesn't already exist. This is likely if the user only builds in a dbuild environment. - Propagating the cache directory as a mounted volume. - Respecting the CARGO_HOME override.	2023-01-06 14:05:53 +01:00
Avi Kivity	6868dcf30b	tools: toolchain: drop s390x from prepare script architecture list It's been a long while since we built ScyllaDB for s390x, and in fact the last time I checked it was broken on the ragel parser generator generating bad source files for the HTTP parser. So just drop it from the list. I kept s390x in the architecture mapping table since it's still valid. Closes #12455	2023-01-06 09:08:01 +02:00
Michał Radwański	1fbf433966	mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse This commit removes consume_in_reverse::legacy_half_reverse, an option once used to indicate that the given key ranges are sorted descending, based on the clustering key of the start of the range, and that the range tombstones inside partition would be sorted (descending, as all the mutation fragments would) according to their end (but range tombstone would still be stored according to their start bound). As it turns out, mutation::consume, when called with legacy_half_reverse option produces invalid fragment stream, one where all the row tombstone changes come after all the clustering rows. This was not an issue, since when constructing results from the query, Scylla would not pass the tombstones to the client, but instead compact data beforehand. In this commit, the consume_in_reverse::legacy_half_reverse is removed, along with all the uses. As for the swap out in mutation_partition.cc in query_mutation and to_data_query_result: The downstream was not prepared to deal with legacy_half_reverse. mutation::consume contains ``` if (reverse == consume_in_reverse::yes) { while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::yes>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) { co_await yield(); } } else { while (!(stop_opt = consume_clustering_fragments<consume_in_reverse::no>(_ptr->_schema, partition, consumer, cookie, is_preemptible::yes))) { co_await yield(); } } ``` So why did it work at all? to_data_query_result deals with a single slice. The used consumer (compact_for_query_v2) compacts-away the range tombstone changes, and thus the only difference between the consume_in_reverse::no and consume_in_reverse::yes was that one was ordered increasing wrt. ckeys and the second one was ordered decreasing. This property is maintained if we swap out for the consume_in_reverse::yes format.	2023-01-05 18:48:55 +01:00
Botond Dénes	2612f98a6c	Merge 'Abort repair tasks' from Aleksandra Martyniuk Aborting of repair operation is fully managed by task manager. Repair tasks are aborted: - on shutdown; top level repair tasks subscribe to global abort source. On shutdown all tasks are aborted recursively - through node operations (applies to data_sync_repair_task_impls and their descendants only); data_sync_repair_task_impl subscribes to node_ops_info abort source - with task manager api (top level tasks are abortable) - with storage_service api and on failure; these cases were modified to be aborted the same way as the ones from above are. Closes #12085 * github.com:scylladb/scylladb: repair: make top level repair tasks abortable repair: unify a way of aborting repair operations repair: delete sharded abort source from node_ops_info repair: delete unused node_ops_info from data_sync_repair_task_impl repair: delete redundant abort subscription from shard_repair_task_impl repair: add abort subscription to data sync task tasks: abort tasks on system shutdown	2023-01-05 15:21:35 +01:00
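The recursive abort on shutdown mentioned in this merge can be pictured with a small task tree. The Python below is a simplified model under stated assumptions: `Task`, `add_child`, and `shutdown` are hypothetical names, and the real task manager uses abort sources and subscriptions rather than direct method calls.

```python
class Task:
    """Hypothetical task node; not the task manager's real class."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.aborted = False

    def add_child(self, child):
        self.children.append(child)
        return child

    def abort(self):
        # Abort this task, then propagate to all descendants.
        if self.aborted:
            return
        self.aborted = True
        for child in self.children:
            child.abort()


def shutdown(top_level_tasks):
    """Model of the global abort source firing on shutdown."""
    for task in top_level_tasks:
        task.abort()


repair = Task("repair")
shard = repair.add_child(Task("shard_repair"))
range_task = shard.add_child(Task("range_repair"))
unrelated = Task("unrelated")

shutdown([repair])
print(repair.aborted, shard.aborted, range_task.aborted, unrelated.aborted)
```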
Avi Kivity	cc6010b512	Merge 'Make restore_replica_count abortable' from Benny Halevy Similar to the way we allow aborting streaming-based removenode, subscribe to storage_service::_abort_source to request abort locally and pass a shared_ptr<abort_source> to `node_ops_info`, used to abort removenode_with_repair on shutdown. Fixes #12429 Closes #12430 * github.com:scylladb/scylladb: storage_service: restore_replica_count: demote status_checker related logging to debug level storage_service: restore_replica_count: allow aborting removenode_with_repair storage_service: coroutinize restore_replica_count storage_service: restore_replica_count: undefer stop_status_checker storage_service: restore_replica_count: handle exceptions from stream_async and send_replication_notification storage_service: restore_replica_count: coroutinize status_checker	2023-01-05 15:21:35 +01:00
Kamil Braun	09da661eeb	Merge 'raft: replace experimental raft option with dedicated flag' from Gleb Natapov Unlike other experimental features, we want raft to be opt-in even after it leaves experimental mode. For that we need to have a separate option to enable it. The patch adds the binary option "consistent-cluster-management" for that. * 'consistent-cluster-management-flag' of github.com:scylladb/scylla-dev: raft: replace experimental raft option with dedicated flag main: move supervisor notification about group registry start where it actually starts	2023-01-05 15:21:35 +01:00
Anna Stuchlik	44e6f18d1b	doc: add the new upgrade guide to the toctree and fix its name	2023-01-05 14:13:33 +01:00
Anna Stuchlik	0ad2e3e63a	docs: add the upgrade guide from ScyllaDB 5.1 to ScyllaDB Enterprise 2022.2	2023-01-05 13:30:10 +01:00
Aleksandra Martyniuk	dcb91457da	api: change retrieve_status signature Sometimes we may need task status to be nothrow move constructible. httpd::task_manager_json::task_status does not satisfy this requirement. retrieve_status returns future<full_task_status> instead of future<task_status> to provide an intermediate struct with better properties. An argument is passed by reference to prevent the necessity to copy foreign_ptr.	2023-01-05 13:28:51 +01:00
Kamil Braun	df72536fc5	Merge 'docs: add the upgrade guide for Enterprise from 2022.1 to 2022.2' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/12314 This PR adds the upgrade guide for ScyllaDB Enterprise - from version 2022.1 to 2022.2. Using this opportunity, I've replaced "Scylla" with "ScyllaDB" in the upgrade-enterprise index file. In previous releases, we added several upgrade guides - one per platform (and version). In this PR, I've merged the information for different platforms to create one generic upgrade guide. It is similar to what @kbr- added for the Open Source upgrade guide from 5.0 to 5.1. See https://docs.scylladb.com/stable/upgrade/upgrade-opensource/upgrade-guide-from-5.0-to-5.1/. Closes #12339 * github.com:scylladb/scylladb: docs: add the info about minor release docs: add the new upgade guide 2022.1 to 2022.2 to the index and the toctree docs: add the index file for the new upgrage guide from 2022.1 to 2022.2 docs: add the metrics update file to the upgrade guide 2022.1 to 2022.2 docs: add the upgrade guide for ScyllaDB Enterprise from 2022.1 to 2022.2	2023-01-04 18:07:00 +01:00
Benny Halevy	086546f575	storage_service: restore_replica_count: demote status_checker related logging to debug level The status_checker is not the main line of business of restore_replica_count; starting and stopping it do not seem to deserve info-level logging, which might have been useful in the past to debug issues surrounding that. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	3879ee1db8	storage_service: restore_replica_count: allow aborting removenode_with_repair Similar to the way we allow aborting streaming-based removenode, subscribe to storage_service::_abort_source to request abort locally and pass a shared_ptr<abort_source> to `node_ops_info`, used to abort removenode_with_repair on shutdown. Fixes #12429 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	afece5bdc4	storage_service: coroutinize restore_replica_count and unwrap the async thread started for streaming. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	d1eadc39c1	storage_service: restore_replica_count: undefer stop_status_checker Now that all exceptions in the rest of the function are swallowed, just execute the stop_status_checker deferred action serially before returning, on the way to coroutinizing restore_replica_count (since we can't co_await status_checker inside the deferred action). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:05:04 +02:00
Benny Halevy	788ecb738d	storage_service: restore_replica_count: handle exceptions from stream_async and send_replication_notification On the way to coroutinizing restore_replica_count, extract awaiting stream_async and send_replication_notification into try/catch blocks so we can later undefer stop_status_checker. The exception is still returned as an exceptional future which is logged by the caller as warning. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:02:42 +02:00
Benny Halevy	b54d121dfd	storage_service: restore_replica_count: coroutinize status_checker There is no need to start a thread for the status_checker; it can be implemented using a background coroutine. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-01-04 19:02:20 +02:00
Botond Dénes	1d273a98b9	readers/multishard: shard_reader::close() silence read-ahead timeouts Timeouts are benign, especially on a read-ahead that turned out to be not needed at all. They just introduce noise in the logs, so silence them. Fixes: #12435 Closes #12441	2023-01-04 16:10:09 +02:00
Anna Stuchlik	9216b657c8	doc: fix the version in the comment on removing the note	2023-01-04 14:01:33 +01:00
Kamil Braun	4268b1bbc2	Merge 'raft: raft_group0, register RPC verbs on all shards' from Gusev Petr raft_group0 used to register RPC verbs only on shard 0. This worked on clusters with the same --smp setting on all nodes, since RPCs in this case are processed on the same shard as the calling code, and raft_group0 methods only run on shard 0. A new test test_nodes_with_different_smp was added to identify the problem. Since --smp can only be specified via the command line, a corresponding parameter was added to the ManagerClient.server_add method. It allows to override the default parameters set by the SCYLLA_CMDLINE_OPTIONS variable by changing, adding or deleting individual items. Fixes: #12252 Closes #12374 * github.com:scylladb/scylladb: raft: raft_group0, register RPC verbs on all shards raft: raft_append_entries, copy entries to the target shard test.py, allow to specify the node's command line in test	2023-01-04 11:11:21 +01:00
Marcin Maliszkiewicz	61a9816bad	utils/rjson: enable inlining in rapidjson library Due to lack of NDEBUG macro inlining was disabled. It's important for parsing and printing performance. Testing with perf_simple_query shows that it reduced around 7000 insns/op, thus increasing median tps by 4.2% for the alternator frontend. Because inlined functions are called for every character in json this scales with request/response size. When default write size is increased by around 7x (from ~180 to ~ 1255 bytes) then the median tps increased by 12%. Running: ./build/release/test/perf/perf_simple_query_g --smp 1 \ --alternator forbid --default-log-level error \ --random-seed=1235000092 --duration=60 --write Results before the patch: median 46011.50 tps (197.1 allocs/op, 12.1 tasks/op, 170989 insns/op, 0 errors) median absolute deviation: 296.05 maximum: 46548.07 minimum: 42955.49 Results after the patch: median 47974.79 tps (197.1 allocs/op, 12.1 tasks/op, 163723 insns/op, 0 errors) median absolute deviation: 303.06 maximum: 48517.53 minimum: 44083.74 The change affects both json parsing and printing. Closes #12440	2023-01-04 10:27:35 +02:00
Michał Jadwiszczak	83bb77b8bb	test/boost/cql_query_test: enable `parallelized_aggregation` Run tests for parallelized aggregation with `enable_parallelized_aggregation` set always to true, so the tests work even if the default value of the option is false. Closes #12409	2023-01-04 10:11:25 +02:00
Anna Stuchlik	c4d779e447	doc: Fix https://github.com/scylladb/scylla-doc-issues/issues/854 - update the procedure to update topology strategy when nodes are on different racks Closes #12439	2023-01-04 09:50:10 +02:00
Avi Kivity	2739ac66ed	treewide: drop cql_serialization_format Now that we don't accept cql protocol version 1 or 2, we can drop cql_serialization_format everywhere, except in the IDL (since it's part of the inter-node protocol). A few functions had duplicate versions, one with and one without a cql_serialization_format parameter. They are deduplicated. Care is taken that `partition_slice`, which communicates the cql_serialization_format across nodes, still presents a valid cql_serialization_format to other nodes when transmitting itself and rejects protocol 1 and 2 serialization format when receiving. The IDL is unchanged. One test checking the 16-bit serialization format is removed.	2023-01-03 19:54:13 +02:00
Avi Kivity	654b96660a	cql: modification_statement: drop protocol check for LWT CQL protocol 1 did not support LWT, but since we don't support it any more, we can drop the check and the supporting get_protocol_version() helper.	2023-01-03 19:51:57 +02:00
Avi Kivity	424dbf43f3	transport: drop cql protocol versions 1 and 2 Version 3 was introduced in 2014 (Cassandra 2.1) and was supported in the very first version of Scylla (`2a7da21481` "CQL binary protocol"). Cassandra 3.0 (2015) dropped protocols 1 and 2 as well. It's safe enough to drop it now, 9 years after introduction of v3 and 7 years after Cassandra stopped supporting it. Dropping it allows dropping cql_serialization_format, which causes quite a lot of pain, and is probably broken. This will be dropped in the following patch.	2023-01-03 19:47:49 +02:00
Avi Kivity	f600ad5c1b	Update seastar submodule * seastar 3db15b5681...ca586cfb8d (28): > reactor: trim returned buffer to received number of bytes > util/process: include used header > build: drop unused target_include_directories() > build: use BUILD_IN_SOURCE instead chdir <SOURCE_DIR> > build: specify CMake policy CMP0135 to new > tests: only destroy allocated pending connections > build: silence the output when generating private keys > tests, httpd: Limit loopback connection factory sharding > lw_shared_ptr: Add nullptr_t comparing operators > noncopyable_function: Add concept for (Func func) constructor > reactor: add process::terminate() and process::kill() > Merge 'tests, include: include headers without ".." in path' from Kefu Chai > build: customize toolset for building Boost > build: use different toolset base on specified compiler > allocator: add an option to reserve additional memory for the OS > Merge 'build: pass cflags and ldflags to cooking.sh' from Kefu Chai > build: build static library of cryptopp > gate: add gate holders debugging > build: detect debug build of yaml-cpp also > build: do not use pkg_search_module(IMPORTED_TARGET) for finding yaml-cpp > build: bump yaml-cpp to 0.7.0 in cooking_recipe > build: bump cryptopp to 8.7.0 in cooking_recipe > build: bump boost to 1.81.0 in cooking_recipe > build: bump fmtlib to 9.1.0 in cooking_recipe > shared_ptr: add overloads for fmt::ptr() > chunked_fifo: const_iterator: use the base class ctor > build: s/URING_LIBARIES/URING_LIBRARIES/ > build: export the full path of uring with URING_LIBRARIES Closes #12434	2023-01-03 17:58:31 +02:00
Alejo Sanchez	889acf710c	test/python: increase CQL connection timeout for... test_ssl In very slow debug builds the default driver timeouts are too low and tests might fail. Bump up the values to a more reasonable time. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12408	2023-01-03 17:10:46 +02:00
Nadav Har'El	1c96d2134f	docs,alternator: link to issue about missing ACL feature The alternator compatibility.md document mentions the missing ACL (access control) feature, but unlike other missing features we forgot to link to the open issue about this missing feature. So let's add that link. Refs #5047. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12399	2023-01-03 16:50:33 +02:00
Kamil Braun	fc57626afa	Merge 'docs: remove auto_bootstrap option from the documentation' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/12318 This PR removes all occurrences of the `auto_bootstrap` option in the docs. In most cases, I've simply removed the option name and its definition, but sometimes additional changes were necessary: - In node-joined-without-any-data.rst, I removed the `auto_bootstrap` option as one of the causes of the problem. - In rebuild-node.rst, I removed the first step in the procedure (enabling the `auto_bootstrap` option). - In admin.rst, I removed the section about manual bootstrapping - it's based on setting `auto_bootstrap` to false, which is not possible now. Closes #12419 * github.com:scylladb/scylladb: docs: remove the auto_bootstrap option from the admin procedures - involves removing the Manual Bootstraping section docs: remove the auto_bootstrap option from the procedure to replace a dead node docs: remove the auto_bootstrap option from the Troubleshooting article about a node joining with no data docs: remove the auto_bootstrap option from the procedure to rebuild a node after losing the data volume docs: remove the auto_bootstrap option from the procedures to create a cluster or add a DC	2023-01-03 15:44:00 +01:00
Botond Dénes	e4d5b2a373	replica/database: add disk_reads and sstables_read metrics Tracking the current number of reads gone to disk and the current number of sstables read by all such reads respectively.	2023-01-03 09:37:29 -05:00
Botond Dénes	2acfa950d7	sstables: wire in the reader_permit's sstable read count tracking Hook in the relevant methods when creating and destroying sstable readers.	2023-01-03 09:37:29 -05:00
Botond Dénes	2c0de50969	reader_concurrency_semaphore: add disk_reads and sstables_read stats And the infrastructure to reader_permit to update them. The infrastructure is not wired in yet. These metrics will be used to count the number of reads gone to disk and the number of sstables read currently respectively.	2023-01-03 09:37:29 -05:00
Botond Dénes	dcd2deb5af	replica/database: fix active_reads_memory_consumption_metric Rename to reads_memory_consumption and drop the "active" from the description as well. This metric tracks the memory consumption of all reads: active or inactive. We don't even currently have a way to track the memory consumption of only active reads. Drop the part of the description which explains the interaction with other metrics: this part is outdated and the new interactions are much more complicated, no way to explain in a metric description. Also ask the semaphore to calculate the memory amount, instead of doing it in the metric itself.	2023-01-03 09:25:47 -05:00
Petr Gusev	8417840647	raft: raft_group0, register RPC verbs on all shards raft_group0 used to register RPC verbs only on shard 0. This worked on clusters with the same --smp setting on all nodes, since RPCs in this case are (usually) processed on the same shard as the calling code, and raft_group0 methods only run on shard 0. A new test test_nodes_with_different_smp was added to identify the problem. Fixes: #12252	2023-01-03 17:04:07 +03:00
Anna Stuchlik	00ef20c3df	docs: remove the auto_bootstrap option from the admin procedures - involves removing the Manual Bootstraping section	2023-01-03 14:48:01 +01:00
Anna Stuchlik	b7d62b2fc7	docs: remove the auto_bootstrap option from the procedure to replace a dead node	2023-01-03 14:47:55 +01:00
Anna Stuchlik	bc62e61df1	docs: remove the auto_bootstrap option from the Troubleshooting article about a node joining with no data	2023-01-03 14:46:38 +01:00
Anna Stuchlik	1602f27cd7	docs: remove the auto_bootstrap option from the procedure to rebuild a node after losing the data volume	2023-01-03 14:45:08 +01:00
Botond Dénes	929481ea9c	replica/database: fix active_reads metric This metric has been broken for a long time, since inactive reads were introduced. As calculated currently, it includes all permits that passed admission, including inactive reads. On the other hand, it excludes permits created bypassing admission. Fix by using the newly introduced (in this patch) reader_concurrency_semaphore::active_reads() as the basis of this metric: this now includes all permits (reads) that are currently active, excluding waiters and inactive reads.	2023-01-03 08:12:25 -05:00
Petr Gusev	7725e03a09	raft: raft_append_entries, copy entries to the target shard If append_entries RPC was received on a non-zero shard, we may need to pass it to a zero (or, potentially, some other) shard. The problem is that raft::append_request contains entries in the form of raft::log_entry_ptr == lw_shared_ptr<log_entry>, which doesn't support cross-shard reference counting. In debug mode it contains a special ref-counting facility debug_shared_ptr_counter_type, which resorts to on_internal_error if it detects such a case. To solve this, we just copy log entries to the target shard if it isn't equal to the current one. In most cases, if --smp setting is the same on all nodes, RPC will be handled on zero shard, so there will be no overhead.	2023-01-03 15:25:00 +03:00
Petr Gusev	1c23390f12	test.py, allow to specify the node's command line in test An optional parameter cmdline has been added to the ManagerClient.server_add method. It allows you to override the default parameters set by the SCYLLA_CMDLINE_OPTIONS variable by changing, adding or deleting individual items. To change or add a parameter just specify its name and value one after the other. To remove parameter use the special keyword __remove__ as a value. To set a parameter without a value (such as --overprovisioned) use the special keyword __missing__ as the value.	2023-01-03 15:24:54 +03:00
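The override rules described in the commit above (pairs of name and value, with `__remove__` and `__missing__` as special values) can be illustrated with a minimal sketch. This is not the test.py implementation: the function name `merge_cmdline`, the dict-based representation of the defaults, and the option `--example-option` are assumptions made for illustration only.

```python
# Hypothetical sketch of the override semantics described in the commit
# message; this is NOT the actual test.py code.
REMOVE = "__remove__"    # special value: drop the option entirely
MISSING = "__missing__"  # special value: keep the option, but without a value


def merge_cmdline(defaults, overrides):
    """Merge default options with overrides given as a flat name/value list.

    `defaults` maps an option name to its value, or to None for a bare flag
    (assumed representation). `overrides` lists names and values one after
    the other, as the commit message describes.
    """
    if len(overrides) % 2 != 0:
        raise ValueError("overrides must be name/value pairs")
    merged = dict(defaults)
    for name, value in zip(overrides[::2], overrides[1::2]):
        if value == REMOVE:
            merged.pop(name, None)   # delete the option if present
        elif value == MISSING:
            merged[name] = None      # option without a value
        else:
            merged[name] = value     # change or add the option
    # Flatten back into an argv-style list.
    result = []
    for name, value in merged.items():
        result.append(name)
        if value is not None:
            result.append(value)
    return result


if __name__ == "__main__":
    defaults = {"--smp": "2", "--example-option": "x"}  # --example-option is made up
    overrides = ["--smp", "4",
                 "--example-option", REMOVE,
                 "--overprovisioned", MISSING]
    print(merge_cmdline(defaults, overrides))
```

Keeping the defaults as a mapping makes each of the three operations (change, add, delete) a single dictionary update; how the real harness parses SCYLLA_CMDLINE_OPTIONS is not stated in the commit message.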
Nadav Har'El	eb85f136c8	cql-pytest: document how to write new cql-pytest tests Add to test/cql-pytest/README.md an explanation of the philosophy of the cql-pytest test suite, and some guidelines on how to write good tests in that framework. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12400	2023-01-03 12:13:22 +02:00
Anna Stuchlik	994bc33147	docs: fix the command on the Manager-Monitoring Integration troubleshooting page Closes #12375	2023-01-03 11:41:16 +02:00
Anna Stuchlik	9d17d812c0	docs: Fix https://github.com/scylladb/scylla-doc-issues/issues/870 , update the nodetool rebuild command Closes #12416	2023-01-03 11:40:40 +02:00
Gleb Natapov	1688163233	raft: replace experimental raft option with dedicated flag Unlike other experimental features, we want raft to be optional even after it leaves experimental mode. For that we need to have a separate option to enable it. The patch adds the binary option "consistent-cluster-management" for that.	2023-01-03 11:15:11 +02:00
Gleb Natapov	29060cc235	main: move supervisor notification about group registry start where it actually starts `99fe580068` moved raft_group_registry::start call a bit later, but forgot to move the supervisor notification call. Do it now.	2023-01-03 11:09:30 +02:00
Botond Dénes	2ef71e9c70	Merge 'Improve verbosity of task manager api' from Aleksandra Martyniuk The PR introduces changes to task manager api: - extends tasks' list returned with get_tasks with task type, keyspace, table, entity, and sequence number - extends status returned with get_task_status and wait_task with a list of children's ids Closes #12338 * github.com:scylladb/scylladb: api: extend status in task manager api api: extend get_tasks in task manager api	2023-01-03 10:39:41 +02:00
Botond Dénes	82101b786d	Merge 'docs: document scylla-api-client' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/11999. This PR adds a description of scylla-api-cli. Closes #12392 * github.com:scylladb/scylladb: docs: fix the description of the system log POST example docs: uptate the curl tool name docs: describe how to use the scylla-api-client tool docs: fix the scylla-api-client tool name docs: document scylla-api-cli	2023-01-03 10:30:04 +02:00
Benny Halevy	63c2cdafe8	sstables: index_reader: close(index_bound&) reset current_list When closing _lower_bound and *_upper_bound in the final close() call, they are currently left with an engaged current_list member. If the index_reader uses a _local_index_cache, it is evicted with evict_gently which will, rightfully, see the respective pages as referenced, and they won't be evicted gently (only later when the index_reader is destroyed). Reset index_bound.current_list on close(index_bound&) to free up the reference. Ref #12271 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12370	2023-01-02 16:42:33 +01:00
Avi Kivity	767b7be8be	Merge 'Get rid of handle_state_replacing' from Benny Halevy Since [repair: Always use run_replace_ops](`2ec1f719de`), nodes no longer publish HIBERNATE state so we don't need to support handling it. Replace is now always done using node operations (using repair or streaming). so nodes are never expected to change status to HIBERNATE. Therefore storage_service:handle_state_replacing is not needed anymore. This series gets rid of it and updates documentation related to STATUS:HIBERNATE respectively. Fixes #12330 Closes #12349 * github.com:scylladb/scylladb: docs: replace-dead-node: get rid of hibernate status storage_service: get rid of handle_state_replacing	2023-01-02 13:35:29 +02:00
Gleb Natapov	28952d32ff	storage_service: move leave_ring outside of unbootstrap() We want to reuse the latter without the call. Message-Id: <20221228144944.3299711-17-gleb@scylladb.com>	2023-01-02 12:03:29 +02:00
Gleb Natapov	229cef136d	raft: add trace logging to raft::server::start Allows to see initial state of the server during start. Message-Id: <20221228144944.3299711-15-gleb@scylladb.com>	2023-01-02 11:57:53 +02:00
Gleb Natapov	96453ff75f	service: raft: improve group0_state_machine::apply logging Trace how many entries are applied as well. Message-Id: <20221228144944.3299711-14-gleb@scylladb.com>	2023-01-02 11:57:16 +02:00
Gleb Natapov	dbd5b97201	storage_service: improve logging in update_pending_ranges() function We pass the reason for the change. Log it as well. Message-Id: <20221228144944.3299711-11-gleb@scylladb.com>	2023-01-02 11:54:03 +02:00
Gleb Natapov	04ab673359	messaging: check that a node knows its own topology before accessing it We already check if the remote node's topology is missing before creating a connection, but the local node's topology can be missing too once we use raft to manage it. Raft needs to be able to create connections before the topology is known. Message-Id: <20221228144944.3299711-7-gleb@scylladb.com>	2023-01-02 11:53:14 +02:00
Gleb Natapov	6f104982e1	topology: use std::erase_if on std::map instead of ad-hoc loop There is std::erase_if since c++20. We can use it here. Message-Id: <20221228144944.3299711-6-gleb@scylladb.com>	2023-01-02 11:45:52 +02:00
Gleb Natapov	84eb5924ac	system_keyspace: remove redundant include storage_proxy.hh is included twice Message-Id: <20221228144944.3299711-4-gleb@scylladb.com>	2023-01-02 11:39:22 +02:00
Gleb Natapov	5182543df2	raft: fix typo in read_barrier logging The log logs applied index not append one. Message-Id: <20221228144944.3299711-3-gleb@scylladb.com>	2023-01-02 11:38:47 +02:00
Gleb Natapov	5a96751534	storage_service: remove start_leaving since it is no longer used Message-Id: <20221228144944.3299711-2-gleb@scylladb.com>	2023-01-02 11:37:48 +02:00
Raphael S. Carvalho	b4e4bbd64a	database_test: Reduce x_log2_compaction_group values to avoid timeout database_test is timing out because it has to run the tests calling do_with_cql_env_and_compaction_groups 3x, once for each compaction group setting. Reduce it to 2 settings instead of 3 if running in debug mode. Refs #12396. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12421	2023-01-01 13:56:18 +02:00
Raphael S. Carvalho' via ScyllaDB development	a7c4a129cb	sstables: Bump row_reads metrics for mx version The metric was always 0 even though rows were processed by the mx reader. Fixes #12406. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20221227220202.295790-1-raphaelsc@scylladb.com>	2022-12-30 18:38:30 +01:00
Anna Stuchlik	601aeb924a	docs: remove the auto_bootstrap option from the procedures to create a cluster or add a DC	2022-12-30 13:10:06 +01:00
Anna Stuchlik	705b347d36	doc: extend the information about the recommended RF on the Tracing page	2022-12-30 11:30:20 +01:00
Avi Kivity	8635d24424	build: drop abseil submodule, replace with distribution abseil This lets us carry fewer things and rely on the distribution for maintenance. The frozen toolchain is updated. Incidental updates include clang 15.0.6, and pytest that doesn't need workarounds. Closes #12397	2022-12-28 19:02:23 +02:00
Avi Kivity	eced91b575	Revert "view: coroutinize maybe_mark_view_as_built" This reverts commit `ac2e2f8883`. It causes a regression ("std::bad_variant_access in load_view_build_progress"). Commit `2978052113` (a reindent) is also reverted as part of the process. Fixes #12395	2022-12-28 15:36:05 +02:00
Anna Stuchlik	6d70665185	doc: extend the information on removing an unavailable node	2022-12-28 13:19:58 +01:00
Anna Stuchlik	f95c6423c1	docs: extend the warning on the Remove a Node page	2022-12-28 13:16:36 +01:00
Nadav Har'El	200bc82913	test/cql-pytest: exit immediately if Scylla is down In commit `acfa180766` we added to test/cql-pytest a mechanism to detect when Scylla crashes in the middle of a test function - in which case we report the culprit test and exit immediately to avoid having a hundred more tests report that they failed as well just because Scylla was down. However, if Scylla was never up - e.g., if the user ran "pytest" without ever running Scylla - we still report hundreds of tests as having failed, which is confusing and not helpful. So with this patch, if a connection cannot be made to Scylla at all, the test exits immediately, explaining what went wrong, not blaming any specific test: $ pytest ... ! _pytest.outcomes.Exit: Cannot connect to Scylla at --host=localhost --port=9042 ! ============================ no tests ran in 0.55s ============================= Beyond being a helpful reminder for a developer who runs "pytest" without having started Scylla first (or using test/cql-pytest/run or test.py to start Scylla easily), this patch is also important when running tests through test.py if it reuses an instance of Scylla that crashed during an earlier pytest file's run. This patch does not fix test.py - it can still try to run pytest with a dead Scylla server without checking. But at least with this patch pytest will notice this problem immediately and won't report hundreds of test functions having failed. The only report the user will see will be the last test which crashed Scylla, which will make it easier to find this failure without being hidden between hundreds of spurious failures. Fixes #12360 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12401	2022-12-28 13:04:28 +02:00
Anna Stuchlik	d0db1a27c3	docs: fix the description of the system log POST example	2022-12-28 11:25:54 +01:00
Anna Stuchlik	b7ec99b10b	docs: uptate the curl tool name	2022-12-28 10:33:07 +01:00
Asias He	b9e5e340aa	streaming: Enable offstrategy for all classic streaming based node ops This patch enables offstrategy compaction for all classic streaming based node ops. We can use this method because tables are streamed one after another. As long as there is still streamed data for a given table, we update the automatic trigger timer. When all the streaming has finished, the trigger timer will time out and fire the offstrategy compaction for the given table. I checked that, with this patch, rebuild is about 3X faster. There was no compaction in the middle of the streaming. The streamed sstables are compacted together after streaming is done. Time Before: INFO 2022-11-25 10:06:08,213 [shard 0] range_streamer - Rebuild succeeded, took 67 seconds, nr_ranges_remaining=0 Time After: INFO 2022-11-25 09:42:50,943 [shard 0] range_streamer - Rebuild succeeded, took 23 seconds, nr_ranges_remaining=0 Compaction Before: 88 sstables were written -> 88 sstables were added into main set Compaction After: 88 sstables written -> after offstrategy 2 sstables were added into main set Closes #11848	2022-12-28 11:12:02 +02:00
Michał Chojnowski	5e79d6b30b	tasks: task_manager: move invoke_on_task<> to .hh invoke_on_task is used in translation units where its definition is not visible, yet it has no explicit instantiations. If the compiler always decides to inline the definition, not to instantiate it implicitly, linking invoke_on_task will fail. (It happened to me when I turned up inline-threshold). Fix that. Closes #12387	2022-12-28 10:55:43 +02:00
Alejo Sanchez	d408b711e3	test/python: increase CQL connection timeouts In very slow debug builds the default driver timeouts are too low and tests might fail. Bump up the values to more reasonable time. These timeout values are the same as used in topology tests. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12405	2022-12-28 10:06:33 +02:00
Anna Stuchlik	39ade2f5a5	docs: describe how to use the scylla-api-client tool	2022-12-27 14:46:16 +01:00
Anna Stuchlik	2789501023	docs: fix the scylla-api-client tool name	2022-12-27 14:28:27 +01:00
Alejo Sanchez	1bfe234133	test/pylib: API get/set logger level of Scylla server Provide helpers to get and set logger level for Scylla servers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12394	2022-12-25 13:58:43 +02:00
Anna Stuchlik	ea7e23bf92	docs: fix the option name from compaction to compression on the Data Definition page Fixes the option name in the "Other table options" table on the Data Definition page. Fixes #12334 Closes #12382	2022-12-25 11:24:56 +02:00
Botond Dénes	b0d95948e1	mutation_compactor: reset stop flag on page start When the mutation compactor has all the rows it needs for a page, it saves the decision to stop in a member flag: _stop. For single partition queries, the mutation compactor is kept alive across pages and so it has a method, start_new_page() to reset its state for the next page. This method didn't clear the _stop flag. This meant that the value set at the end of the previous page could cause the new page and subsequently the entire query to be stopped prematurely. This can happen if the new page starts with a row that is covered by a higher level tombstone and is completely empty after compaction. Reset the _stop flag in start_new_page() to prevent this. This commit also adds a unit test which reproduces the bug. Fixes: #12361 Closes #12384	2022-12-24 13:52:45 +02:00
Takuya ASADA	642d035067	docker: prevent hostname -i failure when server address is specified On some docker instance configurations, hostname resolution does not work, so our script will fail on startup because we use hostname -i to construct cqlshrc. To prevent the error, we can use --rpc-address or --listen-address for the address since it should be the same. Fixes #12011 Closes #12115	2022-12-24 13:52:16 +02:00
Asias He	d819d98e78	storage_service: Ignore dropped table for repair_updater In case a table is dropped, we should ignore it in the repair_updater, since we can not update off strategy trigger for a dropped table. Refs #12373 Closes #12388	2022-12-24 13:48:25 +02:00
Raphael S. Carvalho	67ebd70e6e	compaction_manager: Fix reactor stalls during periodic submissions Every 1 hour, compaction manager will submit all registered table_state for a regular compaction attempt, all without yielding. This can potentially cause a reactor stall if there are 1000s of table states, as compaction strategy heuristics will run on behalf of each, and processing all buckets and picking the best one is not cheap. This problem can be magnified with compaction groups, as each group is represented by a table state. This might appear in dashboard as periodic stalls, every 1h, misleading the investigator into believing that the problem is caused by a chronological job. This is fixed by piggybacking on compaction reevaluation loop which can yield between each submission attempt if needed. Fixes #12390. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12391	2022-12-24 13:43:16 +02:00
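The general idea behind the fix above, yielding between submissions so that one long loop does not monopolize the event loop, can be sketched with Python's asyncio as a loose analogy. This is not Scylla or Seastar code: `submit_all`, `table_states`, and `yield_every` are hypothetical names, and `asyncio.sleep(0)` stands in for a cooperative yield point.

```python
import asyncio


async def submit_all(table_states, submit, yield_every=1):
    """Call submit() for every table state, yielding to the event loop
    periodically so other tasks are not starved (hypothetical sketch)."""
    for count, table_state in enumerate(table_states, start=1):
        submit(table_state)  # potentially expensive, synchronous work
        if count % yield_every == 0:
            # Cooperative yield: lets other ready tasks run before continuing.
            await asyncio.sleep(0)


async def demo():
    events = []

    async def other_work():
        # A concurrent task that needs a chance to run during the loop.
        for _ in range(3):
            events.append("other")
            await asyncio.sleep(0)

    task = asyncio.create_task(other_work())
    await submit_all(["t1", "t2", "t3"], events.append)
    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the yield inside the loop, the concurrent task could not run until every submission had finished, which is the kind of stall the commit message describes for thousands of table states.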
Anna Stuchlik	74fd776751	docs: document scylla-api-cli	2022-12-23 11:27:37 +01:00
Benny Halevy	8797958dfc	schema: operator<<: print also tombstone_gc_options They are currently missing from the printout when a table is created, but they are essential to understanding the mode with which tombstones are to be garbage-collected in the table. gcGraceSeconds alone is no longer enough since the introduction of tombstone_gc_option in `a8ad385ecd`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12381	2022-12-22 16:40:18 +02:00
Anna Stuchlik	7e8977bf2d	docs: add the info about minor release	2022-12-22 10:26:33 +01:00
Nadav Har'El	ef2e5675ed	materialized views, test: add tests for CLUSTERING ORDER BY In issue #10767, concerns were raised that the CLUSTERING ORDER BY clause is handled incorrectly in a CREATE MATERIALIZED VIEW definition. The tests in this patch try to explore the different ways in which CLUSTERING ORDER BY can be used in CREATE MATERIALIZED VIEW and allow us to compare Scylla's behavior to Cassandra, and to common sense. The tests discover that the CLUSTERING ORDER BY feature in materialized views generally works as expected, but there are three differences between Scylla and Cassandra in this feature. We consider two differences to be bugs (and hence the test is marked xfail) and one a Scylla extension: 1. When a base table has a reverse-order clustering column and this clustering column is used in the materialized view, in Cassandra the view's clustering order inherits the reversed order. In Scylla, the view's clustering order reverts to the default order. Arguably, both behaviors can be justified, but usually when in doubt we should implement Cassandra's behavior - not pick a different behavior, even if the different behavior is also reasonable. So this test (test_mv_inherit_clustering_order()) is marked "xfail", and a new issue was created about this difference: #12308. If we want to fix this behavior to match Cassandra's we should also consider backward compatibility - what happens if we change this behavior in Scylla now, after we had the opposite behavior in previous releases? We may choose to enshrine Scylla's Cassandra-incompatible behavior here - and document this difference. 2. The CLUSTERING ORDER BY should, as its name suggests, only list clustering columns. In Scylla, specifying other things, like regular columns, partition-key columns, or non-existent columns, is silently ignored, whereas it should result in an Invalid Request error (as it does in Cassandra). So test_mv_override_clustering_order_error() is marked "xfail". 
This is the difference already discovered in #10767. 3. When a materialized view has several clustering columns, Cassandra requires that a CLUSTERING ORDER BY clause, if present, must specify the order of all clustering columns. Scylla, in contrast, allows the user to override the order of only some of these columns - and the rest get the default order. I consider this to be a legitimate Scylla extension, and not a compatibility bug, so marked the test with "scylla_only", and no issue was opened about it. Refs #10767 Refs #12308 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12307	2022-12-22 09:48:16 +02:00
Nadav Har'El	6d2e146aa6	test/cql-pytest.py: add scylla_inject_error() utility This patch adds a scylla_inject_error(), a context manager which tests can use to temporarily enable some error injection while some test code is running. It can be used to write tests that artificially inject certain errors instead of trying to reach the elaborate (and often requiring precise timing or high amounts of data) situation where they occur naturally. The error-injection API is Scylla-specific (it uses the Scylla REST API) and does not work on "release"-mode builds (all other modes are supported), so when Cassandra or release-mode build are being tested, the test which uses scylla_inject_error() gets skipped. Example usage: ```python from rest_api import scylla_inject_error with scylla_inject_error(cql, "injection_name", one_shot=True): # do something here ... ``` Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12264	2022-12-22 09:39:10 +02:00
Nadav Har'El	01f0644b22	Merge 'scylla-gdb.py: introduce `scylla get-config-value`' from Botond Dénes Retrieves the configuration item with the given name and prints its value as well as its metadata. Example: (gdb) scylla get-config-value compaction_static_shares value: 100, type: "float", source: SettingsFile, status: Used, live: MustRestart Closes #12362 * github.com:scylladb/scylladb: scylla-gdb.py: add scylla get-config-value gdb command scylla-gdb.py: extract $downcast_vptr logic to standalone method test: scylla-gdb/run: improve diagnostics for failed tests	2022-12-21 18:38:23 +02:00
Aleksandra Martyniuk	599fce16cf	repair: make top level repair tasks abortable	2022-12-21 11:52:58 +01:00
Aleksandra Martyniuk	e77de463e4	repair: unify a way of aborting repair operations	2022-12-21 11:52:53 +01:00
Aleksandra Martyniuk	f56e886127	repair: delete sharded abort source from node_ops_info Sharded abort source in node_ops_info is no longer needed since its functionality is provided by task manager's tasks structure.	2022-12-21 11:37:03 +01:00
Aleksandra Martyniuk	18efe0a4e8	repair: delete unused node_ops_info from data_sync_repair_task_impl	2022-12-21 11:28:30 +01:00
Aleksandra Martyniuk	ee13a5dde8	api: extend status in task manager api Status of tasks returned with get_task_status and wait_task is extended with the list of ids of child tasks.	2022-12-21 10:54:56 +01:00
Aleksandra Martyniuk	697af4ccf2	api: extend get_tasks in task manager api Each task stats in a list returned from tm::get_task api call is extended with info about: task type, keyspace, table, entity, and sequence number.	2022-12-21 10:54:50 +01:00
Michał Chojnowski	19049150ef	configure.py: remove --static, --pie, --so These options have been nonsense since 2017. --pie and --so are ignored, --static disables (sic!) static linking of libraries. Remove them. Closes #12366	2022-12-21 11:01:56 +02:00
Botond Dénes	29d49e829e	scylla-gdb.py: add scylla get-config-value gdb command Retrieves the configuration item with the given name and prints its value as well as its metadata. Example: (gdb) scylla get-config-value compaction_static_shares value: 100, type: "float", source: SettingsFile, status: Used, live: MustRestart	2022-12-21 03:05:56 -05:00
Botond Dénes	0cdb89868a	scylla-gdb.py: extract $downcast_vptr logic to standalone method So it can be reused by regular python code.	2022-12-21 03:05:56 -05:00
Botond Dénes	24022c19a6	test: scylla-gdb/run: improve diagnostics for failed tests By instructing gdb to print the full python stack in case of errors.	2022-12-21 03:05:56 -05:00
Michał Chojnowski	d9269abf5b	sstables: index_reader: always evict the local cache gently Due to an oversight, the local index cache isn't evicted gently when _upper_bound existed. This is a source of reactor stalls. Fix that. Fixes #12271 Closes #12364	2022-12-20 18:23:27 +02:00
Michał Radwański	e7fbcd6c9d	mutation_partition_view: treat query::partition_slice::option::reversed in to_data_query_result as consume_in_reverse::yes The consume_in_reverse::legacy_half_reverse format is soon to be phased out. This commit starts treating frozen_mutations from replicas for reversed queries so that they are consumed with consume_in_reverse::yes.	2022-12-20 17:05:02 +01:00
Benny Halevy	1adb2bff18	mutation: move consume_in_reverse def to mutation_consumer.hh To be used also by frozen_mutation consumer. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-20 16:23:10 +01:00
Avi Kivity	bb731b4f52	Merge 'docs: move documentation of tools online' from Botond Dénes Currently the scylla tools (`scylla-types` and `scylla-sstable`) have documentation in two places: high level documentation can be found at `docs/operating-scylla/admin-tools/scylla-{types,sstable}.rst`, while low level, more detailed documentation is embedded in the tool itself. This is especially pronounced for `scylla-sstable`, which only has a short description of its operations online, all details being found only in the command-line help. We want to move away from this model, such that all documentation can be found online, with the command-line help being reserved to documenting how the various switches and flags work, on top of a short description of the operation and a link to the detailed online docs. Closes #12284 * github.com:scylladb/scylladb: tool/scylla-sstable: move documentation online docs: scylla-sstable.rst: add sstable content section docs: scylla-{sstable,types}.rst: drop Syntax section	2022-12-20 17:04:47 +02:00
Avi Kivity	3fce43124a	Merge 'Static compaction groups' from Raphael "Raph" Carvalho Allows static configuration of number of compaction groups per table per shard. To bootstrap the project, config option x_log2_compaction_groups was added which controls both number of groups and partitioning within a shard. With a value of 0 (default), it means 1 compaction group, therefore all tokens go there. With a value of 3, it means 8 compaction groups, and 3 most-significant-bits of tokens being used to decide which group owns the token. And so on. It's still missing: - integration with repair / streaming - integration with reshard / reshape. perf/perf_simple_query --smp 1 --memory 1G BEFORE ----- median 61358.55 tps ( 71.1 allocs/op, 12.2 tasks/op, 56375 insns/op, 0 errors) median 61322.80 tps ( 71.1 allocs/op, 12.2 tasks/op, 56391 insns/op, 0 errors) median 61058.58 tps ( 71.1 allocs/op, 12.2 tasks/op, 56386 insns/op, 0 errors) median 61040.94 tps ( 71.1 allocs/op, 12.2 tasks/op, 56381 insns/op, 0 errors) median 61118.40 tps ( 71.1 allocs/op, 12.2 tasks/op, 56379 insns/op, 0 errors) AFTER ----- median 61656.12 tps ( 71.1 allocs/op, 12.2 tasks/op, 56486 insns/op, 0 errors) median 61483.29 tps ( 71.1 allocs/op, 12.2 tasks/op, 56495 insns/op, 0 errors) median 61638.05 tps ( 71.1 allocs/op, 12.2 tasks/op, 56494 insns/op, 0 errors) median 61726.09 tps ( 71.1 allocs/op, 12.2 tasks/op, 56509 insns/op, 0 errors) median 61537.55 tps ( 71.1 allocs/op, 12.2 tasks/op, 56491 insns/op, 0 errors) Closes #12139 * github.com:scylladb/scylladb: test: mutation_test: Test multiple compaction groups test: database_test: Test multiple compaction groups test: database_test: Adapt it to compaction groups db: Add config for setting static number of compaction groups replica: Introduce static compaction groups test: sstable_test: Stop referencing single compaction group api: compaction_manager: Stop a compaction type for all groups api: Estimate pending tasks on all compaction groups api: 
storage_service: Run maintenance compactions on all compaction groups replica: table: Adapt assertion to compaction groups replica: database: stop and disable compaction on behalf of all groups replica: Introduce table::parallel_foreach_table_state() replica: disable auto compaction on behalf of all groups replica: table: Rework compaction triggers for compaction groups replica: Adapt table::get_sstables_including_compacted_undeleted() to compaction groups replica: Adapt table::rebuild_statistics() to compaction groups replica: table: Perform major compaction on behalf of all groups replica: table: Perform off-strategy compaction on behalf of all groups replica: table: Perform cleanup compaction on behalf of all groups replica: Extend table::discard_sstables() to operate on all compaction groups replica: table: Create compound sstable set for all groups replica: table: Set compaction strategy on behalf of all groups replica: table: Return min memtable timestamp across all groups replica: Adapt table::stop() to compaction groups replica: Adapt table::clear() to compaction groups replica: Adapt table::can_flush() to compaction groups replica: Adapt table::flush() to compaction groups replica: Introduce parallel_foreach_compaction_group() replica: Adapt table::set_schema() to compaction groups replica: Add memtables from all compaction groups for reads replica: Add memtable_count() method to compaction_group replica: table: Reserve reader list capacity through a callback replica: Extract addition of memtables to reader list into a new function replica: Adapt table::occupancy() to compaction groups replica: Adapt table::active_memtable() to compaction groups replica: Introduce table::compaction_groups() replica: Preparation for multiple compaction groups scylla-gdb: Fix backward compatibility of scylla_memtables command	2022-12-20 17:04:47 +02:00
Avi Kivity	623be22d25	Merge 'sstables: allow bypassing min max position metadata loading' from Botond Dénes Said mechanism broke tools and tests to some extent: the read it executes at sstable load time means that if the sstable is broken enough to fail this read, it will fail to load, preventing diagnostic tools from loading and examining it and preventing tests from producing broken sstables for testing purposes. Closes #12359 * github.com:scylladb/scylladb: sstables: allow bypassing first/last position metadata loading sstables: sstable::{load,open_data}(): fix indentation sstables: coroutinize sstable::open_data() sstables: sstable::open_data(): use clear_gently() to clear token ranges sstables: coroutinize sstable::load()	2022-12-20 17:04:47 +02:00
Aleksandra Martyniuk	60e298fda1	repair: change utils::UUID to node_ops_id Type of the id of node operations is changed from utils::UUID to node_ops_id. This way the id of node operations would be easily distinguished from the ids of other entities. Closes #11673	2022-12-20 17:04:47 +02:00
Avi Kivity	88a1fbd72f	Update seastar submodule * seastar 3a5db04197...3db15b5681 (27): > build: get the full path of c-ares > build: unbreak pkgconfig output > http: Add 206 Partial Content response code > http: Carry integer content_length on reply > tls_test: drop duplicated includes > tls_test: remove duplicated test case > reactor: define __NR_pidfd_open if not defined > sockets: Wait on socket peer closing the connection > tcp: Close connection when getting RST from server > Merge 'Enhance rpc tester with delays, timeouts and verbosity' from Pavel Emelyanov > Merge 'build: use pkg_search_module(.. IMPORTED_TARGET ..) ' from Kefu Chai > build: define GnuTLS_{LIBRARIES, INCLUDE_DIRS} only if GnuTLS is found > build: use pkg_search_module(.. IMPORTED_TARGET ..) > addr2line: extend asan regex > abort_source: move-assign operator: call base class unlink > coroutine: correct syntax error in doxygen comment > demo: Extend http connection demo with https > test: temporarily disable warning for tests triggering warnings > tests/unit/coroutine: Include <ranges> > sstring: Document why sstring exists at all > test: log error when read/write to pipe fails > test: use executables in /bin > tests: spawn_test: use BOOST_CHECK_EQUAL() for checking equality of temporary_buffer > docker: bump up to clang {14,15} and gcc {11,12} > shared_ptr: ignore false alarm from GCC-12 > build: check for fix of CWG2631 > circleci: use versioned container image Closes #12355	2022-12-20 17:04:47 +02:00
Botond Dénes	3c8949d34c	sstables: allow bypassing first/last position metadata loading When loading an sstable. Tests and tools might want to do this to be able to load a damaged sstable to do tests/diagnostics on it.	2022-12-20 01:45:38 -05:00
Botond Dénes	bba956c13c	sstables: sstable::{load,open_data}(): fix indentation	2022-12-20 01:45:38 -05:00
Botond Dénes	c85ff7945d	sstables: coroutinize sstable::open_data() Used once when sstable is opened on startup, not performance sensitive.	2022-12-20 01:45:38 -05:00
Botond Dénes	15966a0b1b	sstables: sstable::open_data(): use clear_gently() to clear token ranges Instead of an open-coded loop. It also makes the code easier to coroutinize (next patch).	2022-12-20 01:45:22 -05:00
Nadav Har'El	08c8e0d282	test/alternator: enable tests for long strings of consecutive tombstones In the past we had issue #7933 where very long strings of consecutive tombstones caused Alternator's paging to take an unbounded amount of time and/or memory for a single page. This issue was fixed (by commit `e9cbc9ee85`) but the two tests we had reproducing that issue were left with the "xfail" mark. They were also marked "veryslow" - each taking about 100 seconds - so they didn't run by default and nobody noticed they started to pass. In this patch I make these tests much faster (taking less than a second together), confirm that they pass - and remove the "xfail" mark and improve their descriptions. The trick to making these tests faster is to not create a million tombstones like we used to: We now know that after a string of just 10,000 tombstones ('query_tombstone_page_limit') the page should end, so we can check specifically this number. The story is more complicated for partition tombstones, but there too it should be a multiple of query_tombstone_page_limit. To make the tests even faster, we change run.py to lower the query_tombstone_page_limit from the default 10,000 to 1000. The tests work correctly even without this change, but they are ten times faster with it. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12350	2022-12-20 07:08:36 +02:00
Botond Dénes	94f3fb341f	Merge 'Fix nix devenv' from Michael Livshin * Update Nixpkgs base * Clarify some comments * Get rid of custom-packaged cxxbridge (it's now present in Nixpkgs as cxx-rs) * Add missing libraries (libdeflate, libxcrypt) * Fix expected hash of the gdb patch * Fix a couple of small build problems Fixes #12259 Closes #12346 * github.com:scylladb/scylladb: build: fix Nix devenv cql3: mark several private fields as maybe_unused configure.py: link with more abseil libs	2022-12-20 07:01:06 +02:00
Michael Livshin	7c383c6249	build: fix Nix devenv * Update Nixpkgs base * Clarify some comments * Get rid of custom-packaged cxxbridge (it's now present in Nixpkgs as cxx-rs) * Add missing libraries (libdeflate, libxcrypt) * Fix expected hash of the gdb patch * Bump Python driver to 3.25.20-scylla Fixes #12259	2022-12-19 20:53:07 +02:00
Michael Livshin	4407828766	cql3: mark several private fields as maybe_unused Because they are indeed unused -- they are initialized, passed down through some layers, but not actually used. No idea why only Clang 12 in debug mode in Nix devenv complains about it, though.	2022-12-19 20:53:07 +02:00
Michael Livshin	c0c8afb79e	configure.py: link with more abseil libs Specifically libabsl_strings{,_internal}.a. This fixes failure to link tests in the Nix devenv; since presumably all is good in other setups, it must be something weird having to do with inlining? The extra linked libraries shouldn't hurt in any case.	2022-12-19 20:53:07 +02:00
Raphael S. Carvalho	e7380bea65	test: mutation_test: Test multiple compaction groups Extends mutation_test to run the tests with more than one compaction group, in addition to a single one (default). Piggyback on existing tests. Avoids duplication. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 12:36:07 -03:00
Raphael S. Carvalho	e3e7c3c7e5	test: database_test: Test multiple compaction groups Extends database_test to run the tests with more than one compaction group, in addition to a single one (default). Piggyback on existing tests. Avoids duplication. Caught a bug when snapshotting, in implementation of table::can_flush(), showing its usefulness. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 12:36:07 -03:00
Raphael S. Carvalho	e103e41c76	test: database_test: Adapt it to compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 12:36:05 -03:00
Aleksandra Martyniuk	be529cc209	repair: delete redundant abort subscription from shard_repair_task_impl data_sync_repair_task_impl subscribes to the corresponding node_ops_info abort source and then, when requested, all its descendants are aborted recursively. Thus, shard_repair_task_impl does not need to subscribe to the node_ops_info abort source, since the parent task will take care of aborting once it is requested. abort_subscription and connected attributes are deleted from the shard_repair_task_impl.	2022-12-19 16:07:28 +01:00
Aleksandra Martyniuk	e48ca62390	repair: add abort subscription to data sync task When node operation is aborted, same should happen with the corresponding task manager's repair task. Subscribe data_sync_repair_task_impl abort() to node_ops_info abort_source.	2022-12-19 15:57:35 +01:00
Aleksandra Martyniuk	2b35d7df1b	tasks: abort tasks on system shutdown When the system shuts down, all task manager's top level tasks are aborted. Responsibility for aborting child tasks is on their parents.	2022-12-19 15:57:35 +01:00
Botond Dénes	827cd0d37b	sstables: coroutinize sstable::load() The code is nicely simplified by it. No regression expected, this method is supposedly only used by tests and tools.	2022-12-19 09:33:52 -05:00
Raphael S. Carvalho	d9ab59043e	db: Add config for setting static number of compaction groups This new option allows the user to control the number of compaction groups per table per shard. It's 0 by default, which implies a single compaction group, as is the case today. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:24 -03:00
Raphael S. Carvalho	9cf4dc7b62	replica: Introduce static compaction groups This is the initial support for multiple groups. _x_log2_compaction_groups controls the number of compaction groups and the partitioning strategy within a single table. The value in _x_log2_compaction_groups refers to log base 2 of the actual number of groups. 0 means 1 compaction group. 1 means 2 groups, with the most significant bit of the token being used to pick the target group. The group partitioner should be later abstracted for making tablet integration easier in the future. _x_log2_compaction_groups is still a constant but a config option will come next. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:23 -03:00
Raphael S. Carvalho	c807e61715	test: sstable_test: Stop referencing single compaction group Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:20 -03:00
Raphael S. Carvalho	254c38c4d2	api: compaction_manager: Stop a compaction type for all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:19 -03:00
Raphael S. Carvalho	4e836cb96c	api: Estimate pending tasks on all compaction groups Estimates # of compaction jobs to be performed on a table. Adaptation is done by adding estimation from all groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:17 -03:00
Raphael S. Carvalho	640436e72a	api: storage_service: Run maintenance compactions on all compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:15 -03:00
Raphael S. Carvalho	e0c5cbee8d	replica: table: Adapt assertion to compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:13 -03:00
Raphael S. Carvalho	d35cf88f09	replica: database: stop and disable compaction on behalf of all groups With compaction group model, truncate_table_on_all_shards() needs to stop and disable compaction for all groups. replica::table::as_table_state() will be removed once no user remains, as each table may map to multiple groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:12 -03:00
Raphael S. Carvalho	50b02ee0bd	replica: Introduce table::parallel_foreach_table_state() This will replace table::as_table_state(). The latter will be killed once its usage drops to zero. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:10 -03:00
Raphael S. Carvalho	fd69bd433e	replica: disable auto compaction on behalf of all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:08 -03:00
Raphael S. Carvalho	6fefbe5706	replica: table: Rework compaction triggers for compaction groups Allow table-wide compaction trigger, as well as fine-grained trigger like after flushing a memtable on behalf of a single group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:07 -03:00
Raphael S. Carvalho	6a6adea3ab	replica: Adapt table::get_sstables_including_compacted_undeleted() to compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:05 -03:00
Raphael S. Carvalho	5919836da8	replica: Adapt table::rebuild_statistics() to compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:04 -03:00
Raphael S. Carvalho	70b727db31	replica: table: Perform major compaction on behalf of all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:01 -03:00
Raphael S. Carvalho	e3ccdb17a0	replica: table: Perform off-strategy compaction on behalf of all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:16:00 -03:00
Raphael S. Carvalho	6efc9fd1f6	replica: table: Perform cleanup compaction on behalf of all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:58 -03:00
Raphael S. Carvalho	36e11eb2a5	replica: Extend table::discard_sstables() to operate on all compaction groups discard_sstables() runs in the context of truncate, which is a table-wide operation today, and will remain so with multiple static groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:55 -03:00
Raphael S. Carvalho	24c3687c3f	replica: table: Create compound sstable set for all groups Avoids extra compound set for single-compaction-group table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:52 -03:00
Raphael S. Carvalho	eb620da981	replica: table: Set compaction strategy on behalf of all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:50 -03:00
Raphael S. Carvalho	7a0e4f900f	replica: table: Return min memtable timestamp across all groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:49 -03:00
Raphael S. Carvalho	ceaa8a1ef1	replica: Adapt table::stop() to compaction groups Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:47 -03:00
Raphael S. Carvalho	facf923440	replica: Adapt table::clear() to compaction groups clear() clears memtable content and cache. Cache is shared by groups, therefore adaptation happens by only clearing memtables of all groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:45 -03:00
Raphael S. Carvalho	a9c902cd5e	replica: Adapt table::can_flush() to compaction groups can_flush() is used externally to determine if a table has an active memtable that can be flushed. Therefore, adaptation happens by returning true if any of the groups can be flushed. A subsequent flush request will flush memtable of all groups that are ready for it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:44 -03:00
Raphael S. Carvalho	ea42090d47	replica: Adapt table::flush() to compaction groups Adaptation of flush() happens by triggering a flush on the memtable of all groups. table::seal_active_memtable() will bail out if the memtable is empty, so it's not a problem to call flush on a group whose memtable is empty. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:42 -03:00
Raphael S. Carvalho	7274c83098	replica: Introduce parallel_foreach_compaction_group() This variant will be useful when iterating through groups and performing async actions on each. It guarantees that all groups are alive by the time they're reached in the loop. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:40 -03:00
Raphael S. Carvalho	89ab9d7227	replica: Adapt table::set_schema() to compaction groups set_schema() is used by the database to apply schema changes to table components which include memtables. Adaptation happens by setting schema to memtable(s) of all groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:38 -03:00
Raphael S. Carvalho	0022322ae3	replica: Add memtables from all compaction groups for reads Let's add memtables of all compaction groups. Point queries are optimized by picking a single group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:36 -03:00
Raphael S. Carvalho	e044001176	replica: Add memtable_count() method to compaction_group Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:34 -03:00
Raphael S. Carvalho	f2ea79f26c	replica: table: Reserve reader list capacity through a callback add_memtables_to_reader_list() will be adapted to compaction groups. For point queries, it will add memtables of a single group. With the callback, add_memtables_to_reader_list() can tell its caller the exact amount of memtable readers to be added, so it can reserve precisely the readers capacity. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:33 -03:00
Raphael S. Carvalho	e841508685	replica: Extract addition of memtables to reader list into a new function Will make it easier for adding memtables of all compaction groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:19 -03:00
Raphael S. Carvalho	530956b2de	replica: Adapt table::occupancy() to compaction groups table::occupancy() provides accumulated occupancy stats from memtables. Adaptation happens by accumulating stats from memtables of all groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:17 -03:00
Raphael S. Carvalho	ef8f542d75	replica: Adapt table::active_memtable() to compaction groups active_memtable() was fine for a single group, but with multiple groups, there will be one active memtable per group. Let's change the interface to reflect that. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:14 -03:00
Raphael S. Carvalho	429c5aa2f9	replica: Introduce table::compaction_groups() Useful for iterating through all groups. This is an intermediary implementation which requires allocation as only one group is supported today. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:12 -03:00
Raphael S. Carvalho	514008f136	replica: Preparation for multiple compaction groups Adjusts scylla_memtables gdb command to multiple groups, while keeping backward compatibility. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:10 -03:00
Raphael S. Carvalho	52b94b6dd7	scylla-gdb: Fix backward compatibility of scylla_memtables command Fix it while refactoring the code for arrival of multiple compaction groups. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-19 11:15:07 -03:00
Anna Stuchlik	bbfb9556fc	doc: mark the in-memory tables feature as deprecated Closes #12286	2022-12-19 15:39:31 +02:00
Avi Kivity	c70a9b0166	test: make test xml filenames more unique `ea99750de7` ("test: give tests less-unique identifiers") made the disambiguating ids only be unambiguous within a single test case. This made all tests named "run" have the name "run.1". Fix that by adding the suite name everywhere: in test paths, and in junit test case names. Fixes #12310. Closes #12313	2022-12-19 15:03:51 +02:00
Botond Dénes	3e6ddf21bc	Merge 'storage_service: unbootstrap: avoid unnecessary copy of ranges_to_stream' from Benny Halevy `ranges_to_stream` is a map of `std::unordered_multimap<dht::token_range, inet_address>` per keyspace. On large clusters with a large number of keyspaces, copying it may cause reactor stalls as seen in #12332. This series eliminates this copy by using std::move and also turns `stream_ranges` into a coroutine, adding maybe_yield calls to avoid further stalls down the road. Fixes #12332 Closes #12343 * github.com:scylladb/scylladb: storage_service: stream_ranges: unshare streamer storage_service: stream_ranges: maybe_yield storage_service: coroutinize stream_ranges storage_service: unbootstrap: move ranges_to_stream_by_keyspace to stream_ranges	2022-12-19 12:53:16 +02:00
Benny Halevy	e8aa1182b2	docs: replace-dead-node: get rid of hibernate status With replace using node operations, the HIBERNATE gossip status is not used anymore. This change updates documentation to reflect that. During replace, the replacing node shows in gossipinfo in STATUS:NORMAL. Also, the replaced node shows as DN in `nodetool status` while being replaced, so remove the paragraph showing it's not listed in `nodetool status`. Plus, tidy up the text alignment. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-19 12:19:10 +02:00
Benny Halevy	c9993f020d	storage_service: get rid of handle_state_replacing Since `2ec1f719de` nodes no longer publish HIBERNATE state so we don't need to support handling it. Replace is now always done using node operations (using repair or streaming). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-19 12:19:08 +02:00
Benny Halevy	60de7d28db	storage_service: stream_ranges: unshare streamer Now that stream_ranges is a coroutine streamer can be an automatic variable on the coroutine stack frame. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-19 07:42:07 +02:00
Benny Halevy	9badcd56ca	storage_service: stream_ranges: maybe_yield Prevent stalls with a large number of keyspaces and token ranges. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-19 07:42:07 +02:00
Benny Halevy	2cf75319b0	storage_service: coroutinize stream_ranges Before adding maybe_yield calls. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-19 07:42:01 +02:00
Benny Halevy	82486bb5d2	storage_service: unbootstrap: move ranges_to_stream_by_keyspace to stream_ranges Avoid a potentially large memory copy causing a reactor stall with a large number of keyspaces. Fixes #12332 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-19 07:39:48 +02:00
Avi Kivity	7c7eb81a66	Merge 'Encapsulate filesystem access by sstable into filesystem_storage subclass' from Pavel Emelyanov This is to define the API sstable needs from underlying storage. When implementing object-storage backend it will need to implement those. The API looks like future<> snapshot(const sstable& sst, sstring dir, absolute_path abs) const; future<> quarantine(const sstable& sst, delayed_commit_changes* delay); future<> move(const sstable& sst, sstring new_dir, generation_type generation, delayed_commit_changes* delay); void open(sstable& sst, const io_priority_class& pc); // runs in async context future<> wipe(const sstable& sst) noexcept; future<file> open_component(const sstable& sst, component_type type, open_flags flags, file_open_options options, bool check_integrity); It doesn't have "list" or alike, because it's not a method of an individual sstable, but rather the one from sstables_manager. It will come as separate PR. Closes #12217 * github.com:scylladb/scylladb: sstable, storage: Mark dir/temp_dir private sstable: Remove get_dir() (well, almost) sstable: Add quarantine() method to storage sstable: Use absolute/relative path marking for snapshot() sstable: Remove temp_... stuff from sstable sstable: Move open_component() on storage sstable: Mark rename_new_sstable_component_file() const sstable: Print filename(type) on open-component error sstable: Reorganize new_sstable_component_file() sstable: Mark filename() private sstable: Introduce index_filename() tests: Disclosure private filename() calls sstable: Move wipe_storage() on storage sstable: Remove temp dir in wipe_storage() sstable: Move unlink parts into wipe_storage sstable: Remove get_temp_dir() sstable: Move write_toc() to storage sstable: Shuffle open_sstable() sstable: Move touch_temp_dir() to storage sstable: Move move() to storage sstable: Move create_links() to storage sstable: Move seal_sstable() to storage sstable: Tossing internals of seal_sstable() sstable: Move remove_temp_dir() to storage sstable: Move create_links_common() to storage sstable: Move check_create_links_replay() to storage sstable: Remove one of create_links() overloads sstable: Remove create_links_and_mark_for_removal() sstable: Indentation fix after previous patch sstable: Coroutinize create_links_common() sstable: Rename create_links_common()'s "dir" argument sstable: Make mark_for_removal bool_class sstable, table: Add sstable::snapshot() and use in table::take_snapshot sstable: Move _dir and _temp_dir on filesystem_storage sstable: Use sync_directory() method test, sstable: Use component_basename in test sstables: Move read_{digest\|checksum} on sstable	2022-12-18 17:29:35 +02:00
Anna Stuchlik	6a8eb33284	docs: add the new upgrade guide 2022.1 to 2022.2 to the index and the toctree	2022-12-16 17:13:50 +01:00
Anna Stuchlik	36f4ef2446	docs: add the index file for the new upgrade guide from 2022.1 to 2022.2	2022-12-16 17:11:25 +01:00
Anna Stuchlik	8d8983e029	docs: add the metrics update file to the upgrade guide 2022.1 to 2022.2	2022-12-16 17:09:21 +01:00
Anna Stuchlik	252c2139c2	docs: add the upgrade guide for ScyllaDB Enterprise from 2022.1 to 2022.2	2022-12-16 17:07:00 +01:00
Michał Chojnowski	b52bd9ef6a	db: commitlog: remove unused max_active_writes() Dead and misleading code. Closes #12327	2022-12-16 10:23:03 +02:00
Nadav Har'El	327539b15d	Merge 'test.py: fix cql failure handling' from Alecco Fix a bug in failure handling and log level. Closes #12336 * github.com:scylladb/scylladb: test.py: convert param to str test.py: fix error level for CQL tests	2022-12-16 09:29:21 +02:00
Botond Dénes	cc03becf82	Merge 'tasks: get task's type with method' from Aleksandra Martyniuk The type of an operation is tied to a specific implementation of a task. It should therefore be accessed through a virtual method in tasks::task_manager::task::impl rather than stored as its attribute. Closes #12326 * github.com:scylladb/scylladb: api: delete unused type parameter from task_manager_test api tasks: repair: api: remove type attribute from task_manager::task::status tasks: add type() method to task_manager::task::impl repair: add reason attribute to repair_task	2022-12-16 09:20:26 +02:00
Aleksandra Martyniuk	f81ad2d66a	repair: make shard tasks internal Shard tasks should not be visible to users by default, thus they are made internal. Closes #12325	2022-12-16 09:05:30 +02:00
Aleksandra Martyniuk	bae887da3b	tasks: add virtual destructor to task_manager::module When an object of a class inheriting from task_manager::module is destroyed, destructor of the derived class should be called. Closes #12324	2022-12-16 08:59:26 +02:00
Raphael S. Carvalho	e6fb3b3a75	compaction: Delete atomically off-strategy input sstables After commit `a57724e711`, off-strategy no longer races with view building, therefore deletion code can be simplified and piggyback on mechanism for deleting all sstables atomically, meaning a crash midway won't result in some of the files coming back to life, which leads to unnecessary work on restart. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12245	2022-12-16 08:15:49 +02:00
Alejo Sanchez	9b65448d38	test.py: convert param to str The format_unidiff() function takes str, not pathlib PosixPath, so convert it to str. This prevented diff output of unexpected result to be shown in the log file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-12-15 20:46:35 +01:00
Alejo Sanchez	5142d80bb1	test.py: fix error level for CQL tests If the test fails, use error log level. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-12-15 20:45:44 +01:00
Botond Dénes	64903ba7d5	test/cql-pytest: use pytest site-packages workaround Recently, the pytest script shipped by Fedora started invoking python with the `-s` flag, which stops python from considering user site packages. This caused problems for our tests which install the cassandra driver in the user site packages. This was worked around in `e5e7780f32` by providing our own pytest interposer launcher script which does not pass the above mentioned flag to python. Said patch fixed test.py but not the run.py in cql-pytest. So if the cql-pytest suite is run via test.py it works fine, but if it is invoked via the run script, it fails because it cannot find the cassandra driver. This patch changes run.py to use our own pytest launcher script, so the suite can be run via the run script as well. Since run.py is shared with the alternator pytest suite, this patch also fixes that test suite. Closes #12253	2022-12-15 16:05:31 +02:00
Benny Halevy	639e247734	test: cql-pytest: test_describe: test_table_options_quoting: USE test_keyspace Without that, I often (but not always) get the following error: ``` __________________________ test_table_options_quoting __________________________ cql = <cassandra.cluster.Session object at 0x7f1aafb10650> test_keyspace = 'cql_test_1671103335055' def test_table_options_quoting(cql, test_keyspace): type_name = f"some_udt; DROP KEYSPACE {test_keyspace}" column_name = "col''umn -- @quoting test!!" comment = "table''s comment test!\"; DESC TABLES --quoting test" comment_plain = "table's comment test!\"; DESC TABLES --quoting test" #without doubling "'" inside comment > cql.execute(f"CREATE TYPE \"{type_name}\" (a int)") test/cql-pytest/test_describe.py:623: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ cassandra/cluster.py:2699: in cassandra.cluster.Session.execute ??? _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E cassandra.InvalidRequest: Error from server: code=2200 [Invalid query] message="No keyspace has been specified. USE a keyspace, or explicitly specify keyspace.tablename" ``` The CQL driver in use is the scylla driver version 3.25.10. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12329	2022-12-15 14:35:33 +02:00
Aleksandra Martyniuk	f0b2b00a15	api: delete unused type parameter from task_manager_test api	2022-12-15 10:50:30 +01:00
Aleksandra Martyniuk	5bc09daa7a	tasks: repair: api: remove type attribute from task_manager::task::status	2022-12-15 10:49:09 +01:00
Aleksandra Martyniuk	8d5377932d	tasks: add type() method to task_manager::task::impl	2022-12-15 10:41:58 +01:00
Aleksandra Martyniuk	329176c7bc	repair: add reason attribute to repair_task As a preparation to creating a type() method in task_manager::task::impl a streaming::stream_reason is kept in repair_task.	2022-12-15 10:38:38 +01:00
Botond Dénes	9713a5c314	tool/scylla-sstable: move documentation online The inline-help of operations will only contain a short summary of the operation and the link to the online documentation. The move is not a straightforward copy-paste. First and foremost because we move from simple markdown to RST. Informal references are also replaced with proper RST links. Some small edits were also done on the texts. The intent is the following: * the inline help serves as a quick reference for what the operation does and what flags it has; * the online documentation serves as the full reference manual, explaining all details;	2022-12-15 04:10:21 -05:00
Botond Dénes	3cf7afdf95	docs: scylla-sstable.rst: add sstable content section Provides a link to the architecture/sstable page for more details on the sstable format itself. It also describes the mutation-fragment stream, the parts of it that are relevant to the sstable operations. The purpose of this section is to provide a target for links that want to point to a common explanation on the topic. In particular, we will soon move the detailed documentation of the scylla-sstable operations into this file and we want to have a common explanation of the mutation fragment stream that these operations can point to.	2022-12-15 04:10:21 -05:00
Botond Dénes	641fb4c8bb	docs: scylla-{sstable,types}.rst: drop Syntax section In both files, the section hierarchy is as follows: Usage Syntax Sections with actual content This scheme uses up 3 levels of hierarchy, leaving not much room to expand the sections with actual content with subsections of their own. Remove the Syntax level altogether, directly embedding the sections with content under the Usage section.	2022-12-15 04:03:00 -05:00
Botond Dénes	8f8284783a	Merge 'Fix handling of non-full clustering keys in the read path' from Tomasz Grabiec This PR fixes several bugs related to handling of non-full clustering keys. One is in trim_clustering_row_ranges_to(), which is broken for non-full keys in reverse mode. It will trim the range to position_in_partition_view::after_key(full_key) instead of position_in_partition_view::before_key(key), hence it will include the key in the resulting range rather than exclude it. Fixes #12180 after_key() was creating a position which is after all keys prefixed by a non-full key, rather than a position which is right after that key. This issue will be caught by cql_query_test::test_compact_storage in debug mode when mutation_partition_v2 merging starts inserting sentinels at position after_key() on preemption. It probably already causes problems for such keys as after_key() is used in various parts in the read path. Refs #1446 Closes #12234 * github.com:scylladb/scylladb: position_in_partition: Make after_key() work with non-full keys position_in_partition: Introduce before_key(position_in_partition_view) db: Fix trim_clustering_row_ranges_to() for non-full keys and reverse order types: Fix comparison of frozen sets with empty values	2022-12-15 10:47:12 +02:00
Pavel Emelyanov	6d10a3448b	sstable, storage: Mark dir/temp_dir private Now all storage access via sstable happens with the help of storage class API so its internals can be finally made private. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	6296ca3438	sstable: Remove get_dir() (well, almost) The sstable::get_dir() is now gone, no callers know that sstable lives in any path on a filesystem. There are only few callers left. One is several places in code that need sstable datafile, toc and index paths to print them in logs. The other one is sstable_directory that is to be patched separately. For both there's a storage.prefix() method that prepends component name with where the sstable is "really" located. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	7402787d16	sstable: Add quarantine() method to storage Moving an sstable to quarantine has a peculiarity: if the sstable is in the staging/ directory, it is still moved into the root/quarantine dir, not into the quarantine subdir of its current location. Encapsulate this feature in a storage class method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	f507271578	sstable: Use absolute/relative path marking for snapshot() The snapshotting code uses full paths to files to manipulate snapshotted sstables. Until this code is patched to use some proper snapshotting API from sstable/ module, it will continue doing so. However, to remove the get_dir() method from sstable() the seal_sstable() needs to pass a relative "backup" directory to the storage::snapshot() method. This patch adds a temporary bool_class to make this distinction. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	a46d378bee	sstable: Remove temp_... stuff from sstable There's a bunch of helpers around the XFS-specific temp-dir sitting in the public sstable part. Drop it altogether, no code needs it for real. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	adba24d8ae	sstable: Move open_component() on storage Obtaining a class file object to read/write sstable from/to is now storage-specific. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	4c22831d23	sstable: Mark rename_new_sstable_component_file() const It's in fact such. Next patch will need it const to call this method via const sstable reference. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	6bf3e3a921	sstable: Print filename(type) on open-component error The file path is going to disappear soon, so print the filename() on error. For now it's the same, but the meaning of the filename() returning string is changing to become "random label for the log reader". Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	dc72bce6d7	sstable: Reorganize new_sstable_component_file() The helper consists of three stages: 1. open a file (probably in a temp dir) 2. decorate it with extensions and checked_file 3. optionally rename a file from temp dir The latter is done to make XFS allocate this file in a separate block group if the file was created in temp dir on step 1. This patch swaps steps 2 and 3 to keep the filesystem-specific steps next to each other. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	e55c740f49	sstable: Mark filename() private From now on no callers should use this string to access anything on disk Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	5f579eb405	sstable: Introduce index_filename() Currently the sstable::filename(Index) is used in several places that get the filename as a printable or throwable string and don't treat it as a real location of any file. For those, add the index_filename() helper symmetrical to toc_filename() and (in some sense) the get_filename() one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	bbbbd6dbfc	tests: Disclosure private filename() calls The sstable::filename() is going to become private method. Lots of tests call it, but tests do call a lot of other sstable private methods, that's OK. Make the sstable::filename() yet another one of that kind in advance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	4a91f3d443	sstable: Move wipe_storage() on storage Now when the filesystem cleaning code is sitting in one method, it can finally be made the storage class one. Exception-safe allocation of toc_name (spoiler: it's copied anyway one step later, so it's "not that safe" actually) is moved into storage as well. The caller is left with toc_filename() call in its exception handler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	c92d45eaa9	sstable: Remove temp dir in wipe_storage() When unlinking an sstable for whatever reason it's good to check if the temp dir is hanging around. In some cases it's not (compaction), but keeping the whole wiping code together makes it easier to move it on storage class in one go. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	88ede71320	sstable: Move unlink parts into wipe_storage Just move the code. This is to make the next patch smaller. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	0336cb3bdd	sstable: Remove get_temp_dir() Only one private caller of it is left; it's better to open-code it there. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	3326063b8b	sstable: Move write_toc() to storage This method initiates the sstable creation. Effectively it's the first step in sstable creation transaction implemented on top of rename() call. Thus this method is moved onto storage under respective name. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	636d49f1c1	sstable: Shuffle open_sstable() When an sstable is prepared to be written on disk the .write_toc() is called on it, which creates a temporary toc file. Prior to this, the writer code calls generate_toc() to collect components on the sstable. This patch adds the .open_sstable() API call that does both. This prepares the write_toc() part to be moved to storage, because it's not just "write data into TOC file", it's the first step in a transaction implemented on top of rename()s. The tests need care -- there's the rewrite_toc_without_scylla_component() thing in utils that doesn't want the generate_toc() part to be called. It's not patched here and continues calling .write_toc(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	d3216b10d6	sstable: Move touch_temp_dir() to storage The continuation of the previously moved remove_temp_dir() one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:49 +03:00
Pavel Emelyanov	1a34cb98fc	sstable: Move move() to storage The sstable can be "moved" in two cases -- to move from staging or to move to quarantine. Both operations are sstable API ones, but the implementation is storage-specific. This patch makes the latter a method of storage class. One thing to note is that only quarantine() touched the target directly. Now also the move_to_new_dir() happening on load also does it, but that's harmless. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:14:47 +03:00
Pavel Emelyanov	18f6165993	sstable: Move create_links() to storage This method is currently used in two places: sstable::snapshot() and sstable::seal_sstable(). The latter additionally touches the target backup/ subdir. This patch moves the whole thing on storage and adds touch for all the cases. For snapshots this might be excessive, but harmless. Tests get their private-disclosure way to access sstable._storage in few places to call create_links directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	136a8681e0	sstable: Move seal_sstable() to storage Now the sstable sealing is split into storage part, internal-state part and the seal-with-backup kick. This move makes remove_temp_dir() private. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	334d231f56	sstable: Tossing internals of seal_sstable() There are two of them -- one API call and the other one that just "seals" it. The latter one also changes the _marked_for_deletion bit on the sstable. This patch makes the latter method prepared to be moved onto storage, because sealing means committing the TOC file on disk with the help of the rename system call, which is purely a storage thing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	ce3a8a4109	sstable: Move remove_temp_dir() to storage This one is simple, it just accesses _temp_dir thing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	9027d137d2	sstable: Move create_links_common() to storage Same as previous patch. This move makes the previously moved check_create_links_replay() a private method of the storage class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	990032b988	sstable: Move check_create_links_replay() to storage It needs to get sstable const reference to get the filename(s) from it. Other than that it's pure filesystem-accessing method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	041a8c80ad	sstable: Remove one of create_links() overloads There are two -- one that accepts generation and the other one that does not. The latter is only called by the former, so no need in keeping both. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	f1558b6988	sstable: Remove create_links_and_mark_for_removal() There's only one user of it, it can document its "and mark for removal" intention via dedicated bool_class argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	65f40b28e6	sstable: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	428adda4a9	sstable: Coroutinize create_links_common() Looks much shorter and easier-to-patch this way. The dst_dir argument is made value from const reference, old code copied it with do_with() anyway. Indentation is deliberately left broken until next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	ab13a99586	sstable: Rename create_links_common()'s "dir" argument The whole method is going to move onto newly introduced filesystem_storage that already has field of the same name onboard. To avoid confusion, rename the argument to dst_dir. No functional changes, _just_ s/dir/dst_dir/g throughout the method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	4977c73163	sstable: Make mark_for_removal bool_class Its meaning is comment-documented anyway. Also, next patches will remove the create_links_and_mark_for_removal() so callers need some verbose meaning of this boolean in advance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:45 +03:00
Pavel Emelyanov	f53d6804a6	sstable, table: Add sstable::snapshot() and use in table::take_snapshot The replica/ code now "knows" that snapshotting an sstable means creating a bunch of hard-links on disk. Abstract that via sstable::snapshot() method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:44 +03:00
Pavel Emelyanov	2803dcda6d	sstable: Move _dir and _temp_dir on filesystem_storage Those two fields define the way sstable is stored as collection of on-disk files. First step towards making the storage access abstract is in moving the paths onto filesystem_storage embedded class. Both are made public for now, the rest of the code is patched to access them via _storage.<smth>. The rest of the set moves parts of sstable:: methods into the filesystem_storage, then marks the paths private. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:44 +03:00
Pavel Emelyanov	17c8ba6034	sstable: Use sync_directory() method The sstable::write_toc() executes sync_directory() by hand. Better to use the method directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:44 +03:00
Pavel Emelyanov	e934f42402	test, sstable: Use component_basename in test One case gets full sstable datafile path to get the basename from it. There's already the basename helper on the class sstable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:44 +03:00
Pavel Emelyanov	376915d406	sstables: Move read_{digest\|checksum} on sstable These methods access sstables as files on disk, in order to hide the "path on filesystem" meaning of sstables::filename() the whole method should be made sstable:: one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-15 10:13:44 +03:00
Pavel Emelyanov	d561495f0d	Merge 'topology: get rid of pending state' from Benny Halevy Now, with `a44ca06906`, is_normal_token_owner that replaced is_member does not rely anymore on the pending status of endpoints in topology. With that we can get rid of this state and just keep all endpoints we know about in the topology. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12294 * github.com:scylladb/scylladb: topology: get rid of pending state topology: debug log update and remove endpoint	2022-12-14 19:28:35 +03:00
Benny Halevy	bdb6550305	view: row_locker: add latency_stats_tracker Refactor the existing stats tracking and updating code into struct latency_stats_tracker and while at it, count lock_acquisitions only on success. Decrement operations_currently_waiting_for_lock in the destructor so it's always balanced with the unconditional increment in the ctor. As for updating estimated_waiting_for_lock, it is always updated in the dtor, both on success and failure since the wait for the lock happened, whether waiting timed out or not. Fixes #12190 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12225	2022-12-14 17:37:22 +02:00
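The bookkeeping rules in the latency_stats_tracker commit above can be sketched as a Python context manager. This is only an illustration of the described accounting, not the C++ struct from the patch: the `LockStats` and `LatencyStatsTracker` classes are hypothetical, the field names are borrowed from the commit message, and "success" is assumed here to mean that the guarded block exits without raising.

```python
import time
from dataclasses import dataclass, field
from typing import List


@dataclass
class LockStats:
    # Field names follow the commit message; the container itself is hypothetical.
    lock_acquisitions: int = 0
    operations_currently_waiting_for_lock: int = 0
    estimated_waiting_for_lock: List[float] = field(default_factory=list)


class LatencyStatsTracker:
    """Context-manager stand-in for an RAII tracker (assumption, not the real API)."""

    def __init__(self, stats: LockStats) -> None:
        self.stats = stats
        self._start = 0.0

    def __enter__(self) -> "LatencyStatsTracker":
        # Unconditional increment, mirroring the constructor.
        self.stats.operations_currently_waiting_for_lock += 1
        self._start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Always balanced with the increment, mirroring the destructor.
        self.stats.operations_currently_waiting_for_lock -= 1
        # The wait happened whether or not it succeeded, so always record it.
        self.stats.estimated_waiting_for_lock.append(time.monotonic() - self._start)
        # Count an acquisition only on success.
        if exc_type is None:
            self.stats.lock_acquisitions += 1
        return False  # never swallow the exception
```

Wrapping each lock wait in `with LatencyStatsTracker(stats):` keeps the waiting counter balanced even when the wait times out and raises.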
Avi Kivity	9ee78975b7	Merge 'Fix topology mismatch on read-repair handler creation' from Pavel Emelyanov The schedule_repair() receives a bunch of endpoint:mutations pairs and tries to create handlers for those. When creating the handlers it re-obtains topology from schema->ks->effective_replication_map chain, but this new topology can be outdated as compared to the list of endpoints at hand. The fix is to carry the e.r.m. pointer used by read executor reconciliation all the way down to repair handlers creation. This requires some manipulations with mutate_internal() and mutate_prepare() argument lists. fixes: #12050 (it was the same problem) Closes #12256 * github.com:scylladb/scylladb: proxy: Carry replication map with repair mutation(s) proxy: Wrap read repair entries into read_repair_mutation proxy: Turn ref to forwardable ref in mutations iterator	2022-12-14 17:33:43 +02:00
Tomasz Grabiec	23e4c83155	position_in_partition: Make after_key() work with non-full keys This fixes a long standing bug related to handling of non-full clustering keys, issue #1446. after_key() was creating a position which is after all keys prefixed by a non-full key, rather than a position which is right after that key. This issue will be caught by cql_query_test::test_compact_storage in debug mode when mutation_partition_v2 merging starts inserting sentinels at position after_key() on preemption. It probably already causes problems for such keys.	2022-12-14 14:47:33 +01:00
Botond Dénes	16c50bed5e	Merge 'sstables: coroutinize update_info_for_opened_data' from Avi Kivity A complicated function (in continuation style) that benefits from this simplification. Closes #12289 * github.com:scylladb/scylladb: sstables: update_info_for_opened_data: reindent sstables: update_info_for_opened_data: coroutinize	2022-12-14 15:12:22 +02:00
Nadav Har'El	92d03be37b	materialized view: fix bug in some large modifications to base partitions Sometimes a single modification to a base partition requires updates to a large number of view rows. A common example is deletion of a base partition containing many rows. A large BATCH is also possible. To avoid large allocations, we split the large amount of work into batches of 100 (max_rows_for_view_updates) rows each. The existing code assumed an empty result from one of these batches meant that we are done. But this assumption was incorrect: There are several cases when a base-table update may not need a view update to be generated (see can_skip_view_updates()) so if all 100 rows in a batch were skipped, the view update stopped prematurely. This patch includes two tests showing when this bug can happen - one test using a partition deletion with a USING TIMESTAMP causing the deletion to not affect the first 100 rows, and a second test using a specially-crafted large BATCH. These use cases are fairly esoteric, but in fact hit a user in the wild, which led to the discovery of this bug. The fix is fairly simple: To detect when build_some() is done it is no longer enough to check if it returned zero view-update rows; Rather, it explicitly returns whether or not it is done as an std::optional. The patch includes several tests for this bug, which pass on Cassandra, failed on Scylla before this patch, and pass with this patch. Fixes #12297. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12305	2022-12-14 14:50:38 +02:00
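The termination bug described in the materialized-view commit above can be reproduced in a few lines of Python. This is a toy model under stated assumptions, not the Scylla code: `make_build_some`, `buggy_drain` and `fixed_drain` are hypothetical helpers, rows are plain integers, and `None` stands in for an empty `std::optional` meaning "done".

```python
import itertools
from typing import Callable, Iterable, List, Optional

MAX_ROWS_FOR_VIEW_UPDATES = 100  # batch size named in the commit message


def make_build_some(
    rows: Iterable[int], can_skip: Callable[[int], bool]
) -> Callable[[], Optional[List[int]]]:
    """Return a build_some() that yields one batch of updates per call, or None when done."""
    it = iter(rows)

    def build_some() -> Optional[List[int]]:
        batch = list(itertools.islice(it, MAX_ROWS_FOR_VIEW_UPDATES))
        if not batch:
            return None  # explicit "done" signal
        # A batch may legitimately produce zero updates if every row is skipped.
        return [row for row in batch if not can_skip(row)]

    return build_some


def buggy_drain(build_some: Callable[[], Optional[List[int]]]) -> List[int]:
    """Old assumption: an empty batch means there is nothing left."""
    out: List[int] = []
    while True:
        updates = build_some()
        if not updates:
            return out
        out.extend(updates)


def fixed_drain(build_some: Callable[[], Optional[List[int]]]) -> List[int]:
    """Fixed loop: stop only on the explicit done signal."""
    out: List[int] = []
    while (updates := build_some()) is not None:
        out.extend(updates)
    return out
```

With 250 rows where the first 100 are skippable, `buggy_drain` stops after the first (empty) batch, while `fixed_drain` keeps going and collects the remaining 150 updates.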
Botond Dénes	e7d8855675	Merge 'Revert accidental submodule updates' from Benny Halevy The abseil and tools/java submodules were accidentally updated in `71bc12eecc` (merged to master in `51f867339e`) This series reverts those changes. Closes #12311 * github.com:scylladb/scylladb: Revert accidental update of tools/java submodule Revert accidental update of abseil submodule	2022-12-14 13:20:08 +02:00
Benny Halevy	865193f99a	Revert accidental update of tools/java submodule The tools/java submodule was accidentally updated in `71bc12eecc` Revert this change. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-14 13:06:30 +02:00
Benny Halevy	9911ba195b	Revert accidental update of abseil submodule The abseil module was accidentally updated in `71bc12eecc` Revert this change. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-14 13:05:04 +02:00
Pavel Emelyanov	ab8fc0e166	proxy: Carry replication map with repair mutation(s) The create_write_response_handler() for read repair needs the e.r.m. from the caller, because it effectively accepts list of endpoints from it. So this patch equips all read_repair_mutation-s with the e.r.m. pointer so that the handler creation can use it. It's the same for all mutations, so it's a waste of space, but it's not bad -- there's typically few mutations in this range and the entry passed there is temporary, so even lots of them won't occupy lots of memory for long. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-14 14:03:39 +03:00
Pavel Emelyanov	140f373e15	proxy: Wrap read repair entries into read_repair_mutation The schedule_repair() operates on a map of endpoint:mutations pairs. Next patch will need to extend this entry and it's going to be easier if the entry is wrapped in a helper structure in advance. This is where the forwardable reference cursor from the previous patch gets its user. The schedule_repair() produces a range of rvalue wrappers, but the create_write_response_handler accepting it is OK, it copies mutations anyway. The printing operator is added to facilitate mutations logging from mutate_internal() method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-14 14:01:12 +03:00
Pavel Emelyanov	014b563ef1	proxy: Turn ref to forwardable ref in mutations iterator The mutate_prepare() is iterating over range of mutation with 'auto&' cursor thus accepting only lvalues. This is very restrictive, the caller of mutate_prepare() may as well provide rvalues if the target create_write_response_handler() or lambda accepts it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-14 14:00:10 +03:00
Avi Kivity	3fa230fee4	Merge 'cql3: expr: make it possible to prepare and evaluate conjunctions' from Jan Ciołek This PR implements two things: * Getting the value of a conjunction of elements separated by `AND` using `expr::evaluate` * Preparing conjunctions using `prepare_expression` --- `NULL` is treated as an "unknown value" - maybe `true` maybe `false`. `TRUE AND NULL` evaluates to `NULL` because it might be `true` but also might be `false`. `FALSE AND NULL` evaluates to `FALSE` because no matter what value `NULL` acts as, the result will still be `FALSE`. Unset and empty values are not allowed. Usually in CQL the rule is that when `NULL` occurs in an operation the whole expression becomes `NULL`, but here we decided to deviate from this behavior. Treating `NULL` as an "unknown value" is the standard SQL way of handling `NULLs` in conjunctions. It works this way in MySQL and Postgres so we do it this way as well. The evaluation short-circuits. Once `FALSE` is encountered the function returns `FALSE` immediately without evaluating any further elements. It works this way in Postgres as well, for example: `SELECT true AND NULL AND 1/0 = 0` will throw a division by zero error, but `SELECT false AND 1/0 = 0` will successfully evaluate to `FALSE`. Closes #12300 * github.com:scylladb/scylladb: expr_test: add unit tests for prepare_expression(conjunction) cql3: expr: make it possible to prepare conjunctions expr_test: add tests for evaluate(conjunction) cql3: expr: make it possible to evaluate conjunctions	2022-12-14 09:48:26 +02:00
Botond Dénes	122b267478	Merge 'repair: coroutinize to_repair_rows_list' from Avi Kivity Simplify a somewhat complicated function. Closes #12290 * github.com:scylladb/scylladb: repair: to_repair_rows_list: reindent repair: to_repair_rows_list: coroutinize	2022-12-14 09:39:47 +02:00
Avi Kivity	c09583bcef	storage_proxy: coroutinize send_truncate_blocking Not particularly important, but a small simplification. Closes #12288	2022-12-14 09:39:33 +02:00
Tomasz Grabiec	132d5d4fa1	messaging: Shutdown on stop() if it wasn't shut down earlier All rpc::client objects have to be stopped before they are destroyed. Currently this is done in messaging_service::shutdown(). The cql_test_env does not call shutdown() currently. This can lead to use-after-free on the rpc::client object, manifesting like this: Segmentation fault on shard 0. Backtrace: column_mapping::~column_mapping() at schema.cc:? db::cql_table_large_data_handler::internal_record_large_cells(sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long) const at ./db/large_data_handler.cc:180 operator() at ./db/large_data_handler.cc:123 (inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long>(std::__invoke_other, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:61 (inlined by) std::enable_if<is_invocable_r_v<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, 
utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long>, seastar::future<void> >::type std::__invoke_r<seastar::future<void>, db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long>(db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const&&, column_definition const&, unsigned long&&, unsigned long&&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:114 (inlined by) std::_Function_handler<seastar::future<void> (sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long), db::cql_table_large_data_handler::cql_table_large_data_handler(gms::feature_service&, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>, utils::updateable_value<unsigned int>)::$_1>::_M_invoke(std::_Any_data const&, sstables::sstable const&, sstables::key const&, clustering_key_prefix const&&, column_definition const&, unsigned long&&, unsigned long&&) at 
/usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:290 std::function<seastar::future<void> (sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long)>::operator()(sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long) const at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:591 (inlined by) db::cql_table_large_data_handler::record_large_cells(sstables::sstable const&, sstables::key const&, clustering_key_prefix const, column_definition const&, unsigned long, unsigned long) const at ./db/large_data_handler.cc:175 seastar::rpc::log_exception(seastar::rpc::connection&, seastar::log_level, char const, std::__exception_ptr::exception_ptr) at ./build/release/seastar/./seastar/src/rpc/rpc.cc:109 operator() at ./build/release/seastar/./seastar/src/rpc/rpc.cc:788 operator() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:1682 (inlined by) void seastar::futurize<seastar::future<void> >::satisfy_with_result_of<seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, 
seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}::operator()(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&) const::{lambda()#1}&&) at ./build/release/seastar/./seastar/include/seastar/core/future.hh:2134 (inlined by) operator() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:1681 (inlined by) seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address 
const&, seastar::socket_address const&)::$_14>(seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::rpc::client::client(seastar::rpc::logger const&, void, seastar::rpc::client_options, seastar::socket, seastar::socket_address const&, seastar::socket_address const&)::$_14&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() at ./build/release/seastar/./seastar/include/seastar/core/future.hh:781 seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2319 (inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:2756 seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2925 seastar::reactor::run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2808 seastar::app_template::run_deprecated(int, char, std::function<void ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:265 seastar::app_template::run(int, char, std::function<seastar::future<int> ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:156 operator() at ./build/release/seastar/./seastar/src/testing/test_runner.cc:75 (inlined by) void std::__invoke_impl<void, seastar::testing::test_runner::start_thread(int, char)::$_0&>(std::__invoke_other, seastar::testing::test_runner::start_thread(int, char)::$_0&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:61 (inlined by) std::enable_if<is_invocable_r_v<void, seastar::testing::test_runner::start_thread(int, char)::$_0&>, void>::type std::__invoke_r<void, seastar::testing::test_runner::start_thread(int, char)::$_0&>(seastar::testing::test_runner::start_thread(int, char)::$_0&) at 
/usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/invoke.h:111 (inlined by) std::_Function_handler<void (), seastar::testing::test_runner::start_thread(int, char)::$_0>::_M_invoke(std::_Any_data const&) at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:290 std::function<void ()>::operator()() const at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/std_function.h:591 (inlined by) seastar::posix_thread::start_routine(void*) at ./build/release/seastar/./seastar/src/core/posix.cc:73 Fix by making sure that shutdown() is called prior to destruction. Fixes #12244 Closes #12276	2022-12-14 10:28:26 +03:00
Tzach Livyatan	7cd613fc08	Docs: Improve wording on the os-supported page v2 Closes #11871	2022-12-14 08:59:26 +02:00
Botond Dénes	31fcfe62e1	Merge 'doc: add the description of AzureSnitch to the documentation' from Anna Stuchlik Fixes https://github.com/scylladb/scylladb/issues/11712 Updates added with this PR: - Added a new section with the description of AzureSnitch (similar to others + examples and language improvements). - Fixed the headings so that they render properly. - Replaced "Scylla" with "ScyllaDB". Closes #12254 * github.com:scylladb/scylladb: docs: replace Scylla with ScyllaDB on the Snitches page docs: fix the headings on the Snitches page doc: add the description of AzureSnitch to the documentation	2022-12-14 08:58:48 +02:00
Lubos Kosco	3f9dca9c60	doc: print out the generated UUID for sending to support Closes #12176	2022-12-14 08:57:54 +02:00
guy9	a329fcd566	Updated University monitoring lesson link Closes #11906	2022-12-14 08:50:26 +02:00
Jan Ciolek	9afa9f0e50	expr_test: add unit tests for prepare_expression(conjunction) Add unit tests which ensure that preparing conjunctions works as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-12-13 20:23:17 +01:00
Jan Ciolek	dde86a2da6	cql3: expr: make it possible to prepare conjunctions prepare_expression used to throw an error when encountering a conjunction. Now it's possible to use prepare_expression to prepare an expression that contains conjunctions. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-12-13 20:23:17 +01:00
Jan Ciolek	5f5b1c4701	expr_test: add tests for evaluate(conjunction) Add unit tests which ensure that evaluating a conjunction behaves as expected. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-12-13 20:23:17 +01:00
Jan Ciolek	b3c16f6bc8	cql3: expr: make it possible to evaluate conjunctions Previously it was impossible to use expr::evaluate() to get the value of a conjunction of elements separated by ANDs. Now it has been implemented. NULL is treated as an "unknown value" - maybe true maybe false. `TRUE AND NULL` evaluates to NULL because it might be true but also might be false. `FALSE AND NULL` evaluates to FALSE because no matter what value NULL acts as, the result will still be FALSE. Unset and empty values are not allowed. Usually in CQL the rule is that when NULL occurs in an operation the whole expression becomes NULL, but here we decided to deviate from this behavior. Treating NULL as an "unknown value" is the standard SQL way of handling NULLs in conjunctions. It works this way in MySQL and Postgres so we do it this way as well. The evaluation short-circuits. Once FALSE is encountered the function returns FALSE immediately without evaluating any further elements. It works this way in Postgres as well, for example: `SELECT true AND NULL AND 1/0 = 0` will throw a division by zero error but `SELECT false AND 1/0 = 0` will successfully evaluate to FALSE. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-12-13 20:23:08 +01:00
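The conjunction semantics described in the commit above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the `expr::evaluate` code: `evaluate_conjunction` is a hypothetical helper, Python's `None` stands in for CQL `NULL`, and each element is passed as a zero-argument callable so that short-circuiting is observable.

```python
from typing import Callable, Iterable, Optional


def evaluate_conjunction(elements: Iterable[Callable[[], Optional[bool]]]) -> Optional[bool]:
    """Evaluate `e1 AND e2 AND ...` with SQL-style three-valued logic.

    None models NULL ("unknown"). Elements are evaluated left to right, and
    evaluation stops as soon as one of them is False.
    """
    saw_null = False
    for element in elements:
        value = element()
        if value is False:
            # FALSE AND anything is FALSE, so later elements are never evaluated.
            return False
        if value is None:
            # Remember the unknown, but keep going: a later FALSE still wins.
            saw_null = True
    return None if saw_null else True
```

Under these assumptions, `TRUE AND NULL` yields `None`, `FALSE AND NULL` yields `False`, and `false AND 1/0 = 0` returns `False` without evaluating the division, while `true AND NULL AND 1/0 = 0` still raises, matching the Postgres behavior quoted in the commit message.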
Benny Halevy	e9e66f3ca7	database: drop_table_on_all_shards: limit truncated_at time The infinitely high time_point of `db_clock::time_point::max()` used in `ba42852b0e` is too high for some clients that can't represent that as a date_time string. Instead, limit it to 9999-12-31T00:00:00+0000, that is practically sufficient to ensure truncation of all sstables and should be within the clients' limits. Fixes #12239 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12273	2022-12-13 16:46:20 +02:00
Avi Kivity	919888fe60	Merge 'docs/dev: Add backport instructions for contributors' from Jan Ciołek Add instructions on how to backport a feature to an older version of Scylla. It contains a detailed step-by-step instruction so that people unfamiliar with intricacies of Scylla's repository organization can easily get the hang of it. This is the guide I wish I had when I had to do my first backport. I put it in backport.md because that looks like the file responsible for this sort of information. For a moment I thought about `CONTRIBUTING.md`, but this is a really short file with general information, so it doesn't really fit there. Maybe in the future there will be some sort of unification (see #12126) Closes #12138 * github.com:scylladb/scylladb: dev/docs: add additional git pull to backport docs docs/dev: add a note about cherry-picking individual commits docs/dev: use 'is merged into' instead of 'becomes' docs/dev: mention that new backport instructions are for the contributor docs/dev: Add backport instructions for contributors	2022-12-13 16:27:04 +02:00
Pavel Emelyanov	fe4cf231bc	snitch: Check http response codes to be OK Several snitch drivers make http requests to get region/dc/zone/rack/whatever from the cloud provider. They blindly rely on the response being successful and read the response body to parse the data they need. That's not nice, so add checks that requests finish with http OK statuses. refs: #12185 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12287	2022-12-13 14:49:18 +02:00
Benny Halevy	68141d0aac	topology: get rid of pending state Now, with `a44ca06906`, is_normal_token_owner that replaced is_member does not rely anymore on the pending status of endpoints in topology. With that we can get rid of this state and just keep all endpoints we know about in the topology. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-13 14:17:18 +02:00
Benny Halevy	f2753eba30	topology: debug log update and remove endpoint Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-13 14:17:13 +02:00
Avi Kivity	c7cee0da40	Merge 'storage_service: handle_state_normal: always update_topology before update_normal_tokens' from Benny Halevy update_normal_tokens checks that the endpoint is in topology. Currently we call update_topology on this path only if it's not a normal_token_owner, but there are paths when the endpoint could be a normal token owner but still be pending in topology so always update it, just in case. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12080 * github.com:scylladb/scylladb: storage_service: handle_state_normal: always update_topology before update_normal_tokens storage_service: handle_state_normal: delete outdated comment regarding update pending ranges race	2022-12-13 13:41:10 +02:00
Avi Kivity	75e469193b	Merge 'Use Host ID as Raft ID' from Kamil Braun Thanks to #12250, Host IDs uniquely identify nodes. We can use them as Raft IDs which simplifies the code and makes reasoning about it easier, because Host IDs are always guaranteed to be present (while Raft IDs may be missing during upgrade). Fixes: https://github.com/scylladb/scylladb/issues/12204 Closes #12275 * github.com:scylladb/scylladb: service/raft: raft_group0: take `raft::server_id` parameter in `remove_from_group0` gms, service: stop gossiping and storing RAFT_SERVER_ID Revert "gms/gossiper: fetch RAFT_SERVER_ID during shadow round" service: use HOST_ID instead of RAFT_SERVER_ID during replace service/raft: use gossiped HOST_ID instead of RAFT_SERVER_ID to update Raft address map main: use Host ID as Raft ID	2022-12-13 13:39:41 +02:00
Anna Stuchlik	7bc4385551	doc: specify the versions where Alternator TTL is no longer experimental	2022-12-13 11:25:24 +01:00
Andrii Patsula	cd2e786d72	Report a warning when a server's IP cannot be found in ping. Fixes #12156 Closes #12206	2022-12-13 11:18:59 +01:00
Botond Dénes	51f867339e	Merge 'Docs: cleanup add-node-to-cluster' from Benny Halevy This series improves the add-node-to-cluster document, in particular around the documentation for the associated cleanup procedure, and the prerequisite steps. It also removes information about outdated releases. Closes #12210 * github.com:scylladb/scylladb: docs: operating-scylla: add-node-to-cluster: deleted instructions for unsupported releases docs: operating-scylla: add-node-to-cluster: cleanup: move tips to a note docs: operating-scylla: add-node-to-cluster: improve wording of cleanup instructions docs: operating-scylla: prerequisites: system_auth is a keyspace, not a table docs: operating-scylla: prerequisites: no Authetication status is gathered docs: operating-scylla: prerequisites: simplify grep commands docs: operating-scylla: add-node-to-cluster: prerequisites: number sub-sections docs: operating-scylla: add-node-to-cluster: describe other nodes in plural	2022-12-13 10:54:05 +02:00
Botond Dénes	4122854ae7	Merge 'repair: coroutinize repair_range' from Avi Kivity Nicer and simpler, but essentially cosmetic. Closes #12235 * github.com:scylladb/scylladb: repair: reindent repair_range repair: coroutinize repair_range	2022-12-13 08:16:05 +02:00
Avi Kivity	96890d4120	repair: to_repair_rows_list: reindent	2022-12-12 22:54:07 +02:00
Avi Kivity	e482cb1764	repair: to_repair_rows_list: coroutinize Simplifying a complicated function. It will also be a little faster due to fewer allocations, but not significantly.	2022-12-12 22:52:12 +02:00
Avi Kivity	c728de8533	sstables: update_info_for_opened_data: reindent Recover much-needed indent levels for future use.	2022-12-12 22:38:07 +02:00
Avi Kivity	eace9a226c	sstables: update_info_for_opened_data: coroutinize Nothing special, just simplifying a complicated function.	2022-12-12 22:35:46 +02:00
Michał Jadwiszczak	5985f22841	version: Reverse version increase Revert version change made by PR #11106, which increased it to `4.0.0` to enable server-side describe on latest cqlsh. Turns out that our tooling depends on it in some way (e.g. `sstableloader`) and it breaks dtests. Reverting only the version allows us to leave the describe code unchanged and it fixes the dtests. cqlsh 6.0.0 will return a warning when running `DESC ...` commands. Closes #12272	2022-12-12 18:45:32 +02:00
Kamil Braun	a26f62b37b	service/raft: raft_group0: take `raft::server_id` parameter in `remove_from_group0` We no longer need to translate from IP to Raft ID using the address map, because Raft ID is now equal to the Host ID - which is always available at the call site of `remove_from_group0`.	2022-12-12 15:23:05 +01:00
Kamil Braun	bf6679906f	gms, service: stop gossiping and storing RAFT_SERVER_ID It is equal to (if present) HOST_ID and no longer used for anything. The application state was only gossiped if `experimental-features` contained `raft`, so we can free this slot. Similarly, `raft_server_id`s were only persisted in `system.peers` if the `SUPPORTS_RAFT` cluster feature was enabled, which happened only when `experimental-features` contained `raft`. The `raft_server_id` field in the schema was also introduced recently in `master` and didn't get to be in a release yet. Given either of these reasons, we can remove this field safely.	2022-12-12 15:20:30 +01:00
Kamil Braun	5dbe236339	Revert "gms/gossiper: fetch RAFT_SERVER_ID during shadow round" This reverts commit `60217d7f50`. We no longer need RAFT_SERVER_ID.	2022-12-12 15:20:20 +01:00
Kamil Braun	3e58da0719	service: use HOST_ID instead of RAFT_SERVER_ID during replace Makes the code simpler because we can assume that HOST_ID is always there.	2022-12-12 15:18:56 +01:00
Kamil Braun	32c56920b4	service/raft: use gossiped HOST_ID instead of RAFT_SERVER_ID to update Raft address map With the earlier commit, if gossiped RAFT_SERVER_ID is not empty then it's the same as HOST_ID.	2022-12-12 15:16:56 +01:00
Calle Wilund	e99626dc10	config: Change wording of "none" in encryption options to maybe reduce user confusion Fixes /scylladb/scylla-enterprise/issues#1262 Changes the somewhat ambiguous "none" into "not set" to clarify that "none" is not an option to be written out, but an absence of a choice (in which case you also have made a choice). Closes #12270	2022-12-12 16:14:53 +02:00
Kamil Braun	f3243ff674	main: use Host ID as Raft ID The Host ID now uniquely identifies a node (we no longer steal it during node replace) and Raft is still experimental. We can reuse the Host ID of a node as its Raft ID. This will allow us to remove and simplify a lot of code. With this we can already remove some dead code in this commit.	2022-12-12 15:14:51 +01:00
Botond Dénes	d44c5f5548	scripts: add open-coredump.sh Script for "one-click" opening of coredumps. It extracts the build-id from the coredump, retrieves metadata for that build, downloads the binary package, the source code and finally launches the dbuild container, with everything ready to load the coredump. The script is idempotent: running it after the preparatory steps will re-use what is already downloaded. The script is not trying to provide a debugging environment that caters to all the different ways and preferences of debugging. Instead, it just sets up a minimalistic environment for debugging, while providing opportunities for the user to customize it according to their preferences. I'm not entirely sure coredumps from the master branch will work, but we can address this later when we confirm they don't. Example: $ ~/ScyllaDB/scylla/worktree0/scripts/open-coredump.sh ./core.scylla.113.bac3650b616f4f09a4d1ab160574b6a5.4349.1669185225000000000000 Build id: 5009658b834aaf68970135bfc84f964b66ea4dee Matching build is scylla-5.0.5 0.20221009.5a97a1060 release-x86_64 Downloading relocatable package from http://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-5.0/scylla-x86_64-package-5.0.5.0.20221009.5a97a1060.tar.gz Extracting package scylla-x86_64-package-5.0.5.0.20221009.5a97a1060.tar.gz Cloning scylla.git Downloading scylla-gdb.py Copying scylla-gdb.py from /home/bdenes/ScyllaDB/storage/11961/open-coredump.sh.dir/scylla.repo Launching dbuild container. To examine the coredump with gdb: $ gdb -x scylla-gdb.py -ex 'set directories /src/scylla' --core ./core.scylla.113.bac3650b616f4f09a4d1ab160574b6a5.4349.1669185225000000000000 /opt/scylladb/libexec/scylla See https://github.com/scylladb/scylladb/blob/master/docs/dev/debugging.md for more information on how to debug scylla. Good luck! [root@fedora workdir]# Closes #12223	2022-12-12 12:55:28 +02:00
Kamil Braun	dcba652013	Merge 'replacenode: do not inherit host_id' from Benny Halevy We want to always be able to distinguish between the replacing node and the replacee by using different, unique, host identifiers. This will allow us to use the host_id authoritatively to identify the node (rather than its endpoint ip address) for token mapping and node operations. Also, it will be used in the following patch to never allow the replaced node to rejoin the cluster, as its host_id should never be reused. This change does not affect #5523, the replaced node may still steal back its tokens if restarted. Refs #9839 Refs #12040 Closes #12250 * github.com:scylladb/scylladb: docs: replace-dead-node: update host_id of replacing node docs: replace-dead-node: fix alignment db: system_keyspace: change set_local_host_id to private set_local_random_host_id storage_service: do not inherit the host_id of a replaced a node	2022-12-12 11:00:42 +01:00
Benny Halevy	c6f05b30e1	task_manager: task: impl: add virtual destructor The generic task holds and destroys a task::impl but we want the derived class's destructor to be called when the task is destroyed otherwise, for example, members like the abort_source subscription will not be destroyed (and auto-unlinked). Fixes #12183 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12266	2022-12-11 22:10:59 +02:00
Benny Halevy	36a9f62833	repair: repair_module: use mutable capture for func It is moved into the async thread so the encapsulating function should be defined mutable to move the func rather than copying it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12267	2022-12-11 22:10:28 +02:00
Nadav Har'El	0c26032e70	test/cql-pytest: translate more Cassandra tests This patch includes a translation of two more test files from Cassandra's CQL unit test directory cql3/validation/operations. All tests included here pass on Cassandra. Several tests fail on Scylla and are marked "xfail". These failures discovered two previously-unknown bugs: #12243: Setting USING TTL of "null" should be allowed #12247: Better error reporting for oversized keys during INSERT And also added reproducers for two previously-known bugs: #3882: Support "ALTER TABLE DROP COMPACT STORAGE" #6447: TTL unexpected behavior when setting to 0 on a table with default_time_to_live Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12248	2022-12-11 21:42:57 +02:00
Nadav Har'El	09a3c63345	cross-tree: allow std::source_location in clang 14 We recently (commit `6a5d9ff261`) started to use std::source_location instead of std::experimental::source_location. However, this does not work on clang 14, because libc++ 12's <source_location> only works if __builtin_source_location is available, and that is not available on clang 14. clang 15 is just three months old, and several relatively-recent distributions still carry clang 14 so it would be nice to support it as well. So this patch adds a trivial compatibility header file, which, when included and compiled with clang 14, aliases the functional std::experimental::source_location to std::source_location. It turns out it's enough to include the new header file from three headers that included <source_location> - I guess all other uses of source_location depend on those header files directly or indirectly. We may later need to include the compatibility header file in additional places, but for now we don't. Refs #12259 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12265	2022-12-11 20:28:49 +02:00
Avi Kivity	e6ffc22053	Merge 'cql3: Server-side DESC statement' from Michał Jadwiszczak This PR adds server-side `DESCRIBE` statement, which is required in latest cqlsh version. The only change from the user perspective is the `DESC ...` statement can be used with cqlsh version >= 6.0. Previously the statement was executed from client side, but starting with Cassandra 4.0 and cqlsh 6.0, execution of describe was moved to server side, so the user was unable to do `DESC ...` with Scylla and cqlsh 6.0. Implemented describe statements: - `DESC CLUSTER` - `DESC [FULL] SCHEMA` - `DESC [ONLY] KEYSPACE` - `DESC KEYSPACES/TYPES/FUNCTIONS/AGGREGATES/TABLES` - `DESC TYPE/FUNCTION/AGGREGATE/MATERIALIZED VIEW/INDEX/TABLE` - `DESC` [Cassandra's implementation for reference](https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/statements/DescribeStatement.java) Changes in this patch: - cql3::util: added `single_quote()` function - added `data_dictionary::keyspace_element` interface - implemented `data_dictionary::keyspace_element` for: - keyspace_metadata, - UDT, UDF, UDA - schema - cql3::functions: added `get_user_functions()` and `get_user_aggregates()` to get all UDFs/UDAs in specified keyspace - data_dictionary::user_types_metadata: added `has_type()` function - extracted `describe_ring()` from storage_service to standalone helper function in `locator/util.hh` - storage_proxy: added `describe_ring()` (implemented using helper function mentioned above) - extended CQL grammar to handle describe statement - increased version in `version.hh` to 4.0.0, so cqlsh will use server-side describe statement Referring: https://github.com/scylladb/scylla/issues/9571, https://github.com/scylladb/scylladb/issues/11475 Closes #11106 * github.com:scylladb/scylladb: version: Increasing version cql-pytest: Add tests for server-side describe statement cql-pytest: creating random elements for describe's tests cql3: Extend CQL grammar with server-side describe statement 
cql3:statements: server-side describe statement data_dictonary: add `get_all_keyspaces()` and `get_user_keyspaces()` storage_proxy: add `describe_ring()` method storage_service, locator: extract describe_ring() data_dictionary:user_types_metadata: add has_type() function cql3:functions: `get_user_functions()` and `get_user_aggregates()` implement `keyspace_element` interface data_dictionary: add `keyspace_element` interface cql3: single_quote() util function view: row_lock: lock_ck: reindent test/topology: enable replace tests service/raft: report an error when Raft ID can't be found in `raft_group0::remove_from_group0` service: handle replace correctly with Raft enabled gms/gossiper: fetch RAFT_SERVER_ID during shadow round service: storage_service: sleep 2*ring_delay instead of BROADCAST_INTERVAL before replace	2022-12-11 18:29:36 +02:00
Michał Jadwiszczak	8d88c9721e	version: Increasing version The `current()` version in version.hh has to be increased to at least 4.0.0, so that server-side describe will be used. Otherwise, cqlsh returns a warning that client-side describe is not supported.	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	3ddde7c5ad	cql-pytest: Add tests for server-side describe statement	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	f91d05df43	cql-pytest: creating random elements for describe's tests Add helper functions to create random elements (keyspaces, tables, types) to increase the coverage of the describe statement's tests. This commit also adds a `random_seed` fixture. The fixture should always be used when using random functions. If a test fails, the seed will be present in the test's signature and the case can be easily recreated. After the test finishes, the fixture restores the state of `random` to its before-test state.	2022-12-10 12:51:05 +01:00
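The save, seed, and restore behaviour described for the `random_seed` fixture can be sketched as follows. This is a minimal illustration, not the code from the commit: it uses a plain context manager rather than a pytest fixture so that it runs standalone, and the name `seeded_random` and the time-based default seed are assumptions.

```python
import random
import time
from contextlib import contextmanager


@contextmanager
def seeded_random(seed=None):
    """Seed the `random` module for the duration of a block.

    Hypothetical helper: the real fixture lives in cql-pytest and may differ.
    """
    saved_state = random.getstate()      # remember the pre-test state
    if seed is None:
        seed = int(time.time())          # assumption: any reproducible integer works
    random.seed(seed)
    try:
        yield seed                       # the caller can log or report the seed
    finally:
        random.setstate(saved_state)     # restore the before-test state


# Re-running with the same seed reproduces the same "random" values.
with seeded_random(1234) as seed:
    first = [random.randint(0, 100) for _ in range(3)]
with seeded_random(seed):
    second = [random.randint(0, 100) for _ in range(3)]
```

The generator body could be wrapped in `@pytest.fixture` with little change, since pytest fixtures use the same yield-then-clean-up shape.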
Michał Jadwiszczak	c563b2133c	cql3: Extend CQL grammar with server-side describe statement	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	e572d5f111	cql3:statements: server-side describe statement Starting from cqlsh 6.0.0, execution of the describe statement was moved from the client to the server. This patch implements server-side describe statement. It's done by simply fetching all needed keyspace elements (keyspace/table/index/view/UDT/UDF/UDA) and generating the desired description or list of names of all elements. The description of any element has to respect CQL restrictions (like name quoting) so that the schema can be quickly recreated by simply copy-pasting the description.	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	673393d88a	data_dictonary: add `get_all_keyspaces()` and `get_user_keyspaces()` Adds functions to `data_dictionary::database` in order to obtain names of all keyspaces/all user keyspaces.	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	360dbf98f1	storage_proxy: add `describe_ring()` method In order to execute `DESC CLUSTER`, there has to be a way to describe the ring. `storage_service` is not available at query execution. This patch adds `describe_ring()` as a method of `storage_proxy` (using the helper function from `locator/util.hh`).	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	dd46a92e23	storage_service, locator: extract describe_ring() `describe_ring()` was implemented as a method of `storage_service`. This patch extracts it from there to a standalone helper function in `locator/util.hh`.	2022-12-10 12:51:05 +01:00
Michał Jadwiszczak	51a02e3bd7	data_dictionary:user_types_metadata: add has_type() function Adds `has_type()` function to `user_types_metadata`. The function determines whether a UDT with the given name exists.	2022-12-10 12:50:52 +01:00
Michał Jadwiszczak	06cd03d3cd	cql3:functions: `get_user_functions()` and `get_user_aggregates()` Helper functions to obtain UDFs/UDAs for certain keyspace.	2022-12-10 12:36:59 +01:00
Michał Jadwiszczak	29ad5a08a8	implement `keyspace_element` interface This patch implements the `data_dictionary::keyspace_element` interface in: `keyspace_metadata`, `user_type_impl`, `user_function`, `user_aggregate` and schema.	2022-12-10 12:34:09 +01:00
Michał Jadwiszczak	f30378819d	data_dictionary: add `keyspace_element` interface A common interface for all keyspace elements, which are: keyspace, UDT, UDF, UDA, tables, views, indexes. The interface provides a unified way to describe those elements.	2022-12-10 12:27:38 +01:00
Michał Jadwiszczak	0589116991	cql3: single_quote() util function `single_quote()` takes a string and transforms it to a string which can be safely used in CQL commands. Single quoting involves wrapping the name in single-quotes ('). A single-quote character itself is quoted by doubling it. Single quoting is necessary for dates, IP addresses or string literals.	2022-12-10 12:27:22 +01:00
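The quoting rule described for `single_quote()` (wrap in single quotes, double any embedded single quote) can be illustrated with a short Python sketch. The actual function is C++ in `cql3::util`; this version only mirrors the rule stated in the commit message and is not its implementation.

```python
def single_quote(text: str) -> str:
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


# Values of the kind the commit message mentions: IP addresses, dates, string literals.
for value in ["127.0.0.1", "2022-12-10", "it's"]:
    print(single_quote(value))
```

Doubling rather than backslash-escaping is the rule the commit message states, so a quoted value can be pasted back into a CQL statement unchanged.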
Benny Halevy	9c2a5a755f	view: row_lock: lock_ck: reindent Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-10 12:27:22 +01:00
Kamil Braun	c43e64946a	test/topology: enable replace tests Also add some TODOs for enhancing existing tests.	2022-12-10 12:27:22 +01:00
Kamil Braun	b01cba8206	service/raft: report an error when Raft ID can't be found in `raft_group0::remove_from_group0` Also simplify the code and improve logging in general. The previous code did this: search for the ID in the address map. If it couldn't be found, perform a read barrier and search again. If it again couldn't be found, return. This algorithm depended on the fact that IP addresses were stored in group 0 configuration. The read barrier was used to obtain the most recent configuration, and if the IP was not a part of address map after the read barrier, that meant it's simply not a member of group 0. This logic no longer applies so we can simplify the code. Furthermore, when I was fixing the replace operation with Raft enabled, at some point I had a "working" solution with all tests passing. But I was suspicious and checked if the replaced node got removed from group 0. It wasn't. So the replace finished "successfully", but we had an additional (voting!) member of group 0 which didn't correspond to a token ring member. The last version of my fixes ensures that the node gets removed by the replacing node. But the system is fragile and nothing prevents us from breaking this again. At least log an error for now. Regression tests will be added later.	2022-12-10 12:27:22 +01:00
Kamil Braun	c65f4ae875	service: handle replace correctly with Raft enabled We must place the Raft ID obtained during the shadow round in the address map. It won't be placed by the regular gossiping route if we're replacing using the same IP, because we override the application state of the replaced node. Even if we replace a node with a different IP, it is not guaranteed that background gossiping manages to update the address map before we need it, especially in tests where we set ring_delay to 0 and disable wait_for_gossip_to_settle. The shadow round, on the other hand, performs a synchronous request (and if it fails during bootstrap, bootstrap will fail - because we also won't be able to obtain the tokens and Host ID of the replaced node). Fetch the Raft ID of the replaced node in `prepare_replacement_info`, which runs the shadow round. Return it in `replacement_info`. Then `join_token_ring` passes it to `setup_group0`, which stores it in the address map. It does that after `join_group0` so the entry is non-expiring (the replaced node is a member of group 0). Later in the replace procedure, we call `remove_from_group0` for the replaced node. `remove_from_group0` will be able to reverse-translate the IP of the replaced node to its Raft ID using the address map.	2022-12-10 12:27:22 +01:00
Kamil Braun	60217d7f50	gms/gossiper: fetch RAFT_SERVER_ID during shadow round During the replace operation we need the Raft ID of the replaced node. The shadow round is used for fetching all necessary information before the replace operation starts.	2022-12-10 12:27:22 +01:00
Kamil Braun	b424cc40fa	service: storage_service: sleep 2*ring_delay instead of BROADCAST_INTERVAL before replace Most of the sleeps related to gossiping are based on `ring_delay`, which is configurable and can be set to a lower value e.g. during tests. But for some reason there was one case where we slept for a hardcoded value, `service::load_broadcaster::BROADCAST_INTERVAL` - 60 seconds. Use `2 * get_ring_delay()` instead. With the default value of `ring_delay` (30 seconds) this will give the same behavior.	2022-12-10 12:27:22 +01:00
Anna Stuchlik	8d1050e834	docs: replace Scylla with ScyllaDB on the Snitches page	2022-12-09 13:34:18 +01:00
Anna Stuchlik	5cb191d5b0	docs: fix the headings on the Snitches page	2022-12-09 13:26:36 +01:00
Anna Stuchlik	a699904374	doc: add the description of AzureSnitch to the documentation	2022-12-09 13:22:01 +01:00
Nadav Har'El	e47794ed98	test/cql-pytest: regression test for index scan with start token When we have a table with partition key p and an indexed regular column v, the test included in this patch checks the query SELECT p FROM table WHERE v = 1 AND TOKEN(p) > 17 This can work and not require ALLOW FILTERING, because the secondary index posting-list of "v=1" is ordered in p's token order (to allow SELECT with and without an index to return the same order - this is explained in issue #7443). So this test should pass, and indeed it does on both current Scylla, and Cassandra. However, it turns out that this was a bug - issue #7043 - in older versions of Scylla, and only fixed in Scylla 4.6. In older versions, the SELECT wasn't accepted, claiming it requires ALLOW FILTERING, and if ALLOW FILTERING was added, the TOKEN(p) > 17 part was silently ignored. The fix for issue #7043 actually included regression tests, C++ tests in test/boost/secondary_index_test.cc. But in this patch we also add a Python test in test/cql-pytest. One of the benefits of cql-pytest is that we can (and I did) run the same test on Cassandra to verify we're not implementing a wrong feature. Another benefit is that we can run a new test on an old version, and not even require re-compilation: You can run this new test on any existing installation of Scylla to check if it still has issue #7043. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12237	2022-12-09 09:33:16 +02:00
Benny Halevy	018dedcc0c	docs: replace-dead-node: update host_id of replacing node The replacing node no longer assumes the host_id of the replacee. It will continue to use a random, unique host_id. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-09 08:23:31 +02:00
Benny Halevy	37d75e5a21	docs: replace-dead-node: fix alignment	2022-12-09 08:23:31 +02:00
Benny Halevy	89920d47d6	db: system_keyspace: change set_local_host_id to private set_local_random_host_id Now that the local host_id is never changed externally (by the storage_service upon replace-node), the method can be made private and be used only for initializing the local host_id to a random one. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-09 08:23:31 +02:00
Benny Halevy	9942c60d93	storage_service: do not inherit the host_id of a replaced node We want to always be able to distinguish between the replacing node and the replacee by using different, unique, host identifiers. This will allow us to use the host_id authoritatively to identify the node (rather than its endpoint ip address) for token mapping and node operations. Also, it will be used in the following patch to never allow the replaced node to rejoin the cluster, as its host_id should never be reused. Refs #9839 Refs #12040 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-09 08:23:31 +02:00
Pavel Emelyanov	7197757750	broadcast_tables: Forward-declare storage_proxy in lang.hh Currently the header includes storage_proxy.hh and spreads this over the code via raft_group0_client.hh -> group0_state_machine.hh -> lang.hh Forward-declaring the proxy class eliminates ~100 indirect dependencies on storage_proxy.hh via this chain. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12241	2022-12-09 01:23:51 +02:00
Pavel Emelyanov	6075e01312	test/lib: Remove sstable_utils.hh from simple_schema.hh The latter is a pretty popular test/lib header that disseminates the former one over a whole lot of unit tests. The former, in turn, naturally includes sstables.hh, thus making tons of unrelated tests depend on the sstables class unused by them. However, simple removal doesn't work, because of the local_shard_only bool class definition in sstable_utils.hh used in simple_schema.hh. This thing, in turn, is used in key-making helpers that don't belong to sstable utils, so these are moved into simple_schema as well. When done, this affects mutation_source_test.hh, which needs the local_shard_only bool class (and helps spread sstables.hh throughout more unrelated tests) and a bunch of .cc test sources that used sstable_utils.hh to indirectly include various headers they need. After patching, sstables.hh is pulled in by half as many tests. As a side effect, half as many tests depend on sstables_manager.hh. Continuation of `9bdea110a6` Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12240	2022-12-08 15:37:33 +02:00
Tomasz Grabiec	4e7ddb6309	position_in_partition: Introduce before_key(position_in_partition_view)	2022-12-08 13:41:28 +01:00
Tomasz Grabiec	536c0ab194	db: Fix trim_clustering_row_ranges_to() for non-full keys and reverse order trim_clustering_row_ranges_to() is broken for non-full keys in reverse mode. It will trim the range to position_in_partition_view::after_key(full_key) instead of position_in_partition_view::before_key(key), hence it will include the key in the resulting range rather than exclude it. Fixes #12180 Refs #1446	2022-12-08 13:41:28 +01:00
Tomasz Grabiec	232ce699ab	types: Fix comparison of frozen sets with empty values A frozen set can be part of the clustering key, and with compact storage, the corresponding key component can have an empty value. Comparison was not prepared for this, the iterator attempts to deserialize the item count and will fail if the value is empty. Fixes #12242	2022-12-08 13:41:11 +01:00
Nadav Har'El	4cdaba778d	Merge 'Secondary indexes on static columns' from Piotr Dulikowski This pull request introduces support for global secondary indexes based on static columns. Local secondary indexes based on static columns are not planned to be supported and are explicitly forbidden. Because there is only one static row per partition and local indexes require the full partition key when querying, such indexes wouldn't be very useful and would only waste resources. The index table for secondary indexes on static columns, unlike other secondary indexes, does not contain clustering keys from the base table. A static column's value determines a set of full partitions, so the clustering keys would only be unnecessary. The already existing logic for querying using secondary indexes works after introducing minimal modifications. The view update generation path now works on a common representation of static and clustering rows, but the new representation allowed keeping most of the logic intact. New cql-pytests are added. All but one of the existing tests for secondary indexes on static columns - ported from Cassandra - now work and have their `xfail` marks lifted; the remaining test requires support for collection indexing, so it will start working only after #2962 is fixed. Materialized views with static rows as a key are __not__ implemented in this PR. 
Fixes: #2963 Closes #11166 * github.com:scylladb/scylladb: test_materialized_view: verify that static columns are not allowed test_secondary_index: add (currently failing) test for static index paging test_secondary_index: add more tests for secondary indexes on static columns cassandra_tests: enable existing tests for static columns create_index_statement: lift restriction on secondary indexes on static rows db/view: fetch and process static rows when building indexes gms/feature_service: introduce SECONDARY_INDEXES_ON_STATIC_COLUMNS cluster feature create_index_statement: disallow creation of local indexes with static columns select_statement: prepare paging for indexes on static columns select_statement: do not attempt to fetch clustering columns from secondary index's table secondary_index_manager: don't add clustering key columns to index table of static column index replica/table: adjust the view read-before-write to return static rows when needed db/view: process static rows in view_update_builder::on_results db/view: adjust existing view update generation path to use clustering_or_static_row column_computation: adjust to use clustering_or_static_row db/view: add clustering_or_static_row deletable_row: add column_kind parameter to is_live view_info: adjust view_column to accept column_kind db/view: base_dependent_view_info: split non-pk columns into regular and static	2022-12-08 09:54:05 +02:00
Konstantin Osipov	02c30ab5d6	build: fix link error (abseil) on ubuntu toolchain with clang 15 abseil::hash depends on abseil::city and declares CityHash32 as an external symbol. The city static library, however, precedes hash in the link list, which apparently makes the linker simply drop it from the object list, since its symbols are not used elsewhere. Fix the linker ordering to help the linker see that CityHash32 is used. Closes #12231	2022-12-08 09:47:16 +02:00
Avi Kivity	d6457778f1	Merge 'Coroutinize some table functions in preparation to static compaction groups' from Raphael "Raph" Carvalho Extracted from https://github.com/scylladb/scylladb/pull/12139 Closes #12236 * github.com:scylladb/scylladb: replica: table: Fix indentation replica: coroutinize table::discard_sstables() replica: Coroutinize table::flush()	2022-12-08 09:29:58 +02:00
Piotr Dulikowski	4883e43677	test_materialized_view: verify that static columns are not allowed Adds a test which verifies that static columns are not allowed in materialized views. Although we added support for static columns in secondary indexes, which share a lot of code with materialized views, static columns in materialized views are not yet ready to use.	2022-12-08 07:41:33 +01:00
Piotr Dulikowski	f864944dcb	test_secondary_index: add (currently failing) test for static index paging Currently, when executing queries accelerated by an index on a static column, paging is unable to break base table partitions across pages and is forced to return them in whole. This will cause problems if such a query must return a very large base table partition because it will have to be loaded into memory. Fixing this issue will require a more sophisticated approach than what was done in the PR. For the time being, an xfailing pytest is added which should start passing after paging is improved.	2022-12-08 07:41:33 +01:00
Piotr Dulikowski	4f836115fd	test_secondary_index: add more tests for secondary indexes on static columns Adds cql-pytests which test the secondary index on static columns feature.	2022-12-08 07:41:32 +01:00
Botond Dénes	897b501ba3	Merge 'doc: update the 5.1 upgrade guide with the mode-related information' from Anna Stuchlik This PR adds the link to the KB article about updating the mode after the upgrade to the 5.1 upgrade guide. In addition, I have: - updated the KB article to include the versions affected by that change. - fixed the broken link to the page about metric updates (it is not related to the KB article, but I fixed it in the same PR to limit the number of PRs that need to be backported). Related: https://github.com/scylladb/scylladb/pull/11122 Closes #12148 * github.com:scylladb/scylladb: doc: update the releases in the KB about updating the mode after upgrade doc: fix the broken link in the 5.1 upgrade guide doc: add the link to the 5.1-related KB article to the 5.1 upgrade guide	2022-12-08 07:32:10 +02:00
Tomasz Grabiec	992a73a861	row_cache: Destroy coroutine under region's allocator The reason is an alloc-dealloc mismatch of position_in_partition objects allocated by cursors inside the coroutine object stored in the update variable in row_cache::do_update(). It is allocated under the cache region, but in case of exception it will be destroyed under the standard allocator. If the update is successful, it will be cleared under the region allocator, so there is no problem in the normal case. Fixes #12068 Closes #12233	2022-12-07 21:44:21 +02:00
Raphael S. Carvalho	9ae0d8ba28	replica: table: Fix indentation Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-07 15:53:22 -03:00
Raphael S. Carvalho	b9a33d5a91	replica: coroutinize table::discard_sstables() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-07 15:52:36 -03:00
Raphael S. Carvalho	192b64a5ac	replica: Coroutinize table::flush() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-12-07 15:52:27 -03:00
Benny Halevy	a076ceef97	view: row_lock: lock_ck: reindent Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 19:27:30 +02:00
Avi Kivity	909fbfdd2f	repair: reindent repair_range	2022-12-07 18:17:21 +02:00
Avi Kivity	796ec5996f	repair: coroutinize repair_range	2022-12-07 18:13:10 +02:00
Benny Halevy	78c5961114	docs: operating-scylla: add-node-to-cluster: deleted instructions for unsupported releases 2.3 and 2018.1 ended their life and are long gone. No need to have instructions for them in the master version of this document. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:07:35 +02:00
Benny Halevy	adeb03e60f	docs: operating-scylla: add-node-to-cluster: cleanup: move tips to a note And be more verbose about why the tips are recommended and their ramifications. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:07:18 +02:00
Benny Halevy	6e324137bd	docs: operating-scylla: add-node-to-cluster: improve wording of cleanup instructions "use `nodetool cleanup` cleanup command" repeats words, change to "run the `nodetool cleanup` command". Also, improve the description of the cleanup action and how it relates to the bootstrapping process. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:07:08 +02:00
Benny Halevy	eeed330647	docs: operating-scylla: prerequisites: system_auth is a keyspace, not a table Fix the phrase referring to it as a table accordingly. Also, do some minor phrasing touch-ups in this area. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:06:54 +02:00
Benny Halevy	5d840d4232	docs: operating-scylla: prerequisites: no Authentication status is gathered Authentication status isn't gathered from scylla.yaml, only the authenticator, so change the caption accordingly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:06:48 +02:00
Benny Halevy	9cb7056d3e	docs: operating-scylla: prerequisites: simplify grep commands Writing `cat X | grep Y` is both inefficient and somewhat unprofessional. The grep command works very well on a file argument so `grep Y X` will do the job perfectly without the need for a pipe. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:06:36 +02:00
Benny Halevy	71bc12eecc	docs: operating-scylla: add-node-to-cluster: prerequisites: number sub-sections To improve their readability. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:06:35 +02:00
Benny Halevy	16db7bea82	docs: operating-scylla: add-node-to-cluster: describe other nodes in plural Typically data will be streamed from multiple existing nodes to the new node, not from a single one. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-12-07 17:03:23 +02:00
Tomasz Grabiec	a46b2e4e4c	Merge 'Make node replace procedure work with Raft' from Kamil Braun We need to obtain the Raft ID of the replaced node during the shadow round and place it in the address map. It won't be placed by the regular gossiping route if we're replacing using the same IP, because we override the application state of the replaced node. Even if we replace a node with a different IP, it is not guaranteed that background gossiping manages to update the address map before we need it, especially in tests where we set ring_delay to 0 and disable wait_for_gossip_to_settle. The shadow round, on the other hand, performs a synchronous request (and if it fails during bootstrap, bootstrap will fail - because we also won't be able to obtain the tokens and Host ID of the replaced node). Fetch the Raft ID of the replaced node in `prepare_replacement_info`, which runs the shadow round. Return it in `replacement_info`. Then `join_token_ring` passes it to `setup_group0`, which stores it in the address map. It does that after `join_group0` so the entry is non-expiring (the replaced node is a member of group 0). Later in the replace procedure, we call `remove_from_group0` for the replaced node. `remove_from_group0` will be able to reverse-translate the IP of the replaced node to its Raft ID using the address map. Also remove an unconditional 60 seconds sleep from the replace code. Make it dependent on ring_delay. Enable the replace tests. Modify some code related to removing servers from group 0 which depended on storing IP addresses in the group 0 configuration. Closes #12172 * github.com:scylladb/scylladb: test/topology: enable replace tests service/raft: report an error when Raft ID can't be found in `raft_group0::remove_from_group0` service: handle replace correctly with Raft enabled gms/gossiper: fetch RAFT_SERVER_ID during shadow round service: storage_service: sleep 2*ring_delay instead of BROADCAST_INTERVAL before replace	2022-12-07 15:30:27 +01:00
Pavel Emelyanov	9bdea110a6	code: Reduce fanout of sstables(_manager)?.hh over headers This change removes sstables.hh from some other headers replacing it with version.hh and shared_sstable.hh. Also this drops sstables_manager.hh from some more headers, because this header propagates sstables.hh via self. That change is pretty straightforward, but has a ricochet in database.hh that needs disk-error-handler.hh. Without the patch, touching sstables/sstable.hh results in recompilation of 409 targets, with the patch -- 299 targets. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12222	2022-12-07 14:34:19 +02:00
Botond Dénes	57a4971962	Merge 'dirty_memory_manager: tidy up' from Avi Kivity Tidy up namespaces, move code to the right file, and move the whole thing to the replica module where it belongs. Closes #12219 * github.com:scylladb/scylladb: dirty_memory_manager: move implementation from database.cc dirty_memory_manager: move to replica module test: dirty_memory_manager_test: disambiguate classes named 'test_region_group' dirty_memory_manager: stop using using namespace	2022-12-07 14:25:59 +02:00
Avi Kivity	f7f5700289	dirty_memory_manager: move implementation from database.cc A few leftover method implementations were left in database.cc when dirty_memory_manager.cc was created, move them to their correct place now.	2022-12-06 22:28:54 +02:00
Avi Kivity	444de2831e	dirty_memory_manager: move to replica module It's a replica-side thing, so move it there. The related flush_permit and sstable_write_permit are moved alongside.	2022-12-06 22:24:17 +02:00
Avi Kivity	a038a35ad6	test: dirty_memory_manager_test: disambiguate classes named 'test_region_group' There are two similarly named classes: ::test_region_group and dirty_memory_manager_logalloc::test_region_group. Rename the former to ::raii_region_group (that's what it's for) and the latter to ::test_region_group, to reduce confusion.	2022-12-06 22:20:38 +02:00
Avi Kivity	dfdae5ffa9	dirty_memory_manager: stop using using namespace `using namespace` is pretty bad, especially in a header, as it pollutes the namespace for everyone. Stop using it and qualify names instead.	2022-12-06 21:37:38 +02:00
Avi Kivity	47a8fad2a2	Merge 'scylla-types: add serialize action' from Botond Dénes Serializes the value that is an instance of a type. The opposite of `deserialize` (previously known as `print`). All other actions operate on serialized values, yet up to now we were missing a way to go from human readable values to serialized ones. This prevented for example using `scylla types tokenof $pk` if one only had the human readable key value. Example: ``` $ scylla types serialize -t Int32Type -- -1286905132 b34b62d4 $ scylla types serialize --prefix-compound -t TimeUUIDType -t Int32Type -- d0081989-6f6b-11ea-0000-0000001c571b 16 0010d00819896f6b11ea00000000001c571b000400000010 $ scylla types serialize --prefix-compound -t TimeUUIDType -t Int32Type -- d0081989-6f6b-11ea-0000-0000001c571b 0010d00819896f6b11ea00000000001c571b ``` Closes #12029 * github.com:scylladb/scylladb: docs: scylla-types.rst: add mention of per-operation --help tools/scylla-types: add serialize operation tools/scylla-types: prepare for action handlers with string arguments tools/scylla-types: s/print/deserialize/ operation docs: scylla-types.rst: document tokenof and shardof docs: scylla-types.rst: fix typo in compare operation description	2022-12-06 19:27:15 +02:00
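The three example outputs in the `serialize` merge message above can be reproduced with a few lines of Python. This is a sketch of the byte layout inferred only from those examples (big-endian 32-bit integers, raw 16-byte UUIDs, and a 2-byte big-endian length before each prefix-compound component); it is not derived from the `scylla types` source, and the function names are hypothetical.

```python
import struct
import uuid


def serialize_int32(value: int) -> bytes:
    """Int32Type: 4-byte big-endian two's complement (inferred from the example)."""
    return struct.pack(">i", value)


def serialize_timeuuid(text: str) -> bytes:
    """TimeUUIDType: the 16 raw UUID bytes (inferred from the example)."""
    return uuid.UUID(text).bytes


def prefix_compound(components) -> bytes:
    """Concatenate components, each preceded by its 2-byte big-endian length."""
    return b"".join(struct.pack(">H", len(c)) + c for c in components)


key = serialize_timeuuid("d0081989-6f6b-11ea-0000-0000001c571b")

print(serialize_int32(-1286905132).hex())
print(prefix_compound([key, serialize_int32(16)]).hex())
print(prefix_compound([key]).hex())
```

Under these assumptions the three printed values match the hex strings shown in the commit message, which suggests the inferred layout is at least consistent with those cases.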
Nadav Har'El	f275bfd57b	Update CODEOWNERS file Update the CODEOWNERS file with some people who joined different parts of the project, and one person that left. Note that despite its name, CODEOWNERS does not list "ownership" in any strict sense of the word - it is more about who is willing and/or knowledgeable enough to participate in reviewing changes to particular files or directories. Github uses this file to automatically suggest who should review a pull request. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12216	2022-12-06 19:26:03 +02:00
Benny Halevy	5007ded2c1	view: row_lock: lock_ck: serialize partition and row locking The problematic scenario this patch fixes might happen due to unfortunate serialization of locks/unlocks between lock_pk and lock_ck, as follows: 1. lock_pk acquires an exclusive lock on the partition. 2.a lock_ck attempts to acquire a shared lock on the partition and any lock on the row. Both cases currently use a fiber returning a future<rwlock::holder>. 2.b since the partition is locked, the lock_partition times out, returning an exceptional future. lock_row has no such problem and succeeds, returning a future holding a rwlock::holder, pointing to the row lock. 3.a the lock_holder previously returned by lock_pk is destroyed, calling `row_locker::unlock` 3.b row_locker::unlock sees that the partition is not locked and erases it, including the row locks it contains. 4.a when_all_succeeds continuation in lock_ck runs. Since the lock_partition future failed, it destroys both futures. 4.b the lock_row future is destroyed with the rwlock::holder value. 4.c ~holder attempts to return the semaphore units to the row rwlock, but the latter was already destroyed in 3.b above. Acquiring the partition lock and row lock in parallel doesn't help anything, but it complicates error handling as seen above. This patch serializes acquiring the row lock in lock_ck after locking the partition to prevent the above race. This way, erasing the unlocked partition is never expected to happen while any of its row locks is held. Fixes #12168 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12208	2022-12-06 16:29:46 +02:00
Botond Dénes	f017e9f1c6	docs: document the reader concurrency semaphore diagnostics dump The diagnostics dumped by the reader concurrency semaphore are a pretty common sight in logs as soon as a node becomes problematic. The reason is that the reader concurrency semaphore acts as the canary in the coal mine: it is the first that starts screaming when the node or workload is unhealthy. This patch adds documentation of the content of the diagnostics and how to diagnose common problems based on it. Fixes: #10471 Closes #11970	2022-12-06 16:24:44 +02:00
Botond Dénes	c35cee7e2b	docs: scylla-types.rst: add mention of per-operation --help	2022-12-06 14:47:28 +02:00
Botond Dénes	4f9799ce4f	tools/scylla-types: add serialize operation Takes human readable values and converts them to serialized hex encoded format. Only regular atomic types are supported for now, no collection/UDT/tuple support, not even in frozen form.	2022-12-06 14:46:53 +02:00
Botond Dénes	7c87655b4b	tools/scylla-types: prepare for action handlers with string arguments Currently all action handlers have bytes arguments, parsed from hexadecimal string representations. We plan on adding a serialize command which will require raw string arguments. Prepare the infrastructure for supporting both types of action handlers.	2022-12-06 14:45:30 +02:00
Botond Dénes	15452730fb	tools/scylla-types: s/print/deserialize/ operation Soon we will have a serialize operation. Rename the current print operation to deserialize in preparation to that. We want the two operations (serialize and deserialize) to reflect their relation in their names too.	2022-12-06 14:45:30 +02:00
Botond Dénes	f98e6552b4	docs: scylla-types.rst: document tokenof and shardof These new actions were added recently but without the accompanying documentation change. Make up for this now.	2022-12-06 14:45:30 +02:00
Botond Dénes	30c047cae6	docs: scylla-types.rst: fix typo in compare operation description	2022-12-06 14:45:23 +02:00
Piotr Dulikowski	680423ad9d	cassandra_tests: enable existing tests for static columns Removes the "xfail" marker from the now-passing tests related to secondary indexes on static columns.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	cc3af3190d	create_index_statement: lift restriction on secondary indexes on static rows Secondary indexes on static columns should work now. This commit lifts the existing restriction after the cluster is fully upgraded to a version which supports such indexes.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	86dad30b66	db/view: fetch and process static rows when building indexes This commit modifies the view builder and its consumer so that static rows are always fetched and properly processed during view build. Currently, the view builder will always fetch both static and clustering rows, regardless of the type of indexes being built. For indexes on static columns this is wasteful and could be improved so that only the types of rows relevant to indexes being built are fetched - however, doing this sounds a bit complicated and I would rather start with something simpler which has a better chance of working.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	25fec0acce	gms/feature_service: introduce SECONDARY_INDEXES_ON_STATIC_COLUMNS cluster feature The new feature will prevent secondary indexes on static columns from being created unless the whole cluster is ready to support them.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	9f14f0ac09	create_index_statement: disallow creation of local indexes with static columns Local indexes on static columns don't make sense because there is only one static row per partition. It's always better to just run SELECT DISTINCT on the base table. Allowing for such an index would only make such queries slower (due to double lookup), would take unnecessary space and could pose potential consistency problems, so this commit explicitly forbids them.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	8c4cdfc2db	select_statement: prepare paging for indexes on static columns When performing a query on a table which is accelerated by a secondary index, the paging state returned along with the query contains a partition key and a clustering key of the secondary index table. The logic wasn't prepared to handle the case of secondary indexes on static columns - notably, it tried to put base table's clustering key columns into the paging state which caused problems in other places. This commit fixes the paging logic so that the PK and CK of a secondary index table is calculated correctly. However, this solution has a major drawback: because it is impossible to encode clustering key of the base table in the paging state, partitions returned by queries accelerated by secondary indexes on static columns will _not_ be split by paging. This can be problematic in case there are large partitions in the base table. The main advantage of this fix is that it is simple. Moreover, the problem described above is not unique to static column indexes, but also happens e.g. in case of some indexes on clustering columns (see case 2 of scylladb/scylla#7432). Fixing this issue will require a more sophisticated solution and may affect more than only secondary indexes on static columns, so this is left for a followup.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	ba390072c5	select_statement: do not attempt to fetch clustering columns from secondary index's table The previous commit made sure that the index table for secondary indexes on static tables don't have columns corresponding to clustering rows in the base table - therefore, we must make sure that we don't try to fetch them when querying the index table.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	983b440a81	secondary_index_manager: don't add clustering key columns to index table of static column index The implementation of secondary indexes on static columns relies on the fact that the index table only includes partition key columns of the base table, but not clustering key columns. A static column's value determines a set of full partitions, so including the clustering key would only be redundant. It would also generate more work as a single static column update would require a large portion of the index to be updated. This commit makes sure that clustering columns are not included in the index table for indexes based on a static column.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	6ab41d76e6	replica/table: adjust the view read-before-write to return static rows when needed Adjusts the read-before-write query issued in `table::do_push_view_replica_updates` so that, when needed, requests static columns and makes sure that the static row is present.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	18be90b1e6	db/view: process static rows in view_update_builder::on_results The `view_update_builder::on_results()` function is changed to react to static rows when comparing read-before-write results with the base table mutation.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	2dd95d76f1	db/view: adjust existing view update generation path to use clustering_or_static_row The view update path is modified to use `clustering_or_static_row` instead of just `clustering_row`.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	b0a31bb7a7	column_computation: adjust to use clustering_or_static_row Adjusts the column_computation interface so that it is able to accept both clustering and static rows through the common db::view::clustering_or_static_row interface.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	986ab6034c	db/view: add clustering_or_static_row Adds a `clustering_or_static_row`, which is a common, immutable representation of either a static or clustering row. It will allow to handle view update generation based on static or clustering rows in a uniform way.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	05d4328f02	deletable_row: add column_kind parameter to is_live While deletable_row is used to hold regular columns of a clustering row, its name or implementation doesn't suggest that it is a requirement. In fact, some of its methods already take a column_kind parameter which is used to interpret the kind of columns held in the row. This commit removes the assumption about the column kind from the `deletable_row::is_live` method.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	27c81432cd	view_info: adjust view_column to accept column_kind The `view_info::view_column()` and `view_column` in view.cc allow to get a view's column definition which corresponds to given base table's column. They currently assume that the given column id corresponds to a regular column. In preparation for secondary indexes based on static columns, this commit adjusts those functions so that they accept other kinds of columns, including static columns.	2022-12-06 11:21:16 +01:00
Piotr Dulikowski	f7b7724eaf	db/view: base_dependent_view_info: split non-pk columns into regular and static Currently, `base_dependent_view_info::_base_non_pk_columns_in_view_pk` field keeps a list of non-primary-key columns from the base table which are a part of the view's primary key. Because the current code does not allow indexes on static columns yet, the columns kept in the aforementioned field are always assumed to be regular columns of the base table and are kept as `column_id`s which do not contain information about the column kind. This commit splits the `_base_non_pk_columns_in_view_pk` field into two, one for regular columns and the other for static columns, so that it is possible to keep both kinds of columns in `base_dependent_view_info` and the structure can be used for secondary indexes on static columns.	2022-12-06 11:21:16 +01:00
Botond Dénes	681bd62424	Update tools/java submodule * tools/java ecab7cf7d6...1c4e1e7a7d (2): > Merge "Cqlsh serverless v2" from Karol Baryla > Update Java Driver version to 3.11.2.4	2022-12-06 09:06:09 +02:00
Botond Dénes	6a1dbffaaa	Merge 'compaction_manager: coroutinize postponed_compactions_reevaluation' from Avi Kivity Three lambdas were removed, simplifying the code. Closes #12207 * github.com:scylladb/scylladb: compaction_manager: reindent postponed_compactions_reevaluation() compaction_manager: coroutinize postponed_compactions_reevaluation() compaction_manager: make postponed_compactions_reevaluation() return a future	2022-12-06 08:08:36 +02:00
Avi Kivity	2339a3fa06	database: remove continuation for updating statistics update_write_metrics() is a continuation added solely for updating statistics. Fold it into do_update to reduce an allocation in the write path. ```console $ ./artifacts/before --write --smp 1 2<&1 \| grep insn 189930.77 tps ( 57.2 allocs/op, 13.2 tasks/op, 50994 insns/op, 0 errors) 189954.18 tps ( 57.2 allocs/op, 13.2 tasks/op, 51086 insns/op, 0 errors) 188623.86 tps ( 57.2 allocs/op, 13.2 tasks/op, 51083 insns/op, 0 errors) 190115.01 tps ( 57.2 allocs/op, 13.2 tasks/op, 51092 insns/op, 0 errors) 190173.71 tps ( 57.2 allocs/op, 13.2 tasks/op, 51083 insns/op, 0 errors) median 189954.18 tps ( 57.2 allocs/op, 13.2 tasks/op, 51086 insns/op, 0 errors) ``` vs ```console $ ./artifacts/after --write --smp 1 2<&1 \| grep insn 190358.38 tps ( 56.2 allocs/op, 12.2 tasks/op, 50754 insns/op, 0 errors) 185222.78 tps ( 56.2 allocs/op, 12.2 tasks/op, 50789 insns/op, 0 errors) 184508.09 tps ( 56.2 allocs/op, 12.2 tasks/op, 50842 insns/op, 0 errors) 142099.47 tps ( 56.2 allocs/op, 12.2 tasks/op, 50825 insns/op, 0 errors) 190447.22 tps ( 56.2 allocs/op, 12.2 tasks/op, 50811 insns/op, 0 errors) ``` One allocation and ~300 cycles saved. update_write_metrics() is still called from other call sites, so it is not removed. Closes #12108	2022-12-06 07:04:17 +02:00
Botond Dénes	6daa1e973f	Merge 'alternator: fix hangs related to TTL scanning' from Nadav Har'El The first patch in this small series fixes a hang during shutdown when the expired-item scanning thread can hang in a retry loop instead of quitting. These hangs were seen in some test runs (issue #12145). The second patch is a failsafe against additional bugs like those solved by the first patch: If any bug causes the same page fetch to repeatedly time out, let's stop the attempts after 10 retries instead of retrying forever. When we stop the retries, a warning will be printed to the log, Scylla will wait until the next scan period and start a new scan from scratch - from a random position in the database, instead of hanging potentially-forever waiting for the same page. Closes #12152 * github.com:scylladb/scylladb: alternator ttl: in scanning thread, don't retry the same page too many times alternator: fix hang during shutdown of expiration-scanning thread	2022-12-06 06:44:22 +02:00
Botond Dénes	c5da96e6f7	Merge 'cql3: batch_statement: coroutinize get_mutations()' from Avi Kivity As it has a do_with(), coroutinizing it is an automatic win. Closes #12195 * github.com:scylladb/scylladb: cql3: batch_statement: reindent get_mutations() cql3: batch_statement: coroutinize get_mutations()	2022-12-06 06:41:44 +02:00
Avi Kivity	d2b1d2f695	compaction_manager: reindent postponed_compactions_reevaluation()	2022-12-05 22:02:27 +02:00
Avi Kivity	1669025736	compaction_manager: coroutinize postponed_compactions_reevaluation() So much nicer.	2022-12-05 22:01:41 +02:00
Avi Kivity	d2c44cba77	compaction_manager: make postponed_compactions_reevaluation() return a future postponed_compactions_reevaluation() runs until compaction_manager is stopped, checking if it needs to launch new compactions. Make it return a future instead of stashing its completion somewhere. This makes it easier to convert it to a coroutine.	2022-12-05 21:58:48 +02:00
Avi Kivity	fe4d7fbdf2	Update abseil submodule * abseil 7f3c0d78...4e5ff155 (125): > Add a compilation test for recursive hash map types > Add AbslStringify support for enum types in Substitute. > Use a c++14-style constexpr initialization if c++14 constexpr is available. > Move the vtable into a function to delay instantiation until the function is called. When the variable is a global the compiler is allowed to instantiate it more aggresively and it might happen before the types involved are complete. When it is inside a function the compiler can't instantiate it until after the functions are called. > Cosmetic reformatting in a test. > Reorder base64 unescape methods to be below the escaping methods. > Fixes many compilation issues that come from having no external CI coverage of the accelerated CRC implementation and some differences bewteen the internal and external implementation. > Remove static initializer from mutex.h. > Import of CCTZ from GitHub. > Remove unused iostream include from crc32c.h > Fix MSVC builds that reject C-style arrays of size 0 > Remove deprecated use of absl::ToCrc32c() > CRC: Make crc32c_t as a class for explicit control of operators > Convert the full parser into constexpr now that Abseil requires C++14, and use this parser for the static checker. This fixes some outstanding bugs where the static checker differed from the dynamic one. Also, fix `%v` to be accepted with POSIX syntax. > Write (more) directly into the structured buffer from StringifySink, including for (size_t, char) overload. > Avoid using the non-portable type __m128i_u. > Reduce flat_hash_{set,map} generated code size. > Use ABSL_HAVE_BUILTIN to fix -Wundef __has_builtin warning > Add a TODO for the deprecation of absl::aligned_storage_t > TSAN: Remove report_atomic_races=0 from CI now that it has been fixed > absl: fix Mutex TSan annotations > CMake: Remove trailing commas in `AbseilDll.cmake` > Fix AMD cpu detection. 
> CRC: Get CPU detection and hardware acceleration working on MSVC x86(_64) > Removing trailing period that can confuse a url in str_format.h. > Refactor btree iterator generation code into a base class rather than using ifdefs inside btree_iterator. > container.h: fix incorrect comments about the location of <numeric> algorithms. > Zero encoded_remaining when a string field doesn't fit, so that we don't leave partial data in the buffer (all decoders should ignore it anyway) and to be sure that we don't try to put any subsequent operands in either (there shouldn't be enough space). > Improve error messages when comparing btree iterators when generations are enabled. > Document the WebSafe* and WithPadding variants more concisely, as deltas from Base64Encode. > Drop outdated comment about LogEntry copyability. > Release structured logging. > Minor formatting changes in preparation for structured logging... > Update absl::make_unique to reflect the C++14 minimum > Update Condition to allocate 24 bytes for MSVC platform pointers to methods. > Add missing include > Refactor "RAW: " prefix formatting into FormatLogPrefix. > Minor formatting changes due to internal refactoring > Fix typos > Add a new API for `extract_and_get_next()` in b-tree that returns both the extracted node and an iterator to the next element in the container. > Use AnyInvocable in internal thread_pool > Remove absl/time/internal/zoneinfo.inc. It was used to guarantee availability of a few timezones for "time_test" and "time_benchmark", but (file-based) zoneinfo is now secured via existing Bazel data/env attributes, or new CMake environment settings. > Updated documentation on use of %v Also updated documentation around FormatSink and PutPaddedString > Use the correct Bazel copts in crc targets > Run the //absl/time timezone tests with a data dependency on, and a matching ${TZDIR} setting for, //absl/time/internal/cctz:zoneinfo. > Stop unnecessary clearing of fields in ~raw_hash_set. 
> Fix throw_delegate_test when using libc++ with shared libraries > CRC: Ensure SupportsArmCRC32PMULL() is defined > Improve error messages when comparing btree iterators. > Refactor the throw_delegate test into separate test cases > Replace std::atomic_flag with std::atomic<bool> to avoid the C++20 deprecation of ATOMIC_FLAG_INIT. > Add support for enum types with AbslStringify > Release the CRC library > Improve error messages when comparing swisstable iterators. > Auto increase inlined capacity whenever it does not affect class' size. > drop an unused dep > Factor out the internal helper AppendTruncated, which is used and redefined in a couple places, plus several more that have yet to be released. > Fix some invalid iterator bugs in btree_test.cc for multi{set,map} emplace{_hint} tests. > Force a conservative allocation for pointers to methods in Condition objects. > Fix a few lint findings in flags' usage.cc > Narrow some _MSC_VER checks to not catch clang-cl. > Small cleanups in logging test helpers > Import of CCTZ from GitHub. > Merge pull request abseil/abseil-cpp#1287 from GOGOYAO:patch-1 > Merge pull request abseil/abseil-cpp#1307 from KindDragon:patch-1 > Stop disabling some test warnings that have been fixed > Support logging of user-defined types that implement `AbslStringify()` > Eliminate span_internal::Min in favor of std::min, since Min conflicts with a macro in a third-party library. > Fix -Wimplicit-int-conversion. > Improve error messages when dereferencing invalid swisstable iterators. > Cord: Avoid leaking a node if SetExpectedChecksum() is called on an empty cord twice in a row. > Add a warning about extract invalidating iterators (not just the iterator of the element being extracted). > CMake: installed artifacts reflect the compiled ABI > Import of CCTZ from GitHub. > Import of CCTZ from GitHub. > Support empty Cords with an expected checksum > Move internal details from one source file to another more appropriate source file. 
> Removes `PutPaddedString()` function > Return uint8_t from CappedDamerauLevenshteinDistance. > Remove the unknown CMAKE_SYSTEM_PROCESSOR warning when configuring ABSL_RANDOM_RANDEN_COPTS > Enforce Visual Studio 2017 (MSVC++ 15.0) minumum > `absl::InlinedVector::swap` supports non-assignable types. > Improve b-tree error messages when dereferencing invalid iterators. > Mutex: Fix stall on single-core systems > Document Base64Unescape() padding > Fix sign conversion warnings in memory_test.cc. > Fix a sign conversion warning. > Fix a truncation warning on Windows 64-bit. > Use btree iterator subtraction instead of std::distance in erase_range() and count(). > Eliminate use of internal interfaces and make the test portable and expose it to OSS. > Fix various warnings for _WIN32. > Disables StderrKnobsDefault due to order dependency > Implement btree_iterator::operator-, which is faster than std::distance for btree iterators. > Merge pull request abseil/abseil-cpp#1298 from rpjohnst:mingw-cmake-build > Implement function to calculate Damerau-Levenshtein distance between two strings. > Change per_thread_sem_test from size medium to size large. > Support stringification of user-defined types in AbslStringify in absl::Substitute. > Fix "unsafe narrowing" warnings in absl, 12/12. > Revert change to internal 'Rep', this causes issues for gdb > Reorganize InlineData into an inner Rep structure. > Remove internal `VLOG_xxx` macros > Import of CCTZ from GitHub. > `absl::InlinedVector` supports move assignment with non-assignable types. > Change Cord internal layout, which reduces store-load penalties on ARM > Detects accidental multiple invocations of AnyInvocable<R(...)&&>::operator()&& by producing an error in debug mode, and clarifies that the behavior is undefined in the general case. > Fix a bug in StrFormat. This issue would have been caught by any compile-time checking but can happen for incorrect formats parsed via ParsedFormat::New. 
Specifically, if a user were to add length modifiers with 'v', for example the incorrect format string "%hv", the ParsedFormat would incorrectly be allowed. > Adds documentation for stringification extension > CMake: Remove check_target calls which can be problematic in case of dependency cycle > Changes mutex unlock profiling > Add static_cast<void> to the sources for trivial relocations to avoid spurious -Wdynamic-class-memaccess errors in the presence of other compilation errors. > Configure ABSL_CACHE_ALIGNED for clang-like and MSVC toolchains. > Fix "unsafe narrowing" warnings in absl, 11/n. > Eliminate use of internal interfaces > Merge pull request abseil/abseil-cpp#1289 from keith:ks/fix-more-clang-deprecated-builtins > Merge pull request abseil/abseil-cpp#1285 from jun-sheaf:patch-1 > Delete LogEntry's copy ctor and assignment operator. > Make sinks provided to `AbslStringify()` usable with `absl::Format()`. > Cast unused variable to void > No changes in OSS. > No changes in OSS > Replace the kPower10ExponentTable array with a formula. > CMake: Mark absl::cord_test_helpers and absl::spy_hash_state PUBLIC > Use trivial relocation for transfers in swisstable and b-tree. > Merge pull request abseil/abseil-cpp#1284 from t0ny-peng:chore/remove-unused-class-in-variant > Removes the legacy spellings of the thread annotation macros/functions by default. Closes #12201	2022-12-05 21:07:16 +02:00
Eliran Sinvani	5a5514d052	cql server: Only parallelize relevant cql requests The cql server uses an execution stage to process and execute queries, however, processing stage is best utilized when having a recurrent flow that needs to be called repeatedly since it better utilizes the instruction cache. Up until now, every request was sent through the processing stage, but most requests are not meant to be executed repeatedly with high volume. This change processes and executes the data queries asynchronously, through an execution stage, and all of the rest are processed one by one, only continuing once the request has been done end to end. Tests: Unit tests in dev and debug. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com> Closes #12202	2022-12-05 21:06:58 +02:00
Takuya ASADA	b7851ab1ec	docker: fix locale on SSH shell `4ecc08c` broke locale settings on SSH shell, since we dropped "update-locale". To fix this without installing locales package, we need to manually specify LANG=C.UTF-8 in /etc/default/locale. see https://github.com/scylladb/scylla-cluster-tests/pull/5519 Closes #12197	2022-12-05 20:02:18 +02:00
Avi Kivity	6f2d060d12	Merge 'Make sstable_directory call sstable_manager for sstables' components' from Pavel Emelyanov This PR hits two goals for the "object storage" effort 1. Sstables loader "knows" that sstables components are stored in a Linux directory and uses utils/lister to access it. This is not going to work with sstables over object storage, the loader should be abstracted from the underlying storage. 2. Currently class keyspace and class column_family carry "datadir" and "all_datadirs" on board which are paths on the local filesystem where sstable files are stored (those usually start with /var/lib/scylla/data). The paths include subdirs like "snapshots", "staging", etc. This is not going to look nice for object storage, the /var/lib/ prefix is excessive and meaningless in this case. Instead, ks and cf should know their "location" and some other component should know the directory in which the files are stored. That said, this PR prepares distributed_loader and sstable_directory to stop using Linux paths explicitly by making both call sstables_manager to list and open sstables objects. After that it will be possible to teach the manager to list sstables from object storage. Also this opens the way to removing paths from keyspace and column_family classes and replacing those with relative "location"s. Closes #12128 * github.com:scylladb/scylladb: sstable_directory: Get components lister from manager sstable_directory: Extract directory lister sstable_directory: Remove sstable creation callback sstable_directory: Call manager to make sstables sstable_directory: Keep error handler generator sstable_directory: Keep schema_ptr sstable_directory: Use directory semaphore from manager sstable_directory: Keep reference on manager tests: Use sstables creation helper in some cases sstables_manager: Keep directory semaphore reference sstables, code: Wrap directory semaphore with concurrency	2022-12-05 18:54:17 +02:00
Gleb Natapov	022a825b33	raft: introduce not_a_member error and return it when non member tries to do add/modify_config Currently if a node that is outside of the config tries to add an entry or modify config transient error is returned and this causes the node to retry. But the error is not transient. If a node tries to do one of the operations above it means it was part of the cluster at some point, but since a node with the same id should not be added back to a cluster if it is not in the cluster now it will never be. Return a new error not_a_member to a caller instead. Message-Id: <Y42mTOx8bNNrHqpd@scylladb.com>	2022-12-05 17:11:04 +01:00
Benny Halevy	c61083852c	storage_service: handle_state_normal: calculate candidates_for_removal when replacing tokens We currently try to detect a replaced node so as to insert it into endpoints_to_remove when it has no owned tokens left. However, for each token we first generate a multimap using get_endpoint_to_token_map_for_reading(). There are 2 problems with that: 1. unless the replaced node owns a single token, this map will not be empty after erasing one token out of it, since the token metadata has not changed yet (this is done later with update_normal_tokens(owned_tokens, endpoint)). 2. generating this map for each token is inefficient, making this algorithm's complexity quadratic in the number of tokens... This change copies the current token_to_endpoint map to a temporary map and erases replaced tokens from it, while maintaining a set of candidates_for_removal. After traversing all replaced tokens, we check again the `token_to_endpoint_map` erasing from `candidates_for_removal` any endpoint that still owns tokens. The leftover candidates are endpoints that own no tokens and so they are added to `hosts_to_remove`. Fixes #12082 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12141	2022-12-05 16:17:18 +01:00
Botond Dénes	3d620378d4	Merge 'view: coroutinize maybe_mark_view_as_built' from Avi Kivity Simplifying it a little. Closes #12171 * github.com:scylladb/scylladb: view: reindent maybe_mark_view_as_built view: coroutinize maybe_mark_view_as_built	2022-12-05 13:43:34 +02:00
Kamil Braun	3f8aaeeab9	test/topology: enable replace tests Also add some TODOs for enhancing existing tests.	2022-12-05 11:50:07 +01:00
Kamil Braun	ee19411783	service/raft: report an error when Raft ID can't be found in `raft_group0::remove_from_group0` Also simplify the code and improve logging in general. The previous code did this: search for the ID in the address map. If it couldn't be found, perform a read barrier and search again. If it again couldn't be found, return. This algorithm depended on the fact that IP addresses were stored in group 0 configuration. The read barrier was used to obtain the most recent configuration, and if the IP was not a part of address map after the read barrier, that meant it's simply not a member of group 0. This logic no longer applies so we can simplify the code. Furthermore, when I was fixing the replace operation with Raft enabled, at some point I had a "working" solution with all tests passing. But I was suspicious and checked if the replaced node got removed from group 0. It wasn't. So the replace finished "successfully", but we had an additional (voting!) member of group 0 which didn't correspond to a token ring member. The last version of my fixes ensures that the node gets removed by the replacing node. But the system is fragile and nothing prevents us from breaking this again. At least log an error for now. Regression tests will be added later.	2022-12-05 11:50:07 +01:00
Kamil Braun	4429885543	service: handle replace correctly with Raft enabled We must place the Raft ID obtained during the shadow round in the address map. It won't be placed by the regular gossiping route if we're replacing using the same IP, because we override the application state of the replaced node. Even if we replace a node with a different IP, it is not guaranteed that background gossiping manages to update the address map before we need it, especially in tests where we set ring_delay to 0 and disable wait_for_gossip_to_settle. The shadow round, on the other hand, performs a synchronous request (and if it fails during bootstrap, bootstrap will fail - because we also won't be able to obtain the tokens and Host ID of the replaced node). Fetch the Raft ID of the replaced node in `prepare_replacement_info`, which runs the shadow round. Return it in `replacement_info`. Then `join_token_ring` passes it to `setup_group0`, which stores it in the address map. It does that after `join_group0` so the entry is non-expiring (the replaced node is a member of group 0). Later in the replace procedure, we call `remove_from_group0` for the replaced node. `remove_from_group0` will be able to reverse-translate the IP of the replaced node to its Raft ID using the address map.	2022-12-05 11:50:07 +01:00
Kamil Braun	45bb5bfb52	gms/gossiper: fetch RAFT_SERVER_ID during shadow round During the replace operation we need the Raft ID of the replaced node. The shadow round is used for fetching all necessary information before the replace operation starts.	2022-12-05 11:50:07 +01:00
Kamil Braun	7222c2f9a1	service: storage_service: sleep 2*ring_delay instead of BROADCAST_INTERVAL before replace Most of the sleeps related to gossiping are based on `ring_delay`, which is configurable and can be set to lower value e.g. during tests. But for some reason there was one case where we slept for a hardcoded value, `service::load_broadcaster::BROADCAST_INTERVAL` - 60 seconds. Use `2 * get_ring_delay()` instead. With the default value of `ring_delay` (30 seconds) this will give the same behavior.	2022-12-05 11:50:07 +01:00
Pavel Emelyanov	b5ede873f2	sstable_directory: Get components lister from manager For now this is almost a no-op because manager just calls sstables_directory code back to create the lister. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	3f9b8c855d	sstable_directory: Extract directory lister Currently the utils/lister.cc code is in use to list regular files in a directory. This patch wraps the lister into more abstract components lister class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	abd3602b10	sstable_directory: Remove sstable creation callback It's no longer used. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	3d559391df	sstable_directory: Call manager to make sstables Now the directory code has everything it needs to create sstable object and can stop using the external lambda. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	db657a8d1c	sstable_directory: Keep error handler generator Yet another continuation to previous patch -- IO error handlers generator is also needed to create sstables. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	4281f4af42	sstable_directory: Keep schema_ptr Continuation of one-before-previous patch. In order to create sstable without external lambda the directory code needs schema. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	8df1bcb907	sstable_directory: Use directory semaphore from manager After the previous patch, the sstable_directory code no longer requires a semaphore argument, because it can get one from the manager. This makes the directory API shorter and simpler. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	4da941e159	sstable_directory: Keep reference on manager The sstable_directory accesses /var/lib/scylla/data in two ways -- lists files in it and opens sstables. The latter is abstracted with the help of lambdas passed around, but the former (listing) is done by using directory listers from utils. Listing sstables components with a directory lister won't work for object storage, the directory code will need to call some abstraction layer instead. Opening sstables with the help of a lambda is a bit of overkill, having sstables manager at hand could make it much simpler. That said, this patch makes sstable_directory reference sstables_manager on start. This change will also simplify directory semaphore usage (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	784d78810a	tests: Use sstables creation helper in some cases Several test cases push sstables creation lambda into with_sstables_directory helper. There's a ready to use helper class that does the same. Next patch will make additional use of that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:19 +03:00
Pavel Emelyanov	5e13ce2619	sstables_manager: Keep directory semaphore reference Preparatory patch. The semaphore will be used by sstables_directory in next patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 12:03:18 +03:00
Pavel Emelyanov	be8512d7cc	sstables, code: Wrap directory semaphore with concurrency Currently this is a sharded<semaphore> started/stopped in main and referenced by database in order to be fed into sstables code. This semaphore always comes with the "concurrency" parameter that limits the parallel_for_each parallelism. This patch wraps both together into directory_semaphore class. This makes its usage simpler and will allow extending it in the future. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-05 11:59:30 +03:00
Asias He	c6087cf3a0	repair: Reduce repair reader eviction with diff shard count When repair master and followers have different shard count, the repair followers need to create multi-shard readers. Each multi-shard reader will create one local reader on each shard, N (smp::count) local readers in total. There is a hard limit on the number of readers who can work in parallel. When there are more readers than this limit, the readers will start to evict each other, causing buffers already read from disk to be dropped and readers to be recreated, which is not very efficient. To optimize and reduce reader eviction overhead, a global reader permit is introduced which considers the multi-shard reader bloats. With this patch, at any point in time, the number of readers created by repair will not exceed the reader limit. Test Results: 1) with stream sem 10, repair global sem 10, 5 ranges in parallel, n1=2 shards, n2=8 shards, memory wanted =1 1.1) [asias@hjpc2 mycluster]$ time nodetool -p 7200 repair ks2 (repair on n2) [2022-11-23 17:45:24,770] Starting repair command #1, repairing 1 ranges for keyspace ks2 (parallelism=SEQUENTIAL, full=true) [2022-11-23 17:45:53,869] Repair session 1 [2022-11-23 17:45:53,869] Repair session 1 finished real 0m30.212s user 0m1.680s sys 0m0.222s 1.2) [asias@hjpc2 mycluster]$ time nodetool repair ks2 (repair on n1) [2022-11-23 17:46:07,507] Starting repair command #1, repairing 1 ranges for keyspace ks2 (parallelism=SEQUENTIAL, full=true) [2022-11-23 17:46:30,608] Repair session 1 [2022-11-23 17:46:30,608] Repair session 1 finished real 0m24.241s user 0m1.731s sys 0m0.213s 2) with stream sem 10, repair global sem no_limit, 5 ranges in parallel, n1=2 shards, n2=8 shards, memory wanted =1 2.1) [asias@hjpc2 mycluster]$ time nodetool -p 7200 repair ks2 (repair on n2) [2022-11-23 17:49:49,301] Starting repair command #1, repairing 1 ranges for keyspace ks2 (parallelism=SEQUENTIAL, full=true) [2022-11-23 17:52:01,414] Repair session 1 [2022-11-23 
17:52:01,415] Repair session 1 finished real 2m13.227s user 0m1.752s sys 0m0.218s 2.2) [asias@hjpc2 mycluster]$ time nodetool repair ks2 (repair on n1) [2022-11-23 17:52:19,280] Starting repair command #1, repairing 1 ranges for keyspace ks2 (parallelism=SEQUENTIAL, full=true) [2022-11-23 17:52:42,387] Repair session 1 [2022-11-23 17:52:42,387] Repair session 1 finished real 0m24.196s user 0m1.689s sys 0m0.184s Comparing 1.1) and 2.1), it shows the eviction played a major role here. The patch gives 73s / 30s = 2.5X speed up in this setup. Comparing 1.1 and 1.2, it shows even if we limit the readers, starting on the lower shard is faster 30s / 24s = 1.25X (the total number of multishard readers is lower) Fixes #12157 Closes #12158	2022-12-05 10:47:36 +02:00
Botond Dénes	1e20095547	Update tools/java submodule * tools/java 1c06006447...ecab7cf7d6 (1): > Add VSCode files to gitignore	2022-12-05 09:54:51 +02:00
Botond Dénes	c4d72c8dd0	Merge 'cql3: select_statement: split and coroutinize process_results()' from Avi Kivity Split the simple (and common) case from the complex case, and coroutinize the latter. Hopefully this generates better code for the simple case, and it makes the complex case a little nicer. Closes #12194 * github.com:scylladb/scylladb: cql3: select_statement: reindent process_results_complex() cql3: select_statement: coroutinize process_results_complex() cql3: select_statement: split process_results() into fast path and complex path	2022-12-05 08:16:22 +02:00
Avi Kivity	a0a4711b74	snapshot: protect list operations against the lambda coroutine fiasco run_snapshot_list_operation() takes a continuation, so passing it a lambda coroutine without protection is dangerous. Protect the coroutine with coroutine::lambda so it doesn't lose its contents. Fixes #12192. Closes #12193	2022-12-05 08:14:39 +02:00
guy9	cb842b2729	Replacing the Docs top bar message from the LIVE event to the community forum announcement Closes #12189	2022-12-05 08:05:04 +02:00
Avi Kivity	6326be5796	cql3: batch_statement: reindent get_mutations()	2022-12-04 21:47:22 +02:00
Avi Kivity	2d74360de3	cql3: batch_statement: coroutinize get_mutations() It has a do_with(), so an automatic win.	2022-12-04 21:45:10 +02:00
Avi Kivity	0834bb0365	cql3: select_statement: reindent process_results_complex()	2022-12-04 21:36:17 +02:00
Avi Kivity	a63f98e3fc	cql3: select_statement: coroutinize process_results_complex() Not a huge gain, since it's just a do_with, but still a little better. Note the inner lambda is not a coroutine, so isn't susceptible to the lambda coroutine fiasco.	2022-12-04 21:34:51 +02:00
Avi Kivity	7f29efa0ad	cql3: select_statement: split process_results() into fast path and complex path This will allow us to coroutinize the complex path without adding an allocation to the fast path.	2022-12-04 21:30:45 +02:00
Avi Kivity	02b66bb31a	Merge 'Mark sstable::<directory accessing methods> private' from Pavel Emelyanov One of the prerequisites to make sstables reside on object-storage is not to let the rest of the code "know" the filesystem path they are located on (because sometimes they will not be on any filesystem path). This patch makes the methods that can reveal this path back private so that later they can be abstracted out. Closes #12182 * github.com:scylladb/scylladb: sstable: Mark some methods private test: Don't get sstable dir when known test: Use move_to_quarantine() helper test: Use sstable::filename() overload without dir name sstables: Reimplement batch directory sync after move table, tests: Make use of move_to_new_dir() default arg sstables: Remove fsync_directory() helper table: Simplify take_snapshot()'s collecting sstables names	2022-12-04 17:45:37 +02:00
Kamil Braun	b551cd254c	test: test_raft_upgrade: fix test_recover_stuck_raft_upgrade flakiness The test enables an error injection inside the Raft upgrade procedure on one of the nodes which will cause the node to throw an exception before entering `synchronize` state. Then it restarts other nodes with Raft enabled, waits until they enter `synchronize` state, puts them in RECOVERY mode, removes the error-injected node and creates a new Raft group 0. As soon as the other nodes enter `synchronize`, the test disabled the error injection (the rest of the test was outside the `async with inject_error(...)` block). There was a small chance that we disabled the error injection before the node reached it. In that case the node also entered `synchronize` and the cluster managed to finish the upgrade procedure. We encountered this during next promotion. Eliminate this possibility by extending the scope of the `async with inject_error(...)` block, so that the RECOVERY mode steps on the other nodes are performed within that block. Closes #12162	2022-12-02 21:26:44 +01:00
Avi Kivity	94f18b5580	test: sstable_conforms_to_mutation_source: use do_with_async() where needed The test clearly needs a thread (it converts a reader to a mutation without waiting), so give it one. Closes #12178	2022-12-02 20:48:37 +01:00
Pavel Emelyanov	084522d9eb	sstable: Mark some methods private There are several class sstable methods that reveal internal directory path to caller. It's not object-storage-friendly. Fortunately, all the callers of those methods had been patched not to work with full paths, so these can be marked private. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:15:02 +03:00
Pavel Emelyanov	fb63850f2c	test: Don't get sstable dir when known The sstable_move_test creates sstables in its own temp directories and then requests these dirs' paths back from sstables. Test can come with the paths it has at hand, no need to call sstables for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:13:58 +03:00
Pavel Emelyanov	4c742a658d	test: Use move_to_quarantine() helper Two places in tests move sstable to quarantine subdir by hand. There's the class sstable method that does the same, so use it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:13:19 +03:00
Pavel Emelyanov	d6244b7408	test: Use sstable::filename() overload without dir name The dir this place currently uses is the directory where the sstable was created, so dropping this argument would just render the same path. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:12:21 +03:00
Pavel Emelyanov	a702affd4d	sstables: Reimplement batch directory sync after move There's a table::move_sstables_from_staging() method that gets a bunch of sstables and moves them from staging subdir into table's root datadir. Not to flush the root dir for every sstable move, it asks the sstable::move_to_new_dir() not to flush, but collects staging dir names and flushes them and the root dir at the end altogether. In order to make it more friendly to object-storage and to remove one more caller of sstable::get_dir() the delayed_commit_changes struct is introduced. It collects _all_ the affected dir names in unordered_set, then allows flushing them. By default the move_to_new_dir() doesn't receive this object and flushes the directories instantly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:08:47 +03:00
Pavel Emelyanov	1b42d5fce3	table, tests: Make use of move_to_new_dir() default arg The method in question accepts boolean bit whether or not it should sync directories at the end. It's always true but in one case, so there's the default value for it. Make use of it. Anticipating the suggestion to replace bool with bool_class -- next patch will replace it with something else. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:07:16 +03:00
Pavel Emelyanov	339feb4205	sstables: Remove fsync_directory() helper The one effectively wraps existing seastar sync_directory() helper into two io_check-s. It's simpler just to call the latter directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:05:43 +03:00
Pavel Emelyanov	80f5d7393f	table: Simplify take_snapshot()'s collecting sstables names The method in question "snapshots" all sstables it can find, then writes their Datafile names into the manifest file. To get the list of file names it iterates over sstables list again and does silly conversion of full file path to file name with the help of the directory path length. This all can be made much simpler if just collecting component names directly at the time sstable is hardlinked. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-02 21:02:37 +03:00
Raphael S. Carvalho	d61b4f9dfb	compaction_manager: Delete compaction_state's move constructor compaction_state shouldn't be moved once emplaced. moving it could theoretically cause task's gate holder to have a dangling pointer to compaction_state's gate, but turns out gate's move ctor will actually fail under this assertion: assert(!_count && "gate reassigned with outstanding requests"); Cannot happen today, but let's make it more future proof. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #12167	2022-12-02 20:56:57 +03:00
Tomasz Grabiec	1a6bf2e9ca	Merge 'service/raft: specialized verb for failure detector pinger' from Kamil Braun We used GOSSIP_ECHO verb to perform failure detection. Now we use a special verb DIRECT_FD_PING introduced for this purpose. There are multiple reasons to do so. One minor reason: we want to use the same connection as other Raft verbs: if we can't deliver Raft append_entries or vote messages somewhere, that endpoint should be marked dead; if we can, the endpoint should be marked alive. So putting pings on the same connection as the other Raft verbs is important when dealing with weird situations where some connections are available but others are not. Observe that in `do_get_rpc_client_idx`, we put the new verb in the right place. Another minor reason: we remove the awkward gossiper `echo_pinger` abstraction which required storing and updating gossiper generation numbers. This also removes one dependency from Raft service code to gossiper. Major reason 1: the gossip echo handler has a weird mechanism where a replacing node returns errors during the replace operation to some of the nodes. In Raft however, we want to mark servers as alive when they are alive, including a server running on a node that's replacing another node. Major reason 2, related to the previous one: when server B is replacing server A with the same IP, the failure detector will try to ping both servers. Both servers are mapped to the same IP by the address map, so pings to both servers will reach server B. We want server B to respond to the pings destined for server B, but not to pings destined for server A, so the sender can mark B alive but keep A marked dead. To do this, we include the destination's Raft ID in our RPCs. The destination compares the received ID with its own. If it's different, it returns a `wrong_destination` response, and the failure detector knows that the ping did not reach the destination (it reached someone else). 
Yet another reason: removes "Not ready to respond gossip echo message" log spam during replace. Closes #12107 * github.com:scylladb/scylladb: service/raft: specialized verb for failure detector pinger db: system_keyspace: de-staticize `{get,set}_raft_server_id` service/raft: make this node's Raft ID available early in group registry	2022-12-02 13:54:02 +01:00
Pavel Emelyanov	71179ff5ab	distributed_loader: Use coroutine::lambda in sleeping coroutine According to seastar/doc/lambda-coroutine-fiasco.md lambda that co_awaits once loses its capture frame. In distributed_loader code there's at least one of that kind. fixes: #12175 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12170	2022-12-02 13:06:33 +02:00
Pavel Emelyanov	1d91914166	sstables: Drop set_generation() method The method became unused since `70e5252a` (table: no longer accept online loading of SSTable files in the main directory) and the whole concept of reshuffling sstables was dropped later by `7351db7c` (Reshape upload files and reshard+reshape at boot). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12165	2022-12-01 22:17:10 +02:00
Avi Kivity	2978052113	view: reindent maybe_mark_view_as_built Several indentation levels were harmed during the preparation of this patch.	2022-12-01 22:09:21 +02:00
Avi Kivity	ac2e2f8883	view: coroutinize maybe_mark_view_as_built Somewhat simplifies complicated logic.	2022-12-01 22:04:51 +02:00
Kamil Braun	cbdcc944b5	service/raft: specialized verb for failure detector pinger We used GOSSIP_ECHO verb to perform failure detection. Now we use a special verb DIRECT_FD_PING introduced for this purpose. There are multiple reasons to do so. One minor reason: we want to use the same connection as other Raft verbs: if we can't deliver Raft append_entries or vote messages somewhere, that endpoint should be marked dead; if we can, the endpoint should be marked alive. So putting pings on the same connection as the other Raft verbs is important when dealing with weird situations where some connections are available but others are not. Observe that in `do_get_rpc_client_idx`, we put the new verb in the right place. Another minor reason: we remove the awkward gossiper `echo_pinger` abstraction which required storing and updating gossiper generation numbers. This also removes one dependency from Raft service code to gossiper. Major reason 1: the gossip echo handler has a weird mechanism where a replacing node returns errors during the replace operation to some of the nodes. In Raft however, we want to mark servers as alive when they are alive, including a server running on a node that's replacing another node. Major reason 2, related to the previous one: when server B is replacing server A with the same IP, the failure detector will try to ping both servers. Both servers are mapped to the same IP by the address map, so pings to both servers will reach server B. We want server B to respond to the pings destined for server B, but not to pings destined for server A, so the sender can mark B alive but keep A marked dead. To do this, we include the destination's Raft ID in our RPCs. The destination compares the received ID with its own. If it's different, it returns a `wrong_destination` response, and the failure detector knows that the ping did not reach the destination (it reached someone else). 
Yet another reason: removes "Not ready to respond gossip echo message" log spam during replace.	2022-12-01 20:54:18 +01:00
Kamil Braun	02c64becdc	db: system_keyspace: de-staticize `{get,set}_raft_server_id` Part of the anti-globals war.	2022-12-01 20:54:18 +01:00
Kamil Braun	99fe580068	service/raft: make this node's Raft ID available early in group registry Raft ID was loaded or created late in the boot procedure, in `storage_service::join_token_ring`. Create it earlier, as soon as it's possible (when `system_keyspace` is started), pass it to `raft_group_registry::start` and store it inside `raft_group_registry`. We will use this Raft ID stored in group registry in following patches. Also this reduces the number of disk accesses for this node's Raft ID. It's now loaded from disk once, stored in `raft_group_registry`, then obtained from there when needed. This moves `raft_group_registry::start` a bit later in the startup procedure - after `system_keyspace` is started - but it doesn't make a difference.	2022-12-01 20:54:18 +01:00
Nadav Har'El	6fcb5302a6	alternator-test: xfail a flaky test exposing a known bug In a recent commit `757d2a4`, we removed the "xfail" mark from the test test_manual_requests.py::test_too_large_request_content_length because it started to pass on more modern versions of Python, with a urllib3 bug fixed. Unfortunately, the celebration was premature: It turns out that although the test now usually passes, it sometimes fails. This is caused by a Seastar bug scylladb/seastar#1325, which I opened #12166 to track in this project. So unfortunately we need to add the "xfail" mark back to this test. Note that although the test will now be marked "xfail", it will actually pass most of the time, so will appear as "xpass" to people running it. I put a note in the xfail reason string as a reminder why this is happening. Fixes #12143 Refs #12166 Refs scylladb/seastar#1325 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12169	2022-12-01 20:00:46 +02:00
Kamil Braun	3cd035d1b9	test/pylib: scylla_cluster: remove `ScyllaCluster.decommissioned` field The field was not used for anything. We can keep decommissioned server in `stopped` field. In fact it caused us a problem: since recently, we're using `ScyllaCluster.uninstall` to clean-up servers after test suite finishes (previously we were using `ScyllaServer.uninstall` directly). But `ScyllaCluster.uninstall` didn't look into the `decommissioned` field, so if a server got decommissioned, we wouldn't uninstall it, and it left us some unnecessary artifacts even for successful tests. This is now fixed. Closes #12163	2022-12-01 19:07:26 +02:00
Avi Kivity	a4b77a5691	Merge 'Cleanup sstables::test_env's manager usage' from Pavel Emelyanov Mainly this PR removes global db::config and feature service that are used by sstables::test_env as dependencies for embedded sstables_manager. Other than that -- drop unused methods, remove nested test_env-s and relax few cases that use two temp dirs at a time for no gain. Closes #12155 * github.com:scylladb/scylladb: test, utils: Use only one tempdir sstable_compaction_test: Dont create nested envs mutation_reader_test: Remove unused create_sstable() helper tests, lib: Move globals onto sstables::test_env tests: Use sstables::test_env.db_config() to access config features: Mark feature_config_from_db_config const sstable_3_x_test: Use env method to create sst sstable_3_x_test: Indentation fix after previous patch sstable_3_x_test: Use sstable::test_env test: Add config to sstable::test_env creation config: Add constexpr value for default murmur ignore bits	2022-12-01 17:47:25 +02:00
Pavel Emelyanov	4c6bfc078d	code: Use http::re(quest\|ply) instead of httpd:: ones Recent seastar update deprecated those from httpd namespace. fixes: #12142 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12161	2022-12-01 17:33:35 +02:00
Pavel Emelyanov	adc6ee7ea8	test, utils: Use only one tempdir There's a do_with_cloned_tmp_directory that makes two temp dirs to toss sstables between them. Make it go with just one, all the more so it would resemble existing manipulations around staging/ subdir Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:57 +03:00
Pavel Emelyanov	15a7b9cafa	sstable_compaction_test: Dont create nested envs The "compact" test case runs in sstables::test_env and additionally wraps it with another instance provided by do_with_tmp_directory helper. It's simpler to create the temp dir by hand and use outer env. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:56 +03:00
Pavel Emelyanov	69fe5fd054	mutation_reader_test: Remove unused create_sstable() helper Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:54 +03:00
Pavel Emelyanov	400bc2c11d	tests, lib: Move globals onto sstables::test_env There's a bunch of objects that are used by test_env as sstables_manager dependencies. Now when no other code needs those globals they better sit on the test_env next to the manager Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:36 +03:00
Pavel Emelyanov	6a294b9ad6	tests: Use sstables::test_env.db_config() to access config Currently some places use global test config, but it's going to be removed soon, so switch to using config from environment Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:30 +03:00
Pavel Emelyanov	b4e31ad359	features: Mark feature_config_from_db_config const It's in fact such. Other than that, next patch will call it with const config at hand and fail to compile without this fix Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:27 +03:00
Pavel Emelyanov	8178845ef3	sstable_3_x_test: Use env method to create sst Just to make it shorter and conform to other sst env tests Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:19 +03:00
Pavel Emelyanov	8d5d05012e	sstable_3_x_test: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:39:09 +03:00
Pavel Emelyanov	6628d801f2	sstable_3_x_test: Use sstable::test_env There are several cases there that construct sstables_manager by hand with the help of a bunch of global dependencies. It's nicer to use existing wrapper. (indentation left broken until next patch) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:38:46 +03:00
Pavel Emelyanov	1d8c76164f	test: Add config to sstable::test_env creation To make callers (tests) construct it with different options. In particular, one test will soon want to construct it with custom large data handler of its own. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:38:18 +03:00
Pavel Emelyanov	6d0c8fb6e2	config: Add constexpr value for default murmur ignore bits ... and use in some places of sstable_compaction_test. This will allow getting rid of global test_db_config thing later Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-12-01 13:38:15 +03:00
Botond Dénes	dbd00fd3e9	Merge 'Task manager shard repair tasks' from Aleksandra Martyniuk The PR introduces shard_repair_task_impl which represents a repair task that spans over a single shard repair. repair_info is replaced with shard_repair_task_impl, since both serve similar purpose. Closes #12066 * github.com:scylladb/scylladb: repair: reindent repair: replace repair_info with shard_repair_task_impl repair: move repair_info methods to shard_repair_task_impl repair: rename methods of repair_module repair: change type of repair_module::_repairs repair: keep a reference to shard_repair_task_impl in row_level_repair repair: move repair_range method to shard_repair_task_impl repair: make do_repair_ranges a method of shard_repair_task_impl repair: copy repair_info methods to shard_repair_task_impl repair: corutinize shard task creation repair: define run for shard_repair_task_impl repair: add shard_repair_task_impl	2022-12-01 10:04:31 +02:00
Nadav Har'El	5eda8ce4fd	alternator ttl: in scanning thread, don't retry the same page too many times Since fixing issue #11737, when the expiration scanner times out reading a page of data, it retries asking for the same page instead of giving up on the scan and starting anew later. This retry was infinite - which can cause problems if we have a bug in the code or several nodes down, which can lead to getting hung in the same place in the scan for a very long (potentially infinite) time without making any progress. An example of such a bug was issue #12145, where we forgot to handle shutdowns, so on shutdown of the cluster we just hung forever repeating the same request that will never succeed. It's better in this case to just give up on the current scan, and start it anew (from a random position) later. Refs #12145 (that issue was already fixed, by a different patch which stops the iteration when shutting down - not waiting for an infinite number of iterations and not even one more). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-11-30 18:42:37 +02:00
Nadav Har'El	d08eef5a30	alternator: fix hang during shutdown of expiration-scanning thread The expiration-scanning thread is a long-running thread which can scan data for hours, but checks for its abort-source before fetching each page to allow for timely shutdown. Recently, we added the ability to retry the page fetching in case of timeout, but forgot to check the abort source in this new retry loop - which led to an infinitely-long shutdown in some tests while the retry loop retries forever. In this patch we fix this bug by using sleep_abortable() instead of sleep(). sleep_abortable() will throw an exception if the abort source was triggered before or during the sleep - and this exception will stop the scan immediately. Fixes #12145 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2022-11-30 18:38:17 +02:00
Jan Ciolek	05ea0c1d60	dev/docs: add additional git pull to backport docs Botond noted that an additional git pull might be needed here: https://github.com/scylladb/scylladb/pull/12138#discussion_r1035857007 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-30 16:14:02 +01:00
Jan Ciolek	e74873408b	docs/dev: add a note about cherry-picking individual commits Some people prefer to cherry-pick individual commits so that they have fewer conflicts to resolve at once. Add a comment about this possibility. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-30 16:06:39 +01:00
Kamil Braun	0f9d0dd86e	Merge 'raft: support IP address change' from Konstantin Osipov This is the core of dynamic IP address support in Raft, moving out the IP address sourcing from Raft Group 0 configuration to gossip. At start of Raft, the raft id <> IP address translation map is tuned into the gossiper notifications and learns IP addresses of Raft hosts from them. The series intentionally doesn't contain the part which speeds up the initial cluster assembly by persisting the translation cache and using more sources besides gossip (discovery, RPC) to show correctness of the approach. Closes #12035 * github.com:scylladb/scylladb: raft: (rpc) do not throw in case of a missing IP address in RPC raft: (address map) actively maintain ip <-> raft server id map	2022-11-30 15:40:18 +01:00
Aleksandra Martyniuk	78a6193c01	repair: reindent	2022-11-30 13:53:52 +01:00
Aleksandra Martyniuk	b4ad914fe1	repair: replace repair_info with shard_repair_task_impl repair_info is deleted and all its attributes are moved to shard_repair_task_impl.	2022-11-30 13:53:52 +01:00
Aleksandra Martyniuk	f6ec2cec92	repair: move repair_info methods to shard_repair_task_impl	2022-11-30 13:53:18 +01:00
Jan Ciolek	32663e6adb	docs/dev: use 'is merged into' instead of 'becomes' The backport instructions said that after passing the tests next `becomes` master, but it's more exact to say that next `is merged into` master. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-30 13:25:10 +01:00
Jan Ciolek	28cf8a18de	docs/dev: mention that new backport instructions are for the contributor Previously the section was called: "How to backport a patch", which could be interpreted as instructions for the maintainer. The new title clearly states that these instructions are for the contributor in case the maintainer couldn't backport the patch by themselves. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-30 13:23:15 +01:00
Takuya ASADA	4ecc08c4fe	docker: switch default locale to C.UTF-8 Since we switched scylla-machine-image locale to C.UTF-8 because ubuntu-minimal image does not have en_US.UTF-8 by default, we should do the same on our docker image to reduce image size. Verified #9570 does not occur on new image, since it is still UTF-8 locale. Closes #12122	2022-11-30 13:58:43 +02:00
Anna Stuchlik	15cc3ecf64	doc: update the releases in the KB about updating the mode after upgrade	2022-11-30 12:53:13 +01:00
Anna Stuchlik	242a3916f0	doc: fix the broken link in the 5.1 upgrade guide	2022-11-30 12:49:20 +01:00
Alejo Sanchez	f7aa08ef25	test.py: don't stop cluster's site if not started The site member is created in ScyllaCluster.start(), for startup failure this might not be initialized, so check it's present before stop()ing it. And delete it as it's not running and proper initialization should call ScyllaCluster.start(). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #11939	2022-11-30 13:47:18 +02:00
Anna Stuchlik	1575d96856	doc: add the link to the 5.1-related KB article to the 5.1 upgrade guide	2022-11-30 12:40:49 +01:00
Nadav Har'El	ce347f4b67	test/cql-pytest: add test for meaning of fetch_size with filtering A question was raised on what fetch_size (the requested page size in a paged scan) counts when there is a filter: does it count the rows before filtering (as scanned from disk) or after filter (as will be returned to the client)? This patch adds a test which demonstrates that Cassandra and Scylla behave differently in this respect: Cassandra counts post-filtering - so fetch_size results are actually returned, while Scylla currently counts pre-filtering. It is arguable which behavior is the "correct" one - we discuss this in issue #12102. But we have already had several users (such as #11340) who complained about Scylla's behavior and expected Cassandra's behavior, so if we decide to keep Scylla's behavior we should at least explain and justify this decision in our documentation. Until then, let's have this test which reminds us of this incompatibility. This test currently passes on Cassandra and fails (xfail) on Scylla. Refs #11340 Refs #12102 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12103	2022-11-30 12:27:06 +02:00
Nadav Har'El	8bd8ef3d03	test/cql-pytest: add regression test for old issue This patch adds a regression test for the old issue #65 which is about a multi-column (tuple) clustering-column relation in a SELECT when one of these columns has reversed order. It turns out that we didn't notice, but this issue was already solved - but we didn't have a regression test for it. So this patch adds just a regression test. The test confirms that Scylla now behaves as was desired when that issue was opened. The test also passes on Cassandra, confirming that Scylla and Cassandra behave the same for such requests. Fixes #65 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12130	2022-11-30 12:22:21 +02:00
Michał Jadwiszczak	8e64e18b80	forward_service: add debug logs Adds a few debug logs to see what is happening in https://github.com/scylladb/scylladb/issues/11684 Wrapped `forward_result::printer` into `seastar::value_of` to lazy evaluate the printer Closes #12113	2022-11-30 12:15:26 +02:00
Yaniv Kaul	b66ca3407a	doc: Typo - then -> than Fix a typo. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes #12140	2022-11-30 12:03:56 +02:00
Botond Dénes	50aea9884b	Merge 'Improve the Raft upgrade procedure' from Kamil Braun Better logging, less code, a minor fix. Closes #12135 * github.com:scylladb/scylladb: service/raft: raft_group0: less repetitive logging calls service/raft: raft_group0: fix sleep_with_exponential_backoff	2022-11-30 11:24:20 +02:00
Avi Kivity	6a5d9ff261	treewide: use non-experimental std::source_location Now that we use libstdc++ 12, we can use the standardized source_location. Closes #12137	2022-11-30 11:06:43 +02:00
Jan Ciolek	56a802c979	docs/dev: Add backport instructions for contributors Add instructions on how to backport a feature to an older version of Scylla. It contains a detailed step-by-step instruction so that people unfamiliar with intricacies of Scylla's repository organization can easily get the hang of it. This is the guide I wish I had when I had to do my first backport. I put it in backport.md because that looks like the file responsible for this sort of information. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-29 22:10:27 +01:00
Konstantin Osipov	fbe7886cc0	raft: (rpc) do not throw in case of a missing IP address in RPC Remove raft_address_map::get_inet_address() While at it, coroutinize some rpc methods. To propagate up the event of missing IP address, use coroutine::exception( with a proper type (raft::transport_error) and a proper error message. This is a building block for removing raft_address_map::get_inet_address() which is too generic, and shifting the responsibility of handling missing addresses to the address map clients. E.g. one-way RPC shouldn't throw if an address is missing, but just drop the message. PS An attempt to use a single template function proved to be too complex: - some functions require a gate, some don't - some return void, some future<> and some future<raft::data_type>	2022-11-29 19:55:48 +03:00
Konstantin Osipov	73e5298273	raft: (address map) actively maintain ip <-> raft server id map 1) make address map API flexible Before this patch: - having a mapping without an actual IP address was an internal error - not having a mapping for an IP address was an internal error - re-mapping to a new IP address wasn't allowed After this patch: - the address map may contain a mapping without an actual IP address, and the caller must be prepared for it: find() will return a nullopt. This happens when we first add an entry to Raft configuration and only later learn its IP address, e.g. via gossip. - it is allowed to re-map an existing entry to a new address; 2) subscribe to gossip notifications Learning IP addresses from gossip allows us to adjust the address map whenever a node IP address changes. Gossiper is also the only valid source of re-mapping, other sources (RPC) should not re-map, since otherwise a packet from a removed server can remap the id to a wrong address and impact liveness of a Raft cluster. 3) prompt address map state with app state Initialize the raft address map with initial gossip application state, specifically IPs of members of the cluster. With this, we no longer need to store these IPs in Raft configuration (and update them when they change). The obvious drawback of this approach is that a node may join Raft config before it propagates its IP address to the cluster via gossip - so the boot process has to wait until it happens. Gossip also doesn't tell us which IPs are members of Raft configuration, so we subscribe to Group0 configuration changes to mark the members of Raft config "non-expiring" in the address translation map. Thanks to the changes above, Raft configuration no longer stores IP addresses. We still keep the 'server_info' column in the raft_config system table, in case we change our mind or decide to store something else in there.	2022-11-29 19:55:43 +03:00
Kamil Braun	3dbcff435f	service/raft: raft_group0: less repetitive logging calls Some log messages in retry loops in the Raft upgrade procedure included a sentence like "sleeping before retrying..."; but not all of them. With the recently added `sleep_with_exponential_backoff` abstraction we can put this "sleeping..." message in a single place, and it's also easy to say how long we're going to sleep. I also enjoy using this `source_location` thing.	2022-11-29 17:42:43 +01:00
Nadav Har'El	c5121cf273	cql: fix column-name aliases in SELECT JSON The SELECT JSON statement, just like SELECT, allows the user to rename selected columns using an "AS" specification. E.g., "SELECT JSON v AS foo". This specification was not honored: We simply forgot to look at the alias in SELECT JSON's implementation (we did it correctly in regular SELECT). So this patch fixes this bug. We had two tests in cassandra_tests/validation/entities/json_test.py that reproduced this bug. The checks in those tests now pass, but these two tests still continue to fail after this patch because of two other unrelated bugs that were discovered by the same tests. So in this patch I also add a new test just for this specific issue - to serve as a regression test. Fixes #8078 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12123	2022-11-29 18:16:19 +02:00
Avi Kivity	faf11587fa	Update seastar submodule * seastar 4f4cc00660...3a5db04197 (16): > tls: add missing include <map> > Merge 'util/process: use then_unpack to help automatically unpack tuple.' from Jianyong Chen > HTTP: define formatter for status_type to fix build. > fsnotifier: move it into namespace experimental and add docs. > Move fsnotify.hh to the 'include' directory for public use. > Merge 'reactor: define make_pipe() and use make_pipe() in reactor::spawn()' from Kefu Chai > Merge 'Fix: error when compiling http_client_demo' from Amossss > util/process: using `data_sink_impl::put` > Merge 'dns: serialize UDP sends.' from Calle Wilund > build: use correct version when finding liburing > Merge 'Add simple http client' from Pavel Emelyanov > future: use invoke_result instead of nested requirements > Merge 'reactor: use separate calls in reactor and reactor_backend for read/write/sendmsg/recvmsg' from Kefu Chai > util, core: add spawn_process() helper > parallel utils: add note about shard-local parallelism > shared_mutex: return typed exceptional future in with_* error handlers Closes #12131	2022-11-29 18:10:06 +02:00
Kamil Braun	580bdec875	service/raft: raft_group0: fix sleep_with_exponential_backoff It was immediately jumping to _max_retry_period.	2022-11-29 16:27:59 +01:00
Nadav Har'El	6bc3075bbd	test/alternator: increase timeout on TTL tests Some of the tests in test/alternator/test_ttl.py need an expiration scan pass to complete and expire items. In development builds on developer machines, this usually takes less than a second (our scanning period is set to half a second). However, in debug builds on Jenkins each scan often takes up to 100 (!) seconds (this is the record we've seen so far). This is why we set the tests' timeout to 120. But recently we saw another test run failing. I think the problem is that in some cases, we need not one, but two scanning passes to complete before the timeout: It is possible that the test writes an item right after the current scan passed it, so it doesn't get expired, and then we start a second scan at a random position, possibly making that item one of the last items to be considered - so in total we need to wait for two scanning periods, not one, for the item to expire. So this patch increases the timeout from 120 seconds to 240 seconds - more than twice the highest scanning time we ever saw (100 seconds). Note that this timeout is just a timeout, it's not the typical test run time: The test can finish much more quickly, in as little as one second, if items expire quickly on a fast build and machine. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12106	2022-11-29 16:37:54 +03:00
Nadav Har'El	1f8adda4b2	Merge 'treewide: improve compatibility with gcc 12' from Avi Kivity Fix some issues found with gcc 12. Note we can't fully compile with gcc yet, due to [1]. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056 Closes #12121 * github.com:scylladb/scylladb: utils: observer: qualify seastar::noncopyable_function sstables: generation_type: forgo constexpr on hash of generation_type logalloc: disambiguate types and non-type members task_manager: disambiguate types and non-type members direct_failure_detector: don't change meaning of endpoint_liveness schema: abort on illegal per column computation kind database: abort on illegal per partition rate limit operation mutation_fragment: abort on illegal fragment type per_partition_rate_limit_options: abort on illegal operation type schema: drop unused lambda mutation_partition: drop unused lambda cql3: create_index_statement: remove unused lambda transport: prevent signed and unsigned comparison database: don't compare signed and unsigned types raft: don't compare signed and unsigned types compaction: don't compare signed and unsigned compaction counts bytes_ostream: don't take reference to packed variable	2022-11-29 13:57:24 +02:00
Avi Kivity	ea99750de7	test: give tests less-unique identifiers Test identifiers are very unique, but this makes them less useful in Jenkins Test Result Analyzer view. For example, counter_test can be counter_test.432 in one run and counter_test.442 in another. Jenkins considers them different and so we don't see a trend. Limit the id uniqueness within a test case, so that we'll have counter_test.{1, 2, 3} consistently. Those test will be grouped together so we can see pass/fail trends. Closes #11946	2022-11-29 13:14:14 +02:00
Yaniv Kaul	fef8e43163	doc: cluster management: Replace a misplaced period with a bulleted list of items Signed-Off-By: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes #12125	2022-11-29 12:42:24 +02:00
Botond Dénes	e9fec761a2	Merge 'doc: document the procedure for updating the mode after upgrade' from Anna Stuchlik Fix https://github.com/scylladb/scylla-docs/issues/4126 Closes #11122 * github.com:scylladb/scylladb: doc: add info about the time-consuming step due to resharding doc: add the new KB to the toctree doc: doc: add a KB about updating the mode in perftune.yaml after upgrade	2022-11-29 12:41:46 +02:00
Avi Kivity	ea901fdb9d	cql3: expr: fold `null` into untyped_constant/constant Our `null` expression, after the prepare stage, is redundant with a `constant` expression containing the value NULL. Remove it. Its role in the unprepared stage is taken over by untyped_constant, which gains a new type_class enumeration to represent it. Some subtleties: - Usually, handling of null and untyped_constant, or null and constant was the same, so they are just folded into each other - LWT "like" operator now has to discriminate between a literal string and a literal NULL - prepare and test_assignment were folded into the corresponding untyped_constant functions. Some care had to be taken to preserve error messages. Closes #12118	2022-11-29 11:02:18 +02:00
Aleksandra Martyniuk	8bc0af9e34	repair: fix double start of data sync repair task Currently, each data sync repair task is started (and hence run) twice. Thus, when two running operations happen within a time frame long enough, the following situation may occur: - the first run finishes - after some time (ttl) the task is unregistered from the task manager - the second run finishes and attempts to finish the task which does not exist anymore - memory access causes a segfault. The second call to start is deleted. A check is added to the start method to ensure that each task is started at most once. Fixes: #12089 Closes #12090	2022-11-29 00:00:10 +02:00
Avi Kivity	9765b2e3bc	cql3: expr: drop remnants of `bool` component from expression In `ad3d2ee47d`, we replaced `bool` as an expression element (representing a boolean constant) with `constant`. But a comment and a concept continue to mention it. Remove the comment and the concept fragment. Closes #12119	2022-11-28 23:18:26 +02:00
Pavel Emelyanov	ae79669fd2	topology: Be less restrictive about missing endpoints Recent changes in topology restricted the get_dc/get_rack calls. Older code was trying to locate the endpoint in gossiper, then in system keyspace cache and if the endpoint was not found in both -- returned "default" location. New code generates internal error in this case. This approach already helped to spot several BUGs in code that had been eventually fixed, but echoes of that change still pop up. This patch relaxes the "missing endpoint" case by printing a warning in logs and returning back the "default" location like old code did. tests: update_cluster_layout_tests.py::* hintedhandoff_additional_test.py::TestHintedHandoff::test_hintedhandoff_rebalance bootstrap_test.py::TestBootstrap::test_decommissioned_wiped_node_can_join bootstrap_test.py::TestBootstrap::test_failed_bootstap_wiped_node_can_join materialized_views_test.py::TestMaterializedViews::test_decommission_node_during_mv_insert_4_nodes refs: #11900 refs: #12054 fixes: #11870 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #12067	2022-11-28 22:01:09 +02:00
Avi Kivity	3a6eafa8c6	utils: observer: qualify seastar::noncopyable_function gcc checks name resolution eagerly, and can't find noncopyable_function as this header doesn't include "seastarx.hh". Qualify the name so it finds it.	2022-11-28 21:58:30 +02:00
Avi Kivity	5ae98ab3de	sstables: generation_type: forgo constexpr on hash of generation_type std::hash isn't constexpr, so gcc refuses to make hash of generation_type constexpr. It's pointless anyway since we never have a compile-time sstable generation.	2022-11-28 21:58:30 +02:00
Avi Kivity	a2d43bb851	logalloc: disambiguate types and non-type members logalloc::tracker has some members with the same names as types from namespace scope. gcc (rightfully) complains that this changes the meaning of the name. Qualify the types to disambiguate.	2022-11-28 21:58:30 +02:00
Avi Kivity	ed5da87930	task_manager: disambiguate types and non-type members task_manager has some members with the same names as types from namespace scope. gcc (rightfully) complains that this changes the meaning of the name. Qualify the types to disambiguate.	2022-11-28 21:58:30 +02:00
Avi Kivity	27be1670d1	direct_failure_detector: don't change meaning of endpoint_liveness It's used both as a type and as a member. Qualify the type so they have different names.	2022-11-28 21:58:30 +02:00
Avi Kivity	735c46cb63	schema: abort on illegal per column computation kind Without memory corruption it's not possible for the switch to fall through, and the compiler will error if we forget to add a case. The compiler however is obliged to consider that we might store some other value in the variable.	2022-11-28 21:58:30 +02:00
Avi Kivity	f73a51250c	database: abort on illegal per partition rate limit operation Without memory corruption it's not possible for the switch to fall through, and the compiler will error if we forget to add a case. The compiler however is obliged to consider that we might store some other value in the variable.	2022-11-28 21:58:30 +02:00
Avi Kivity	f469885b41	mutation_fragment: abort on illegal fragment type Without memory corruption it's not possible for the switch to fall through, and the compiler will error if we forget to add a case. The compiler however is obliged to consider that we might store some other value in the variable.	2022-11-28 21:58:30 +02:00
Avi Kivity	a3c89cedbd	per_partition_rate_limit_options: abort on illegal operation type Without memory corruption it's not possible for the switch to fall through, and the compiler will error if we forget to add a case. The compiler however is obliged to consider that we might store some other value in the variable.	2022-11-28 21:58:30 +02:00
Avi Kivity	7ec28a81bf	schema: drop unused lambda get_cell is defined but not used.	2022-11-28 21:58:30 +02:00
Avi Kivity	c493a2379a	mutation_partition: drop unused lambda should_purge_row_tombstone is defined but not used.	2022-11-28 21:58:30 +02:00
Avi Kivity	e25bf62871	cql3: create_index_statement: remove unused lambda throw_exception is defined but not used.	2022-11-28 21:58:30 +02:00
Avi Kivity	5dedf85288	transport: prevent signed and unsigned comparison This can lead to undefined behavior. Cast to unsigned, after we've verified the value is indeed positive.	2022-11-28 21:58:30 +02:00
Avi Kivity	77be69b600	database: don't compare signed and unsigned types gcc warns it can lead to undefined behavior, though 2G entries in a list of mutations are unlikely. Use the correct type for iteration.	2022-11-28 21:58:30 +02:00
Avi Kivity	fb6804e7a4	raft: don't compare signed and unsigned types gcc warns it can lead to undefined behavior, though 2G entries in a list of mutations are unlikely. Use the correct type for iteration.	2022-11-28 21:58:30 +02:00
Avi Kivity	f565db75ce	compaction: don't compare signed and unsigned compaction counts gcc warns as this can lead to incorrect results. Cast the threshold to an unsigned type (we know it's positive at this point) to avoid the warning.	2022-11-28 21:41:56 +02:00
Avi Kivity	23b94ac391	bytes_ostream: don't take reference to packed variable bytes_ostream is packed, so its _begin member is packed as well. gcc (correctly) disallows taking a reference to an unaligned variable in an aligned reference, and complains. Make it happy by open-coding the exchange operation.	2022-11-28 21:40:18 +02:00
Nadav Har'El	5480211061	Merge 'test.py: support node replace operation' from Kamil Braun The `add_server` function now takes an optional `ReplaceConfig` struct (implemented using `NamedTuple`), which specifies the ID of the replaced server and whether to reuse the IP address. If we want to reuse the IP address, we don't allocate one using the host registry. This required certain refactors: moving the code responsible for allocation of IPs outside `ScyllaServer`, into `ScyllaCluster`. Add two tests, but they are now skipped: one of them is failing (inability of the new node to join group 0) and both suffer from a hardcoded 60-second sleep in Scylla. Closes #12032 * github.com:scylladb/scylladb: test/topology: simple node replace tests (currently disabled) test/pylib: scylla_cluster: support node replace operation test/pylib: scylla_cluster: move members initialization to constructor test/pylib: scylla_cluster: (re)lease IP addr outside ScyllaServer test/pylib: scylla_cluster: refactor create_server parameters to a struct test.py: stop/uninstall clusters instead of servers when cleaning up test/pylib: artifact_registry: replace `Awaitable` type with `Coroutine` test.py: prepare for adding extra config from test when creating servers test/pylib: manager_client: convert `add_server` to use `put_json` test/pylib: rest_client: allow returning JSON data from `put_json` test/pylib: scylla_cluster: don't import from manager_client	2022-11-28 16:06:39 +02:00
Takuya ASADA	4d8fb569a1	install.sh: drop locale workaround from python3 thunk Since #7408 does not occur on current python3 version (3.11.0), let's drop the workaround. Closes #12097	2022-11-28 13:07:03 +02:00
Anna Stuchlik	452915cef6	doc: set the documentation version 5.1 as default (latest) Closes #12105	2022-11-28 12:02:13 +01:00
Avi Kivity	380da0586c	Update tools/python3 submodule (drop locale workaround) * tools/python3 773070e...548e860 (1): > install.sh: drop locale workaround from python3 thunk	2022-11-28 12:24:13 +02:00
Avi Kivity	0da66371a5	storage_proxy: coroutinize inner continuation of create_hint_sync_point() It is part of a coroutine::parallel_for_each(), which is safe for lambda coroutines. Closes #12057	2022-11-28 11:30:00 +02:00
Avi Kivity	d12d42d1a6	Revert "configure: temporarily disable wasm support for aarch64" This reverts commit `e2fe8559ca`. I ran all the release mode tests on aarch64 with it reverted, and it passes. So it looks like whatever problems we had with it were fixed. Closes #12072	2022-11-28 11:30:00 +02:00
Nadav Har'El	99a72a9676	Merge 'cql3: expr: make it possible to evaluate expr::binary_operator' from Jan Ciołek As a part of CQL rewrite we want to be able to perform filtering by calling `evaluate()` on an expression and checking if it evaluates to `true`. Currently trying to do that for a binary operator would result in an error. Right now checking if a binary operation like `col1 = 123` is true is done using `is_satisfied_by`, which is able to check if a binary operation evaluates to true for a small set of predefined cases. Eventually once the grammar is relaxed we will be able to write expressions like: `(col1 < col2) = (1 > ?)`, which doesn't fit with what `is_satisfied_by` is supposed to do. Additionally expressions like `1 = NULL` should evaluate to `NULL`, not `true` or `false`. `is_satisfied_by` is not able to express that properly. The proper way to go is implementing `evaluate(binary_operator)`, which takes a binary operation and returns what the result of it would be. Implementing `prepare_expression` for `binary_operator` requires us to be able to evaluate it first. In the next PR I will add support for `prepare_expression`.
Closes #12052 * github.com:scylladb/scylladb: cql-pytest: enable two unset value tests that pass now cql-pytest: reduce unset value error message cql3: expr: change unset value error messages to lowercase cql_pytest: ensure that where clauses like token(p) = 0 AND p = 0 are rejected cql3: expr: remove needless braces around switch cases cql3: move evaluation IS_NOT NULL to a separate function expr_test: test evaluating LIKE binary_operator expr_test: test evaluating IS_NOT binary_operator expr_test: test evaluating CONTAINS_KEY binary_operator expr_test: test evaluating CONTAINS binary_operator expr_test: test evaluating IN binary_operator expr_test: test evaluating GTE binary_operator expr_test: test evaluating GT binary_operator expr_test: test evaluating LTE binary_operator expr_test: test evaluating LT binary_operator expr_test: test evaluating NEQ binary_operator expr_test: test evaluating EQ binary_operator cql3: expr properly handle null in is_one_of() cql3: expr properly handle null in like() cql3: expr properly handle null in contains_key() cql3: expr properly handle null in contains() cql3: expr: properly handle null in limits() cql3: expr: remove unneeded overload of limits() cql3: expr: properly handle null in equality operators cql3: expr: remove unneeded overload of equal() cql3: expr: use evaluate(binary_operator) in is_satisfied_by cql3: expr: handle IS NOT NULL when evaluating binary_operator cql3: expr: make it possible to evaluate binary_operator cql3: expr: accept expression as lhs argument to like() cql3: expr: accept expression as lhs in contains_key cql3: expr: accept expression as lhs argument to contains()	2022-11-28 11:30:00 +02:00
Nadav Har'El	1e59c3f9ef	alternator: if TTL scan times out, continue immediately The Alternator TTL expiration scanner scans an entire table using many small pages. If any of those pages time out for some reason (e.g., an overload situation), we currently consider the entire scan to have failed and wait for the next scan period (which by default is 24 hours) when we start the scan from scratch (at a random position). There is a risk that if these timeouts are common enough to occur once or more per scan, the result is that we double or more the effective expiration lag. A better solution, done in this patch, is to retry from the same position if a single page timed out - immediately (or almost immediately, we add a one-second sleep). Fixes #11737 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12092	2022-11-28 11:30:00 +02:00
Avi Kivity	45a57bf22d	Update tools/java submodule (revert scylla-driver) scylla-driver causes dtests to fail randomly (likely due to incorrect handling of the USE statement). Revert it. * tools/java 73422ee114...1c06006447 (2): > Revert "Add Scylla Cloud serverless support" > Revert "Switch cqlsh to use scylla-driver"	2022-11-28 11:29:08 +02:00
Benny Halevy	8f584a9a80	storage_service: handle_state_normal: always update_topology before update_normal_tokens update_normal_tokens checks that the endpoint is in topology. Currently we call update_topology on this path only if it's not a normal_token_owner, but there are paths when the endpoint could be a normal token owner but still be pending in topology so always update it, just in case. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-28 11:25:36 +02:00
Benny Halevy	6b13fd108a	storage_service: handle_state_normal: delete outdated comment regarding update pending ranges race asias@scylladb.com said: > This comments was moved up to the wrong place when tmptr->update_topology was added. > There is no race now since we use the copy-update-replace method to update token_metadada. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-28 11:25:36 +02:00
Kefu Chai	af011aaba1	utils/variant_element: simplify is_variant_element with right fold for better readability than the recursive approach. Signed-off-by: Kefu Chai <tchaikov@gmail.com> Closes #12091	2022-11-27 16:34:34 +02:00
Avi Kivity	78222ea171	Update tools/java submodule (cqlsh system_distributed_everywhere is a system keyspace) * tools/java 874e2d529b...73422ee114 (1): > Mark "system_distributed_everywhere" as system ks	2022-11-27 15:37:57 +02:00
Aleksandra Martyniuk	9a3d114349	tasks: move methods from task_manager to source file Methods from tasks::task_manager and nested classes are moved to source file. Closes #12064	2022-11-27 15:09:28 +02:00
Piotr Dulikowski	22fbf2567c	utils/abi: don't use the deprecated std::unexpected_handler Recently, clang started complaining about std::unexpected_handler being deprecated: ``` In file included from utils/exceptions.cc:18: ./utils/abi/eh_ia64.hh:26:10: warning: 'unexpected_handler' is deprecated [-Wdeprecated-declarations] std::unexpected_handler unexpectedHandler; ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/exception:84:18: note: 'unexpected_handler' has been explicitly marked deprecated here typedef void (*_GLIBCXX11_DEPRECATED unexpected_handler) (); ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/x86_64-redhat-linux/bits/c++config.h:2343:32: note: expanded from macro '_GLIBCXX11_DEPRECATED' ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/x86_64-redhat-linux/bits/c++config.h:2334:46: note: expanded from macro '_GLIBCXX_DEPRECATED' ^ 1 warning generated. ``` According to cppreference.com, it was deprecated in C++11 and removed in C++17 (!). This commit gets rid of the warning by inlining the std::unexpected_handler typedef, which is defined as a pointer to a function with 0 arguments, returning void. Fixes: #12022 Closes #12074	2022-11-27 12:25:20 +02:00
Alejo Sanchez	5ff4b8b5f8	pytest: catch rare exception for random tables test On rare occasions a SELECT on a DROPped table throws cassandra.ReadFailure instead of cassandra.InvalidRequest. This could not be reproduced locally. Catch both exceptions as the table is not present anyway and it's correctly marked as a failure. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #12027	2022-11-27 10:26:55 +02:00
Michał Chojnowski	a75e4e1b23	db: config: disable global index page caching by default Global index page caching, as introduced in 4.6 (`078a6e422b` and `9f957f1cf9`) has proven to be misdesigned, because it poses a risk of catastrophic performance regressions in common workloads by flooding the cache with useless index entries. Because of that risk, it should be disabled by default. Refs #11202 Fixes #11889 Closes #11890	2022-11-26 14:27:26 +02:00
Aleksandra Martyniuk	c2ea3f49e6	repair: rename methods of repair_module Methods of repair_module connected with repair_module::_repairs are renamed to match repair_module::_repairs type.	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	13dbd75ba8	repair: change type of repair_module::_repairs As a preparation to replacing repair_info with shard_repair_task_impl, type of _repairs in repair module is changed from std::unordered_map<int, lw_shared_ptr<repair_info>> to std::unordered_map<int, tasks::task_id>.	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	55c01a1beb	repair: keep a reference to shard_repair_task_impl in row_level_repair As a part of replacing repair_info with shard_repair_task_impl, instead of a reference to repair_info, row_level_repair keeps a reference to shard_repair_task_impl.	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	9b664570f0	repair: move repair_range method to shard_repair_task_impl	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	3ac5ba7b28	repair: make do_repair_ranges a method of shard_repair_task_impl Function do_repair_ranges is directly connected to shard repair tasks. Turning it into shard_repair_task_impl method enables an access to tasks' members with no additional intermediate layers.	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	a09dfcdacd	repair: copy repair_info methods to shard_repair_task_impl Methods of repair_info are copied to shard_repair_task_impl. They are not used yet, it's a preparation for replacing repair_info with shard_repair_task_impl.	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	a4b1bdb56c	repair: coroutinize shard task creation	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	996c0f3476	repair: define run for shard_repair_task_impl Operations performed as a part of shard repair are moved to shard_repair_task_impl run method.	2022-11-25 16:41:02 +01:00
Aleksandra Martyniuk	ba9770ea02	repair: add shard_repair_task_impl Create a task spanning over a repair performed on a given shard.	2022-11-25 16:40:49 +01:00
Anna Stuchlik	d5f676106e	doc: remove the LWT page from the index of Enterprise features Closes #12076	2022-11-24 21:59:05 +02:00
Aleksandra Martyniuk	dcc17037c7	repair: fix bad cast in tasks::task_id parsing In system_keyspace::get_repair_history value of repair_uuid is got from row as tasks::task_id. tasks::task_id is represented by an abstract_type specific for utils::UUID. Thus, since their typeids differ, bad_cast is thrown. repair_uuid is got from row as utils::UUID and then cast. Since no longer needed, data_type_for<tasks::task_id> is deleted. Fixes: #11966 Closes #12062	2022-11-24 19:37:44 +02:00
Jan Ciolek	77c7d8b8f6	cql-pytest: enable two unset value tests that pass now While implementing evaluate(binary_operator) missing checks for unset value were added for comparisons in filtering code. Because of that some tests for unset value started passing. There are still other tests for unset value that are failing because Scylla doesn't have all the checks that it should. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-24 17:07:17 +01:00
Jan Ciolek	5bc0bc6531	cql-pytest: reduce unset value error message When unset value appears in an invalid place both Cassandra and Scylla throw an error. The tests were written with Cassandra and thus the expected error messages were exactly the same as produced by Cassandra. Scylla produces different error messages, but both databases return messages with the text 'unset value'. Reduce the expected message text from the whole message to something that contains 'unset value'. It would be hard to mimic Cassandra's error messages in Scylla. There is no point in spending time on that. Instead it's better to modify the tests so that they are able to work with both Cassandra and Scylla. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-24 17:04:07 +01:00
Jan Ciolek	08f40a116d	cql3: expr: change unset value error messages to lowercase The messages used to contain UNSET_VALUE in capital letters, but the tests expect messages with 'unset value'. Change the message so that it can match the expected error text in tests. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-24 17:02:44 +01:00
Kamil Braun	fda6403b29	test/topology: simple node replace tests (currently disabled) Add two node replace tests using the freshly added infrastructure. One test replaces a node while using a different IP. It is disabled because the replace operation has an unconditional 60-seconds sleep (it doesn't depend on the ring_delay setting for some reason). The sleep needs to be fixed before we can enable this test. The other test replaces while reusing the replaced node's IP. Additionally to the sleep, the test fails because the node cannot join group 0; it's stuck in an infinite loop of trying to join: ``` INFO 2022-11-18 15:56:19,933 [shard 0] raft_group0 - server 8de951fd-a528-4a82-ac54-592ea269537f found no local group 0. Discovering... INFO 2022-11-18 15:56:19,933 [shard 0] raft_group0 - server 8de951fd-a528-4a82-ac54-592ea269537f found group 0 with group id 25d2b050-6751-11ed-b534-c3c40c275dd3, leader b7047f7e-03e6-4797-a723-24054201f91d INFO 2022-11-18 15:56:19,934 [shard 0] raft_group0 - Server 8de951fd-a528-4a82-ac54-592ea269537f is starting group 0 with id 25d2b050-6751-11ed-b534-c3c40c275dd3 WARN 2022-11-18 15:56:20,935 [shard 0] raft_group0 - failed to modify config at peer b7047f7e-03e6-4797-a723-24054201f91d: seastar::rpc::timeout_error (rpc call timed out). Retrying. INFO 2022-11-18 15:56:21,937 [shard 0] raft_group0 - server 8de951fd-a528-4a82-ac54-592ea269537f found group 0 with group id 25d2b050-6751-11ed-b534-c3c40c275dd3, leader ee0175ea-6159-4d4c-9d7c-95c934f8a408 WARN 2022-11-18 15:56:22,937 [shard 0] raft_group0 - failed to modify config at peer ee0175ea-6159-4d4c-9d7c-95c934f8a408: seastar::rpc::timeout_error (rpc call timed out). Retrying. 
INFO 2022-11-18 15:56:23,938 [shard 0] raft_group0 - server 8de951fd-a528-4a82-ac54-592ea269537f found group 0 with group id 25d2b050-6751-11ed-b534-c3c40c275dd3, leader ee0175ea-6159-4d4c-9d7c-95c934f8a408 WARN 2022-11-18 15:56:24,939 [shard 0] raft_group0 - failed to modify config at peer ee0175ea-6159-4d4c-9d7c-95c934f8a408: seastar::rpc::timeout_error (rpc call timed out). Retrying. ``` and so on.	2022-11-24 16:26:23 +01:00
Kamil Braun	2f60550ff3	test/pylib: scylla_cluster: support node replace operation The `add_server` function now takes an optional `ReplaceConfig` struct (implemented using `NamedTuple`), which specifies the ID of the replaced server and whether to reuse the IP address. If we want to reuse the IP address, we don't allocate one using the host registry. Since now multiple servers can have the same IP, introduce a `leased_ips` set to `ScyllaCluster` which is used when `uninstall`ing the cluster - to make sure we don't `release_host` the same host twice.	2022-11-24 16:26:23 +01:00
Kamil Braun	d80247f912	test/pylib: scylla_cluster: move members initialization to constructor Previously some members had to be initialized in `install` because that's when we first knew the IP address. Now we know the IP address during construction, which allows us to make the code a bit shorter and simpler, and establish invariants: some members (such as `self.config`) are now valid for the entire lifetime of the server object. `install()` is reduced to performing only side effects (creating directories, writing config files), all calculation is done inside the constructor.	2022-11-24 16:26:23 +01:00
Kamil Braun	3934eefd20	test/pylib: scylla_cluster: (re)lease IP addr outside ScyllaServer `ScyllaServer`s were constructed without IP addresses. They leased an IP address from `HostRegistry` and released them in `uninstall`. This responsibility was now moved into `ScyllaCluster`, which leases an IP address for a server before constructing it, and passes it to the constructor. It releases the addresses of its servers when uninstalling itself. This will allow the cluster to reuse the IP address of an existing server in that cluster when adding a new server which wants to replace the existing one. Instead of leasing a new address, it will pass the existing IP address to the new server's constructor. The refactor is also nice in that it establishes an invariant for `ScyllaServer`, simplifying reasoning about the class: now it has an `ip_addr` field at all times. `host_registry` was moved from `ScyllaServer` to `ScyllaCluster`.	2022-11-24 16:26:23 +01:00
Kamil Braun	9d5e1191da	test/pylib: scylla_cluster: refactor create_server parameters to a struct `ScyllaCluster` constructor takes a function `create_server` which itself takes 3 parameters now. Soon it will take a 4th. The list of parameters is repeated at the constructor definition and the call site of the constructor, with many parameters it begins being tiresome. Refactor the list of parameters to a `NamedTuple`.	2022-11-24 16:26:23 +01:00
Kamil Braun	d582666293	test.py: stop/uninstall clusters instead of servers when cleaning up `self.artifacts` was calling `ScyllaServer.stop` and `ScyllaServer.uninstall`. Now it calls `ScyllaCluster.stop` and `ScyllaCluster.uninstall`, which underneath stops/uninstalls servers in this cluster. We must be a bit more careful now in case installing/starting a server inside a cluster fails: there are no server cleanup artifacts, and a server is added to cluster's `running` map only after `install_and_start` finishes (until that happens, `ScyllaCluster.stop/uninstall` won't catch this server). So handle failures explicitly in `install_and_start`. This commit does not logically change how the tests are running - every started server belongs to some cluster, so it will be cleaned up - but it's an important refactor. It will allow us to move IP address (de)allocation code outside `ScyllaServer`, into `ScyllaCluster`, which in turn will allow us to implement node replace operation for the case where we want to reuse the replaced node's IP. Also, `ScyllaCluster.uninstall` was unused before this change, now it's used.	2022-11-24 16:26:17 +01:00
Avi Kivity	29a4b662f8	Merge 'doc: document the Alternator TTL feature as GA' from Anna Stuchlik Currently, TTL is listed as one of the experimental features: https://docs.scylladb.com/stable/alternator/compatibility.html#experimental-api-features This PR moves the feature description from the Experimental Features section to a separate section. I've also added some links and improved the formatting. @tzach I've relied on your release notes for RC1. Refs: https://github.com/scylladb/scylladb/issues/5060 Closes #11997 * github.com:scylladb/scylladb: Update docs/alternator/compatibility.md doc: update the link to Enabling Experimental Features doc: remove the note referring to the previous ScyllaDB versions and add the relevant limitation to the paragraph doc: update the links to the Enabling Experimental Features section doc: add the link to the Enabling Experimental Features section doc: move the TTL Alternator feature from the Experimental Features section to the production-ready section	2022-11-24 17:22:05 +02:00
Nadav Har'El	2dedb5ea75	alternator: make Alternator TTL feature no longer "experimental" Until now, the Alternator TTL feature was considered "experimental", and had to be manually enabled on all nodes of the cluster to be usable. This patch removes this requirement and in essence GAs this feature. Even after this patch, Alternator TTL is still a "cluster feature", i.e., for this feature to be usable every node in the cluster needs to support it. If any of the nodes is old and does not yet support this feature, the UpdateTimeToLive request will not be accepted, so although the expiration-scanning threads may exist on the newer nodes, they will not do anything because none of the tables can be marked as having expiration enabled. This patch does not contain documentation fixes - the documentation still suggests that the Alternator TTL feature is experimental. The documentation patch will come separately. Fixes #12037 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12049	2022-11-24 17:21:39 +02:00
Tzach Livyatan	e96d31d654	docs: Add Authentication and Authorization as a prerequisite for Auditing. Closes #12058	2022-11-24 17:21:23 +02:00
Kamil Braun	df731a5b0c	test/pylib: artifact_registry: replace `Awaitable` type with `Coroutine` The `cleanup_before_exit` method of `ArtifactRegistry` calls `close()` on artifacts. mypy complains that `Awaitable` has no such method. In fact, the `artifact` objects that we pass to `ArtifactRegistry` (obtained by calling `async def` functions) do have a `close()` method, and they are a particular case of `Awaitable`s, but in general not all `Awaitable`s have `close()`. Replace `Awaitable` with one of its subtypes: `Coroutine`. `Coroutine`s have a `close()` method, and `async def` functions return objects of this type. mypy no longer complains.	2022-11-24 16:17:05 +01:00
Nadav Har'El	c6bb64ab0e	Merge 'Fix LWT insert crash if clustering key is null' from Gusev Petr [PR](https://github.com/scylladb/scylladb/pull/9314) fixed a similar issue with regular insert statements but missed the LWT code path. It's expected behaviour of `modification_statement::create_clustering_ranges` to return an empty range in this case, since `possible_lhs_values` it uses explicitly returns `empty_value_set` if it evaluates `rhs` to null, and it has a comment about it (All NULL comparisons fail; no column values match.) On the other hand, all components of the primary key are required to be set, this is checked at the prepare phase, in `modification_statement::process_where_clause`. So the only problem was `modification_statement::execute_with_condition` was not expecting an empty `clustering_range` in case of a null clustering key. Also this patch contains a fix for the problem with wrong column name in Scylla error messages. If `INSERT` or `DELETE` statement is missing a non-last element of the primary key, the error message generated contains an invalid column name. The problem occurs if the query contains a column with the list type, otherwise `statement_restrictions::process_clustering_columns_restrictions` checks that all the components of the key are specified. Closes #12047 * github.com:scylladb/scylladb: cql: refactor, inline modification_statement::validate_primary_key_restrictions cql: DELETE with null value for IN parameter should be forbidden cql: add column name to the error message in case of null primary key component cql: batch statement, inserting a row with a null key column should be forbidden cql: wrong column name in error messages modification_statement: fix LWT insert crash if clustering key is null	2022-11-24 16:15:27 +02:00
Nadav Har'El	6e9f739f19	Merge 'doc: add the links to the per-partition rate limit extension ' from Anna Stuchlik Release 5.1. introduced a new CQL extension that applies to the CREATE TABLE and ALTER TABLE statements. The ScyllaDB-specific extensions are described on a separate page, so the CREATE TABLE and ALTER TABLE should include links to that page and section. Note: CQL extensions are described with Markdown, while the Data Definition page is RST. Currently, there's no way to link from an RST page to an MD subsection (using a section heading or anchor), so a URL is used as a temporary solution. Related: https://github.com/scylladb/scylladb/pull/9810 Closes #12070 * github.com:scylladb/scylladb: doc: move the info about per-partition rate limit for the ALTER TABLE statement from the paragraph to the list doc: add the links to the per-partition rate limit extension to the CREATE TABLE and ALTER TABLE sections	2022-11-24 16:03:30 +02:00
Anna Stuchlik	8049670772	doc: move the info about per-partition rate limit for the ALTER TABLE statement from the paragraph to the list	2022-11-24 14:42:11 +01:00
Anna Stuchlik	57a58b17a8	doc: enable publishing the documentation for version 5.1 Closes #12059	2022-11-24 13:55:25 +02:00
Kamil Braun	2f99f27c14	docs/dev: building.md: mention node-exporter packages	2022-11-24 12:49:34 +01:00
Kamil Braun	b12f331fe6	docs/dev: building.md: replace `dev` with `<mode>` in list of debs	2022-11-24 12:47:09 +01:00
Benny Halevy	243dc2efce	hints: host_filter: check topology::has_endpoint if enabled_selectively Don't call get_datacenter(ep) without checking first has_endpoint(ep) since the former may abort on internal error if the endpoint is not listed in topology. Refs #11870 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12054	2022-11-24 14:33:06 +03:00
Anna Stuchlik	f158d31e24	doc: add the links to the per-partition rate limit extension to the CREATE TABLE and ALTER TABLE sections	2022-11-24 11:26:33 +01:00
Petr Gusev	b95305ae2b	cql: refactor, inline modification_statement::validate_primary_key_restrictions The function didn't add much value, just forwarded to _restrictions. Removed it and called _restrictions->validate_primary_key directly.	2022-11-23 21:56:12 +04:00
Petr Gusev	f9936bb0cb	cql: DELETE with null value for IN parameter should be forbidden If a DELETE statement contains an IN operator and the parameter value for it is NULL, this should also trigger an error. This is in line with how Cassandra behaves in this case.	2022-11-23 21:39:23 +04:00
Petr Gusev	c123f94110	cql: add column name to the error message in case of null primary key component It's more user-friendly and the error message corresponds to what Cassandra provides in this case.	2022-11-23 21:39:23 +04:00
Petr Gusev	7730c4718e	cql: batch statement, inserting a row with a null key column should be forbidden Regular INSERT statements with null values for primary key components are rejected by Scylla since #9286 and #9314. Batch statements missed a similar check, this patch fixes it. Fixes: #12060	2022-11-23 21:39:23 +04:00
Petr Gusev	89a5397d7c	cql: wrong column name in error messages If INSERT or DELETE statement is missing a non-last element of the primary key, the error message generated contains an invalid column name. The problem occurs if the query contains a column with the list type, otherwise statement_restrictions::process_clustering_columns_restrictions checks that all the components of the key are specified. Fixes: #12046	2022-11-23 21:39:16 +04:00
Benny Halevy	996eac9569	topology: add get_datacenters Returns an unordered set of datacenter names to be used by network_topology_replication_strategy and for ks_prop_defs. The set is kept in sync with _dc_endpoints. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12023	2022-11-23 18:39:36 +02:00
Takuya ASADA	9acdd3af23	dist: drop deprecated AMI parameters on setup scripts Since we moved all IaaS code to scylla-machine-image, we no longer need AMI variable on sysconfig file or --ami parameter on setup scripts, and also never used /etc/scylla/ami_disabled. So let's drop all of them from Scylla core. Related with scylladb/scylla-machine-image#61 Closes #12043	2022-11-23 17:56:13 +02:00
Avi Kivity	7c66fdcad1	Merge 'Simplify sstable_directory configuration' from Pavel Emelyanov When started the sstable_directory is constructed with a bunch of booleans that control the way its process_sstable_dir method works. It's shorter and simpler to pass these booleans into method directly, all the more so there's another flag that's already passed like this. Closes #12005 * github.com:scylladb/scylladb: sstable_directory: Move all RAII booleans onto flags sstable_directory: Convert sort-sstables argument to flags struct sstable_directory: Drop default filter	2022-11-23 16:16:04 +02:00
Avi Kivity	70bfa708f5	storage_proxy: coroutinize change_hints_host_filter() Trivial straight-line code, no performance implications. Closes #12056	2022-11-23 15:34:24 +02:00
Jan Ciolek	84501851eb	cql_pytest: ensure that where clauses like token(p) = 0 AND p = 0 are rejected Scylla doesn't support combining restrictions on token with other restrictions on partition key columns. Some pieces of code depend on the assumption that such combinations are allowed. In case they were allowed in the future these functions would silently start returning wrong results, and we would return invalid rows. Add a test that will start failing once this restriction is removed. It will warn the developer to change the functions that used to depend on the assumption. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 13:09:22 +01:00
Botond Dénes	602dfdaf98	Merge 'Task manager top level repair tasks' from Aleksandra Martyniuk The PR introduces top level repair tasks representing repair and node operations performed with repair. The actions performed as a part of these operations are moved to corresponding tasks' run methods. Also a small change to repair module is added. Closes #11869 * github.com:scylladb/scylladb: repair: define run for data_sync_repair_task_impl repair: add data_sync_repair_task_impl tasks: repair: add noexcept to task impl constructor repair: define run for user_requested_repair_task_impl repair: add user_requested_repair_task_impl repair: allow direct access to max_repair_memory_per_range	2022-11-23 14:02:30 +02:00
Jan Ciolek	338af848a8	cql3: expr: remove needless braces around switch cases Originally put braces around the cases because there were local variables that I didn't want to be shadowed. Now there are no variables so the braces can be removed without any problems. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:30 +01:00
Jan Ciolek	e8a46d34c2	cql3: move evaluation IS_NOT NULL to a separate function When evaluating a binary operation with operations like EQUAL, LESS_THAN, IN the logic of the operation is put in a separate function to keep things clean. IS_NOT NULL is the only exception, it has its evaluate implementation right in the evaluate(binary_operator) function. It would be cleaner to have it in a separate dedicated function, so it's moved to one. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:30 +01:00
Jan Ciolek	b6cf6e6777	expr_test: test evaluating LIKE binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:29 +01:00
Jan Ciolek	6774272fd6	expr_test: test evaluating IS_NOT binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:29 +01:00
Jan Ciolek	e6c78bb6c2	expr_test: test evaluating CONTAINS_KEY binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:29 +01:00
Jan Ciolek	4f250609ab	expr_test: test evaluating CONTAINS binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:29 +01:00
Jan Ciolek	3ca04cfcc2	expr_test: test evaluating IN binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:28 +01:00
Jan Ciolek	41f452b73f	expr_test: test evaluating GTE binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:28 +01:00
Jan Ciolek	1fe9a9ce2a	expr_test: test evaluating GT binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:28 +01:00
Jan Ciolek	ef2a77a3e0	expr_test: test evaluating LTE binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:28 +01:00
Jan Ciolek	3cbb2d44e8	expr_test: test evaluating LT binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:27 +01:00
Jan Ciolek	9feee70710	expr_test: test evaluating NEQ binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:27 +01:00
Jan Ciolek	e77dba0b0b	expr_test: test evaluating EQ binary_operator Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:27 +01:00
Jan Ciolek	63a89776a1	cql3: expr properly handle null in is_one_of() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:27 +01:00
Jan Ciolek	214dab9c77	cql3: expr properly handle null in like() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:26 +01:00
Jan Ciolek	2ce9c95a9d	cql3: expr properly handle null in contains_key() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:26 +01:00
Jan Ciolek	336ad61aa3	cql3: expr properly handle null in contains() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:26 +01:00
Jan Ciolek	e2223be1ec	cql3: expr: properly handle null in limits() Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:26 +01:00
Jan Ciolek	d1abf2e168	cql3: expr: remove unneeded overload of limits() There is a more general version of limits() which takes expressions as both the lhs and rhs arguments. There is no need for a specialized overload. This specialized overload takes a tuple_constructor as lhs, but we call evaluate() on both sides of a binary operator before checking equality, so this won't be useful at all. Having multiple functions increases the risk that one of them has a bug, while giving dubious benefit. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:25 +01:00
Jan Ciolek	0609a425e6	cql3: expr: properly handle null in equality operators Expressions like: 123 = NULL NULL = 123 NULL = NULL NULL != 123 should be tolerated, but evaluate to NULL. The current code assumes that a binary operator can only evaluate to a boolean - true or false. Now a binary operator can also evaluate to NULL. This should happen in cases when one of the operator's sides is NULL. A special class is introduced to represent a value that can be one of three things: true, false or null. It's better than using std::optional<bool>, because optional has implicit conversions to bool that could cause confusion and bugs. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-23 12:44:22 +01:00
Aleksandra Martyniuk	a3016e652f	repair: define run for data_sync_repair_task_impl Operations performed as a part of data sync repair are moved to data_sync_repair_task_impl run method.	2022-11-23 10:44:19 +01:00
Aleksandra Martyniuk	42239c8fed	repair: add data_sync_repair_task_impl Create a task spanning over whole node operation. Tasks of that type are stored on shard 0.	2022-11-23 10:19:53 +01:00
Aleksandra Martyniuk	9e108a2490	tasks: repair: add noexcept to task impl constructor Add noexcept to constructor of tasks::task_manager::task::impl and inheriting classes.	2022-11-23 10:19:53 +01:00
Aleksandra Martyniuk	4a4e9c12df	repair: define run for user_requested_repair_task_impl Operations performed as a part of user requested repair are moved to user_requested_repair_task_impl run method.	2022-11-23 10:19:51 +01:00
Aleksandra Martyniuk	3800b771fc	repair: add user_requested_repair_task_impl Create a task spanning over whole user requested repair. Tasks of that type are stored on shard 0.	2022-11-23 10:11:09 +01:00
Aleksandra Martyniuk	0256ede089	repair: allow direct access to max_repair_memory_per_range Access specifier of constexpr value max_repair_memory_per_range in repair_module is changed to public and its getter is deleted.	2022-11-23 10:11:09 +01:00
Anna Stuchlik	16e2b9acd4	Update docs/alternator/compatibility.md Co-authored-by: Daniel Lohse <info@asapdesign.de>	2022-11-23 09:51:04 +01:00
Avi Kivity	d7310fd083	gdb: messaging: print tls servers too Many systems have most traffic on tls servers, so print them. Closes #12053	2022-11-23 07:59:02 +02:00
Avi Kivity	aec9faddb1	Merge 'storage_proxy: use erm topology' from Benny Halevy When processing a query, we keep a pointer to an effective_replication_map. In a couple places we used the latest topology instead of the one held by the effective_replication_map that the query uses and that might lead to inconsistencies if, for example, a node is removed from topology after decommission that happens concurrently to the query. This change gets the topology& from the e_r_m in those cases. Fixes #12050 Closes #12051 * github.com:scylladb/scylladb: storage_proxy: pass topology& to sort_endpoints_by_proximity storage_proxy: pass topology& to is_worth_merging_for_range_query	2022-11-22 20:04:41 +02:00
Botond Dénes	49ec7caf27	mutation_fragment_stream_validator: avoid allocation when stream is correct Currently the ctor of said class always allocates as it copies the provided name string and it creates a new name via format(). We want to avoid this, now that the validator is used on the read path. So defer creating the formatted name to when we actually want to log something, which is either when log level is debug or when an error is found. We don't care about performance in either case, but we do care about it on the happy path. Further to the above, provide a constructor for string literal names and when this is used, don't copy the name string, just save a view to it. Refs: #11174 Closes #12042	2022-11-22 19:19:18 +02:00
Nadav Har'El	ce7c1a6c52	Merge 'alternator: fix wrong 'where' condition for GSI range key' from Marcin Maliszkiewicz Contains fixes requested in the issue (and some tiny extras), together with analysis why they don't affect the users (see commit messages). Fixes [ #11800](https://github.com/scylladb/scylladb/issues/11800) Closes #11926 * github.com:scylladb/scylladb: alternator: add maybe_quote to secondary indexes 'where' condition test/alternator: correct xfail reason for test_gsi_backfill_empty_string test/alternator: correct indentation in test_lsi_describe alternator: fix wrong 'where' condition for GSI range key	2022-11-22 17:46:52 +02:00
Pavel Emelyanov	22133a3949	sstable_directory: Move all RAII booleans onto flags There's a bunch of booleans that control the behavior of sstable directory scanning. Currently they are described as verbose bool_class<>-es and are put into sstable_directory construction time. However, these are not used outside of .process_sstable_dir() method and moving them onto recently added flags struct makes the code much shorter (29 insertions(+), 121 deletions(-)) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-22 18:30:00 +03:00
Pavel Emelyanov	7ca5e143d7	sstable_directory: Convert sort-sstables argument to flags struct The sstable_directory::process_sstable_dir() accepts a boolean to control its behavior when collecting sstables. Turn this boolean into a structure of flags. The intention is to extend this flags set in the future (next patch). This boolean is true all the time, but one place sets it to true in a "verbose" manner, like this: bool sort_sstables_according_to_owner = false; process_sstable_dir(directory, sort_sstables_according_to_owner).get(); the local variable is not used anymore. Using designated initializers solves the verbosity in a nicer manner. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-22 18:19:23 +03:00
Pavel Emelyanov	7c7017d726	sstable_directory: Drop default filter It's used as default argument for .reshape() method, but callers specify it explicitly. At the same time the filter is simple enough and is only used in one place so that the caller can just use explicit lambda. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-22 18:19:23 +03:00
Jan Ciolek	6be142e3a0	cql3: expr: remove unneeded overload of equal() There is a more general version of equal() which takes expressions as both the lhs and rhs arguments. There is no need for a specialized overload. This specialized overload takes a tuple_constructor as lhs, but we call evaluate() on both sides of a binary operator before checking equality, so this won't be useful at all. Having multiple functions increases the risk that one of them has a bug, while giving dubious benefit. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-22 14:28:10 +01:00
Benny Halevy	731a74c71f	storage_proxy: pass topology& to sort_endpoints_by_proximity It mustn't use the latest topology that may differ from the one used by the query as it may be missing nodes (e.g. after concurrent decommission). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-22 15:02:40 +02:00
Benny Halevy	ab3fc1e069	storage_proxy: pass topology& to is_worth_merging_for_range_query It mustn't use the latest topology that may differ from the one used by the query as it may be missing nodes (e.g. after concurrent decommission). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-22 15:01:58 +02:00
Petr Gusev	0d443dfd16	modification_statement: fix LWT insert crash if clustering key is null PR #9314 fixed a similar issue with regular insert statements but missed the LWT code path. It's expected behaviour of modification_statement::create_clustering_ranges to return an empty range in this case, since possible_lhs_values it uses explicitly returns empty_value_set if it evaluates rhs to null, and it has a comment about it (All NULL comparisons fail; no column values match.) On the other hand, all components of the primary key are required to be set, this is checked at the prepare phase, in modification_statement::process_where_clause. So the only problem was modification_statement::execute_with_condition was not expecting an empty clustering_range in case of a null clustering key. Fixes: #11954	2022-11-22 16:45:16 +04:00
Marcin Maliszkiewicz	2bf2ffd3ed	alternator: add maybe_quote to secondary indexes 'where' condition This bug doesn't affect anything, the reason is described in the commit: 'alternator: fix wrong 'where' condition for GSI range key'. But it's theoretically correct to escape those key names and the difference can be observed via CQL's describe table. Before the patch 'where' condition is missing one double quote in variable name making it mismatched with corresponding column name.	2022-11-22 11:08:23 +01:00
Marcin Maliszkiewicz	4389baf0d9	test/alternator: correct xfail reason for test_gsi_backfill_empty_string Previously cited issue is closed already.	2022-11-22 11:08:23 +01:00
Marcin Maliszkiewicz	59eca20af1	test/alternator: correct indentation in test_lsi_describe Otherwise I think assert is not executed in a loop. And I am not sure why lsi variable can be bound to anything. As I tested it was pointing to the last element in lsis...	2022-11-22 11:08:23 +01:00
Marcin Maliszkiewicz	d6d20134de	alternator: fix wrong 'where' condition for GSI range key This bug doesn't manifest in a visible way to the user. Adding the index to an existing table via GlobalSecondaryIndexUpdates is not supported so we don't need to consider what could happen for empty values of index range key. After the index is added the only interesting value user can set is omitting the value (null or empty are not allowed, see test_gsi_empty_value and test_gsi_null_value). In practice no matter of 'where' condition the underlying materialized view code is skipping row updates with missing keys as per this comment: 'If one of the key columns is missing, set has_new_row = false meaning that after the update there will be no view row'. That's why the added test passes both before and after the patch. But it's still useful to include it to exercise those code paths. Fixes #11800	2022-11-22 11:08:23 +01:00
Nadav Har'El	ff617c6950	cql-pytest: translate a few small Cassandra tests This patch includes a translation of several additional small test files from Cassandra's CQL unit test directory cql3/validation/operations. All tests included here pass on both Cassandra and Scylla, so they did not discover any new Scylla bugs, but can be useful in the future as regression tests. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12045	2022-11-22 07:54:13 +02:00
Botond Dénes	f3eecb47f6	Merge 'Optimize cleanup compaction get ranges for invalidation' from Benny Halevy Take advantage of the facts that both the owned ranges and the initial non_owned_ranges (derived from the set of sstables) are deoverlapped and sorted by start token to turn the calculation of the final non_owned_ranges from quadratic to linear. Fixes #11922 Closes #11903 * github.com:scylladb/scylladb: dht: optimize subtract_ranges compaction: refactor dht::subtract_ranges out of get_ranges_for_invalidation compaction_manager: needs_cleanup: get first/last tokens from sstable decorated keys	2022-11-22 06:45:01 +02:00
Jan Ciolek	a1407ef576	cql3: expr: use evaluate(binary_operator) in is_satisfied_by is_satisfied_by has to check if a binary_operator is satisfied by some values. It used to be impossible to evaluate a binary_operator, so is_satisfied_by had code to check if it's satisfied for a limited number of cases occurring when filtering queries. Now evaluate(binary_operator) has been implemented and is_satisfied_by can use it to check if a binary_operator evaluates to true. This is cleaner and reduces code duplication. Additionally cql tests will test the new evaluate() implementation. There is one special case with token(). When is_satisfied_by sees a restriction on token it assumes that it's satisfied because it's sure that these token restrictions were used to generate partition ranges. I had to leave this special case in because it's impossible to evaluate(token). Once this is implemented I will remove the special case because it's risky and prone to cause bugs. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-21 20:40:06 +01:00
Jan Ciolek	9c4889ecc3	cql3: expr: handle IS NOT NULL when evaluating binary_operator The code to evaluate binary operators was copied from is_satisfied_by. is_satisfied_by wasn't able to evaluate IS NOT NULL restrictions, so when such restriction is encountered it throws an exception. Implement proper handling for IS NOT NULL binary operators. The switch ensures that all variants of oper_t are handled, otherwise there would be a compilation error. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-21 20:40:00 +01:00
Avi Kivity	bf2e54ff85	Merge 'Move deletion log code to sstable_directory.cc' from Pavel Emelyanov In order to support different storage kinds for sstable files (e.g. -- s3) it's needed to localize all the places that manipulate files on a POSIX filesystem so that custom storage could implement them in its own way. This set moves the deletion log manipulations to the sstable_directory.cc, which already "knows" that it works over a directory. Closes #12020 * github.com:scylladb/scylladb: sstables: Delete log file in replay_pending_delete_log() sstables: Move deletion log manipulations to sstable_directory.cc sstables: Open-code delete_sstables() call sstables: Use fs::path in replay_pending_delete_log() sstables: Indentation fix after previous patch sstables: Coroutinize replay_pending_delete_log sstables: Read pending delete log with one line helper sstables: Dont write pending log with file_writer	2022-11-21 21:22:59 +02:00
Jan Ciolek	b4cc92216b	cql3: expr: make it possible to evaluate binary_operator evaluate() takes an expression and evaluates it to a constant value. It wasn't possible to evaluate binary operators before, so it's added. The code is based on is_satisfied_by, which is currently used to check whether a binary operator evaluates to true or false. It looks like is_satisfied_by and evaluate() do pretty much the same thing, one could be implemented using the other. In the future they might get merged into a single function. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-21 17:48:23 +01:00
Jan Ciolek	8d81eaa68f	cql3: expr: accept expression as lhs argument to like() like() used to only accept column_value as the lhs to evaluate. Changed it to accept any generic expression. This will allow to evaluate a more diverse set of binary operators. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-21 16:33:18 +01:00
Jan Ciolek	b1a12686dc	cql3: expr: accept expression as lhs in contains_key contains_key() used to only accept column_value as the lhs to evaluate. Changed it to accept any generic expression. This will allow to evaluate a more diverse set of binary operators. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-21 16:33:02 +01:00
Jan Ciolek	79cd9cd956	cql3: expr: accept expression as lhs argument to contains() contains() used to only accept column_value as the lhs to evaluate. Changed it to accept any generic expression. This will allow to evaluate a more diverse set of binary operators. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-21 16:32:44 +01:00
Benny Halevy	57ff3f240f	dht: optimize subtract_ranges Take advantage of the fact that both ranges and ranges_to_subtract are deoverlapped and sorted by start token to reduce the calculation complexity from quadratic to linear. Fixes #11922 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-21 15:48:28 +02:00
Benny Halevy	8b81635d95	compaction: refactor dht::subtract_ranges out of get_ranges_for_invalidation The algorithm is generic and can be used elsewhere. Add a unit test for the function before it gets optimized in the following patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-21 15:48:26 +02:00
Benny Halevy	7c6f60ae72	compaction_manager: needs_cleanup: get first/last tokens from sstable decorated keys Currently, the function is inefficient in two ways: 1. unnecessary copy of first/last keys to automatic variables 2. redecorating the partition keys with the schema passed to needs_cleanup. We can just use the tokens from the sstable first/last decorated keys. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-21 15:44:32 +02:00
Pavel Emelyanov	2f9b7931af	sstables: Delete log file in replay_pending_delete_log() It's natural that the replayer cleans up after itself Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:16:22 +03:00
Pavel Emelyanov	bdc47b7717	sstables: Move deletion log manipulations to sstable_directory.cc The deletion log concept uses the fact that files are on a POSIX filesystem. Support for another storage type will have to reimplement this place, so keep the FS-specific code in _directory.cc file. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:16:21 +03:00
Pavel Emelyanov	865c51c6cf	sstables: Open-code delete_sstables() call It's not used by any other code, but to be used it requires the caller to transform TOC file names by prepending sstable directory to them. Things get shorter and simpler if merging the helper code into the caller. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:15:25 +03:00
Pavel Emelyanov	a61c96a627	sstables: Use fs::path in replay_pending_delete_log() It's called by a code that has fs::path at hand and internally uses helpers that need fs::path too, so no need to convert it back and forth. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:15:25 +03:00
Pavel Emelyanov	f5684bcaf0	sstables: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:15:25 +03:00
Pavel Emelyanov	85a73ca9c6	sstables: Coroutinize replay_pending_delete_log Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:15:25 +03:00
Pavel Emelyanov	6f3fd94162	sstables: Read pending delete log with one line helper There's one in seastar since recently Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:15:25 +03:00
Pavel Emelyanov	2dedf4d03a	sstables: Dont write pending log with file_writer It's a wrapper over output_stream with offset tracking and the tracking is not needed to generate a log file. As a bonus of switching back we get a stream.write(sstring) sugar. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-21 13:15:24 +03:00
Botond Dénes	2d4439a739	Merge 'doc: add a troubleshooting article about the missing configuration files' from Anna Stuchlik Fix https://github.com/scylladb/scylladb/issues/11598 This PR adds the troubleshooting article submitted by @syuu1228 in the deprecated _scylla-docs_ repo, with https://github.com/scylladb/scylla-docs/pull/4152. I copied and reorganized the content and rewrote it a little according to the RST guidelines so that the page renders correctly. @syuu1228 Could you review this PR to make sure that my changes didn't distort the original meaning? Closes #11626 * github.com:scylladb/scylladb: doc: apply the feedback to improve clarity doc: add the link to the new Troubleshooting section and replace Scylla with ScyllaDB doc: add the new page to the toctree doc: add a troubleshooting article about the missing configuration files	2022-11-21 12:02:31 +02:00
Kamil Braun	135eb4a041	test.py: prepare for adding extra config from test when creating servers We will use this for replace operations to pass the IP of replaced node.	2022-11-21 10:57:03 +01:00
Kamil Braun	ac91e9d8be	test/pylib: manager_client: convert `add_server` to use `put_json` We shall soon pass some JSON data into these requests.	2022-11-21 10:57:03 +01:00
Kamil Braun	82eb9af80d	test/pylib: rest_client: allow returning JSON data from `put_json` We'll use `put_json` for requests which want to pass JSON data into the call and also return JSON.	2022-11-21 10:57:03 +01:00
Kamil Braun	4fef2d099b	test/pylib: scylla_cluster: don't import from manager_client There's a logical dependency from `manager_client` to `scylla_cluster` (`ManagerClient` defined in `manager_client` talks to `ScyllaClusterManager` defined in `scylla_cluster` over RPC). There is no such dependency in the other way. Do not introduce it accidentally. We can import these types from the `internal_types` module.	2022-11-21 10:57:03 +01:00
Nadav Har'El	757d2a4c02	test/alternator: un-xfail a test which passes on modern Python We had an xfailing test that reproduced a case where Alternator tried to report an error when the request was too long, but the boto library didn't see this error and threw a "Broken Pipe" error instead. It turns out that this wasn't a Scylla bug but rather a bug in urllib3, which overzealously reported a "Broken Pipe" instead of trying to read the server's response. It turns out this issue was already fixed in https://github.com/urllib3/urllib3/pull/1524 and now, on modern installations, the test that used to fail now passes and reports "XPASS". So in this patch we remove the "xfail" tag, and skip the test if running an old version of urllib3. Fixes #8195 Closes #12038	2022-11-21 08:10:10 +02:00
Botond Dénes	ffc3697f2f	Merge 'storage_service api: handle dropped tables' from Benny Halevy Gracefully skip tables that were removed in the background. Fixes #12007 Closes #12013 * github.com:scylladb/scylladb: api: storage_service: fixup indentation api: storage_service: add run_on_existing_tables api: storage_service: add parse_table_infos api: storage_service: log errors from compaction related handlers api: storage_service: coroutinize compaction related handlers	2022-11-21 07:56:27 +02:00
Avi Kivity	994603171b	Merge 'Add validator to the mutation compactor' from Botond Dénes Fragment reordering and fragment dropping bugs have been plaguing us since forever. To fight them we added a validator to the sstable write path to prevent really messed up sstables from being written. This series adds validation to the mutation compactor. This will cover reads and compaction among others, hopefully ridding us of such bugs on the read path too. This series fixes some benign looking issues found by unit tests after the validator was added -- although how benign a producer emitting two partition-ends is depends entirely on how the consumer reacts to it, so no such bug is actually benign. Fixes: https://github.com/scylladb/scylladb/issues/11174 Closes #11532 * github.com:scylladb/scylladb: mutation_compactor: add validator mutation_fragment_stream_validator: add a 'none' validation level test/boost/mutation_query_test: test_partition_limit: sort input data querier: consume_page(): use partition_start as the sentinel value treewide: use ::for_partition_end() instead of ::end_of_partition_tag_t{} treewide: use ::for_partition_start() instead of ::partition_start_tag_t{} position_in_partition: add for_partition_{start,end}()	2022-11-20 20:33:26 +02:00
Avi Kivity	779b01106d	Merge 'cql3: expr: add unit tests for prepare_expression' from Jan Ciołek Adds unit tests for the function `expr::prepare_expression`. Three minor bugs were found by these tests, all fixed in this PR. 1. When preparing a map, the type for tuple constructor was taken from an unprepared tuple, which has `nullptr` as its type. 2. Preparing an empty nonfrozen list or set resulted in `null`, but preparing a map didn't. Fixed this inconsistency. 3. Preparing a `bind_variable` with `nullptr` receiver was allowed. The `bind_variable` ended up with a `nullptr` type, which is incorrect. Changed it to throw an exception. Closes #11941 * github.com:scylladb/scylladb: test preparing expr::usertype_constructor expr_test: test that prepare_expression checks style_type of collection_constructor expr_test: test preparing expr::collection_constructor for map prepare_expr: make preparing nonfrozen empty maps return null prepare_expr: fix a bug in map_prepare_expression expr_test: test preparing expr::collection_constructor for set expr_test: test preparing expr::collection_constructor for list expr_test: test preparing expr::tuple_constructor expr_test: test preparing expr::untyped_constant expr_test_utils: add make_bigint_raw/const expr_test_utils: add make_tinyint_raw/const expr_test: test preparing expr::bind_variable cql3: prepare_expr: forbid preparing bind_variable without a receiver expr_test: test preparing expr::null expr_test: test preparing expr::cast expr_test_utils: add make_receiver expr_test_utils: add make_smallint_raw/const expr_test: test preparing expr::token expr_test: test preparing expr::subscript expr_test: test preparing expr::column_value expr_test: test preparing expr::unresolved_identifier expr_test_utils: mock data_dictionary::database	2022-11-20 20:03:54 +02:00
Nadav Har'El	2ba8b8d625	test/cql-pytest: remove "xfail" from passing test testIndexOnFrozenCollectionOfUDT We had a test that used to fail because of issue #8745. But this issue was already fixed, and we forgot to remove the "xfail" marker. The test now passes, so let's remove the xfail marker. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #12039	2022-11-20 19:54:59 +02:00
Avi Kivity	40f61db120	Merge 'docs: describe the Raft upgrade and recovery procedures' from Kamil Braun Add new guide for upgrading 5.1 to 5.2. In this new upgrade doc, include additional steps for enabling Raft using the `consistent_cluster_management` flag. Note that we don't have this flag yet but it's planned to replace the experimental flag in 5.2. In the "Raft in ScyllaDB" document, add sections about: - enabling Raft in existing clusters in Scylla 5.2, - verifying that the internal Raft upgrade procedure finishes successfully, - recovering from a stuck Raft upgrade procedure or from a majority loss situation. Fix some problems in the documentation, e.g. it is not possible to enable Raft in an existing cluster in 5.0, but the documentation claimed that it is. Follow-up items: - if we decide for a different name for `consistent_cluster_management`, use that name in the docs instead - update the warnings in Scylla to link to the Raft doc - mention Enterprise versions once we know the numbers - update the appropriate upgrade docs for Enterprise versions once they exist Closes #11910 * github.com:scylladb/scylladb: docs: describe the Raft upgrade and recovery procedures docs: add upgrade guide 5.1 -> 5.2	2022-11-20 19:00:23 +02:00
Avi Kivity	15ee8cfc05	Merge 'reader_concurrency_semaphore: fix waiter/inactive race' from Botond Dénes We recently (in `7fbad8de87`) made sure all admission paths can trigger the eviction of inactive reads. As reader eviction happens in the background, a mechanism was added to make sure only a single eviction fiber was running at any given time. This mechanism however had a preemption point between stopping the fiber and releasing the evict lock. This gave an opportunity for either new waiters or inactive readers to be added, without the fiber acting on it. Since it still held onto the lock, it also prevented other eviction fibers from starting. This could create a situation where the semaphore could admit new reads by evicting inactive ones, but it still has waiters. Since an empty waitlist is also an admission criterion, once one waiter is wrongly added, many more can accumulate. This series fixes this by ensuring the lock is released in the instant the fiber decides there is no more work to do. It also fixes the assert failure on recursive eviction and adds a detection to the inactive/waiter contradiction. Fixes: #11923 Refs: #11770 Closes #12026 * github.com:scylladb/scylladb: reader_concurrency_semaphore: do_wait_admission(): detect admission-waiter anomaly reader_concurrency_semaphore: evict_readers_in_the_background(): eliminate blind spot reader_concurrency_semaphore: do_detach_inactive_read(): do a complete detach	2022-11-20 18:51:34 +02:00
Avi Kivity	895d721d5e	Merge 'scylla-sstable: data-dump improvements' from Botond Dénes This series contains a mixed bag of improvements to `scylla sstable dump-data`. These improvements are mostly aimed at making the json output clearer, getting rid of any ambiguities. Closes #12030 * github.com:scylladb/scylladb: tools/scylla-sstable: traverse sstables in argument order tools/scylla-sstable: dump-data docs: s/clustering_fragments/clustering_elements tools/scylla-sstable: dump-data/json: use Null instead of "<unknown>" tools/scylla-sstable: dump-data/json: use more uniform format for collections tools/scylla-sstable: dump-data/json: make cells easier to parse	2022-11-20 17:02:27 +02:00
Avi Kivity	2f9c53fbe4	Merge 'test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address' from Kamil Braun Since recently the framework uses a separate set of unique IDs to identify servers, but the log file and workdir is still named using the last part of the IP address. This is confusing: the test logs sometimes don't provide the IP addr (only the ID), and even if they do, the reader of the test log may not know that they need to look at the last part of the IP to find the node's log/workdir. Also using ID will be necessary if we want to reuse IP addresses (e.g. during node replace, or simply not to run out of IP addresses during testing). So use the ID instead to name the workdir and log file. Also, when starting a test case, print the used cluster. This will make it easier to map server IDs to their IP addresses when browsing through the test logs. Closes #12018 * github.com:scylladb/scylladb: test/pylib: manager_client: print used cluster when starting test case test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address	2022-11-20 16:56:19 +02:00
Avi Kivity	14218d82d6	Update tools/java submodule (serverless) * tools/java caf754f243...874e2d529b (2): > Add Scylla Cloud serverless support > Switch cqlsh to use scylla-driver	2022-11-20 16:41:36 +02:00
Tomasz Grabiec	c8e983b4aa	test: flat_mutation_reader_assertions: Use fatal BOOST_REQUIRE_EQUAL instead of BOOST_CHECK_EQUAL BOOST_CHECK_EQUAL is a weaker form of assertion, it reports an error and will cause the test case to fail but continues. This makes the test harder to debug because there's no obvious way to catch the failure in GDB and the test output is also flooded with things which happen after the failed assertion. Message-Id: <20221119171855.2240225-1-tgrabiec@scylladb.com>	2022-11-20 16:14:26 +02:00
Nadav Har'El	2d2034ea28	Merge 'cql3: don't ignore other restrictions when a multi column restriction is present during filtering' from Jan Ciołek When filtering with multi column restriction present all other restrictions were ignored. So a query like: `SELECT * FROM WHERE pk = 0 AND (ck1, ck2) < (0, 0) AND regular_col = 0 ALLOW FILTERING;` would ignore the restriction `regular_col = 0`. This was caused by a bug in the filtering code: `2779a171fc/cql3/selection/selection.cc (L433-L449)` When multi column restrictions were detected, the code checked if they are satisfied and returned immediately. This is fixed by returning only when these restrictions are not satisfied. When they are satisfied the other restrictions are checked as well to ensure all of them are satisfied. This code was introduced back in 2019, when fixing #3574. Perhaps back then it was impossible to mix multi column and regular columns and this approach was correct. Fixes: #6200 Fixes: #12014 Closes #12031 * github.com:scylladb/scylladb: cql-pytest: add a reproducer for #12014, verify that filtering multi column and regular restrictions works boost/restrictions-test: uncomment part of the test that passes now cql-pytest: enable test for filtering combined multi column and regular column restrictions cql3: don't ignore other restrictions when a multi column restriction is present during filtering	2022-11-20 11:50:38 +02:00
Benny Halevy	ec5707a4a8	api: storage_service: fixup indentation	2022-11-20 09:14:45 +02:00
Benny Halevy	cc63719782	api: storage_service: add run_on_existing_tables Gracefully skip tables that were removed in the background. Fixes #12007 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-20 09:14:29 +02:00
Benny Halevy	9ef9b9d1d9	api: storage_service: add parse_table_infos The table UUIDs are the same on all shards so we might as well get them on shard 0 (as we already do) and reuse them on other shards. It is more efficient and accurate to lookup the table eventually on the shard using its uuid rather than its name. If the table was dropped and recreated using the same name in the background, the new table will have a new uuid and so the api function does not apply to it anymore. A following change will handle the no_such_column_family cases. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-20 09:14:21 +02:00
Benny Halevy	9b4a9b2772	api: storage_service: log errors from compaction related handlers Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-20 09:03:25 +02:00
Benny Halevy	a47f96bc05	api: storage_service: coroutinize compaction related handlers Before we improve parsing tables lists and handling of no_such_column_family errors. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-20 09:03:25 +02:00
Jan Ciolek	286f182a8c	cql-pytest: add a reproducer for #12014 , verify that filtering multi column and regular restrictions works In issue #12014 a user has encountered an instance of #6200. When filtering a WHERE clause which contained both multi-column and regular restrictions, the regular restrictions were ignored. Add a test which reproduces the issue using a reproducer provided by the user. This problem is tested in another similar test, but this one reproduces the issue in the exact way it was found by the user. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-18 15:27:42 +01:00
Jan Ciolek	63fb2612c3	boost/restrictions-test: uncomment part of the test that passes now A part of the test was commented out due to #6200. Now #6200 has been fixed and it can be uncommented. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-18 15:14:32 +01:00
Jan Ciolek	99e1032e34	cql-pytest: enable test for filtering combined multi column and regular column restrictions The test test_multi_column_restrictions_and_filtering was marked as xfail, because issue #6200 wasn't fixed. Now that filtering multi column and other restrictions together has been fixed the test passes. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-18 15:14:32 +01:00
Jan Ciolek	b974d4adfb	cql3: don't ignore other restrictions when a multi column restriction is present during filtering When filtering with multi column restriction present all other restrictions were ignored. So a query like: `SELECT * FROM WHERE pk = 0 AND (ck1, ck2) < (0, 0) AND regular_col = 0 ALLOW FILTERING;` would ignore the restriction `regular_col = 0`. This was caused by a bug in the filtering code: `2779a171fc/cql3/selection/selection.cc (L433-L449)` When multi column restrictions were detected, the code checked if they are satisfied and returned immediately. This is fixed by returning only when these restrictions are not satisfied. When they are satisfied the other restrictions are checked as well to ensure all of them are satisfied. This code was introduced back in 2019, when fixing #3574. Perhaps back then it was impossible to mix multi column and regular columns and this approach was correct. Fixes: #6200 Fixes: #12014 Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-18 15:14:16 +01:00
Botond Dénes	30597f17ed	tools/scylla-sstable: traverse sstables in argument order In the order the user passed them on the command-line.	2022-11-18 15:58:37 +02:00
Botond Dénes	e337b25aa9	tools/scylla-sstable: dump-data docs: s/clustering_fragments/clustering_elements The usage of clustering_fragments is a typo, the output contains clustering_elements.	2022-11-18 15:58:36 +02:00
Botond Dénes	c39408b394	tools/scylla-sstable: dump-data/json: use Null instead of "<unknown>" The currently used "<unknown>" marker for invalid values/types is indistinguishable from a normal value in some cases. Use the much more distinct and unique json Null instead.	2022-11-18 15:58:36 +02:00
Botond Dénes	1dfceb5716	tools/scylla-sstable: dump-data/json: use more uniform format for collections Instead of trying to be clever and switching the output on the type of collection, use the same format always: a list of objects, where the object has a key and value attribute, containing the respective collection item key and value. This makes processing much easier for machines (and humans too since the previous system wasn't working well).	2022-11-18 15:58:36 +02:00
Botond Dénes	f89acc8df7	tools/scylla-sstable: dump-data/json: make cells easier to parse There are several slightly different cell types in scylla: regular cells, collection cells (frozen and non-frozen) and counter cells (update and shards). In C++ code the type of the cell is always available for code wishing to make out exactly what kind of cell a cell is. In the JSON output of the dump-data this is currently really hard to do as there is not enough information to disambiguate all the different cell types. We wish to make the JSON output self-sufficient so in this patch we introduce a "type" field which contains one of: * regular * counter-update * counter-shards * frozen-collection * collection Furthermore, we bring the different types closer by also printing the counter shards under the 'value' key, not under the 'shards' key as before. The separate 'shards' is no longer needed to disambiguate. The documentation and the write operation is also updated to reflect the changes.	2022-11-18 15:58:36 +02:00
Petr Gusev	41629e97de	test.py: handle --markers parameter Some tests may take longer than a few seconds to run. We want to mark such tests in some way, so that we can run them selectively. This patch proposes to use pytest markers for this. The markers from the test.py command line are passed to pytest as is via the -m parameter. By default, the marker filter is not applied and all tests will be run without exception. To exclude e.g. slow tests you can write --markers 'not slow'. The --markers parameter is currently only supported by Python tests, other tests ignore it. We intend to support this parameter for other types of tests in the future. Another possible improvement is not to run suites for which all tests have been filtered out by markers. The markers are currently handled by pytest, which means that the logic in test.py (e.g., running a scylla test cluster) will be run for such suites. Closes #11713	2022-11-18 12:36:20 +01:00
Avi Kivity	7da12c64bc	Revert "Revert "Merge 'cql3: select_statement: coroutinize indexed_table_select_statement::do_execute_base_query()' from Avi Kivity"" This reverts commit `22f13e7ca3`, and reinstates commit `df8e1da8b2` ("Merge 'cql3: select_statement: coroutinize indexed_table_select_statement::do_execute_base_query()' from Avi Kivity"). The original commit was reverted due to failures in debug mode on aarch64, but after commit `224a2877b9` ("build: disable -Og in debug mode to avoid coroutine asan breakage"), it works again. Closes #12021	2022-11-18 12:44:00 +02:00
Kamil Braun	d7649a86c4	Merge 'Build up to support of dynamic IP address changes in Raft' from Konstantin Osipov We plan to stop storing IP addresses in Raft configuration, and instead use the information disseminated through gossip to locate Raft peers. Implement patches that are building up to that: * improve Raft API of configuration change notifications * disseminate raft host id in Gossip * avoid using Raft addresses from Raft configuration, and instead consistently use the translation layer between raft server id <-> IP address Closes #11953 * github.com:scylladb/scylladb: raft: persist the initial raft address map raft: (upgrade) do not use IP addresses from Raft config raft: (and gossip) begin gossiping raft server ids raft: change the API of conf change notifications	2022-11-18 11:38:19 +01:00
Botond Dénes	437fcdeeda	Merge 'Make use of enum_set in directory lister' from Pavel Emelyanov The lister accepts sort of a filter -- what kind of entries to list, regular, directories or both. It currently uses unordered_set, but enum_set is shorter and better describes the intent. Closes #12017 * github.com:scylladb/scylladb: lister: Make lister::dir_entry_types an enum_set database: Avoid useless local variable	2022-11-18 12:15:26 +02:00
Botond Dénes	b39ca29b3c	reader_concurrency_semaphore: do_wait_admission(): detect admission-waiter anomaly The semaphore should admit readers as soon as it can. So at any point in time there should be either no waiters, or the semaphore shouldn't be able to admit new reads. Otherwise something went wrong. Detect this when queuing up reads and dump the diagnostics if detected. Even though tests should ensure this should never happen, recently we've seen a race between eviction and enqueuing producing such situations. This is very hard to write tests for, so add built-in detection and protection instead. Detecting this is very cheap anyway.	2022-11-18 11:35:47 +02:00
Botond Dénes	ca7014ddb8	reader_concurrency_semaphore: evict_readers_in_the_background(): eliminate blind spot Said method has a protection against concurrent (recursive more like) calls to itself, by setting a flag `_evicting` and returning early if this flag is set. The evicting loop however has at least one preemption point between deciding there is nothing more to evict and resetting said flag. This window provides opportunity for new inactive reads or waiters to be queued without this loop noticing, while denying any other concurrent invocations at that time from reacting too. Eliminate this by using repeat() instead of do_until() and setting `_evicting = false` the moment the loop's run condition becomes false.	2022-11-18 11:35:47 +02:00
Botond Dénes	892f52c683	reader_concurrency_semaphore: do_detach_inactive_read(): do a complete detach Currently this method detaches the inactive read from the handle and notifies the permit, calls the notify handler if any and does some stat bookkeeping. Extend it to do a complete detach: unlink the entry from the inactive reads list and also cancel the ttl timer. After this, all that is left to the caller is to destroy the entry. This will prevent any recursive eviction from causing assertion failure. Although recursive eviction shouldn't happen, it shouldn't trigger an assert.	2022-11-18 11:35:43 +02:00
Pavel Emelyanov	a44ca06906	Merge 'token_metadata: Do not use topology info for is_member check' from Asias He Since commit `a980f94` (token_metadata: impl: keep the set of normal token owners as a member), we have a set, _normal_token_owners, which contains all the nodes in the ring. We can use _normal_token_owners to check if a node is part of the ring directly instead of going through the _topology indirectly. Fixes #11935 Closes #11936 * github.com:scylladb/scylladb: token_metadata: Rename is_member to is_normal_token_owner token_metadata: Add docs for is_member token_metadata: Do not use topology info for is_member check token_metadata: Check node is part of the topology instead of the ring	2022-11-18 11:54:07 +03:00
Asias He	4571fcf9e7	token_metadata: Rename is_member to is_normal_token_owner The name is_normal_token_owner is more clear than is_member. The is_normal_token_owner reflects what it really checks.	2022-11-18 09:29:20 +08:00
Asias He	965097cde5	token_metadata: Add docs for is_member Make it clear, is_member checks if a node is part of the token ring and checks nothing else.	2022-11-18 09:28:56 +08:00
Asias He	a495b71858	token_metadata: Do not use topology info for is_member check Since commit `a980f94` (token_metadata: impl: keep the set of normal token owners as a member), we have a set, _normal_token_owners, which contains all the nodes in the ring. We can use _normal_token_owners to check if a node is part of the ring directly instead of going through the _topology indirectly. Fixes #11935	2022-11-18 09:28:56 +08:00
Asias He	f2ca790883	token_metadata: Check node is part of the topology instead of the ring update_normal_tokens is the way to add a new node into the ring. We should not require a new node to already be in the ring to be able to add it to the ring. The current code works accidentally because is_member is checking if a node is in the topology. We should use _topology.has_endpoint to check if a node is part of the topology explicitly.	2022-11-18 09:28:56 +08:00
Jan Ciolek	77d68153f1	test preparing expr::usertype_constructor Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:41:10 +01:00
Jan Ciolek	eb92fb4289	expr_test: test that prepare_expression checks style_type of collection_constructor Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:41:10 +01:00
Jan Ciolek	77c63a6b92	expr_test: test preparing expr::collection_constructor for map Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:41:09 +01:00
Jan Ciolek	db67ade778	prepare_expr: make preparing nonfrozen empty maps return null In Scylla and Cassandra inserting an empty collection that is not frozen, is interpreted as inserting a null value. list_prepare_expression and set_prepare_expression have an if which handles this behavior, but there wasn't one in map_prepare_expression. As a result preparing empty list or set would result in null, but preparing an empty map wouldn't. This is inconsistent, it's better to return null in all cases of empty nonfrozen collections. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:41:09 +01:00
Jan Ciolek	da71f9b50b	prepare_expr: fix a bug in map_prepare_expression map_prepare_expression takes a collection_constructor of unprepared items and prepares it. Elements of a map collection_constructor are tuples (key and value). map_prepare_expression creates a prepared collection_constructor by preparing each tuple and adding it to the result. During this preparation it needs to set the type of the tuple. There was a bug here - it took the type from unprepared tuple_constructor and assigned it to the prepared one. An unprepared tuple_constructor doesn't have a type so it ended up assigning nullptr. Instead of that it should create a tuple_type_impl instance by looking at the types of map key and values, and use this tuple_type_impl as the type of the prepared tuples. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:35:04 +01:00
Jan Ciolek	a656fdfe9a	expr_test: test preparing expr::collection_constructor for set Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:37 +01:00
Jan Ciolek	76f587cfe7	expr_test: test preparing expr::collection_constructor for list Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:37 +01:00
Jan Ciolek	44b55e6caf	expr_test: test preparing expr::tuple_constructor Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:37 +01:00
Jan Ciolek	265100a638	expr_test: test preparing expr::untyped_constant Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:37 +01:00
Jan Ciolek	f6b9100cd2	expr_test_utils: add make_bigint_raw/const Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:37 +01:00
Jan Ciolek	f9ff131f86	expr_test_utils: add make_tinyint_raw/const Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:36 +01:00
Jan Ciolek	76b6161386	expr_test: test preparing expr::bind_variable Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:36 +01:00
Jan Ciolek	4882724066	cql3: prepare_expr: forbid preparing bind_variable without a receiver prepare_expression treats receiver as an optional argument, it can be set to nullptr and the preparation should still succeed when it's possible to infer the type of an expression. preparing a bind_variable requires the receiver to be present, because it doesn't contain any information about the type of the bound value. Added a check that the receiver is present. Allowing to prepare a bind_variable without the receiver present was a bug. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 20:22:36 +01:00
Avi Kivity	2779a171fc	Merge 'Do not run aborted tasks' from Aleksandra Martyniuk task_manager::task::impl contains an abort source which can be used to check whether it is aborted and an abort method which aborts the task (request_abort on abort_source) and all its descendants recursively. When the start method is called after the task was aborted, then its state is set to failed and the task does not run. Fixes: #11995 Closes #11996 * github.com:scylladb/scylladb: tasks: do not run tasks that are aborted tasks: delete unused variable tasks: add abort_source to task_manager::task::impl	2022-11-17 19:42:46 +02:00
Pavel Emelyanov	a396c27efc	Merge 'message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client' from Kamil Braun `get_rpc_client` calculates a `topology_ignored` field when creating a client which says whether the client's endpoint had topology information when this client was created. This is later used to check if that client needs to be dropped and replaced with a new client which uses the correct topology information. The `topology_ignored` field was incorrectly calculated as `true` for pending endpoints even though we had topology information for them. This would lead to unnecessary drops of RPC clients later. Fix this. Remove the default parameter for `with_pending` from `topology::has_endpoint` to avoid similar bugs in the future. Apparently this fixes #11780. The verbs used by decommission operation use RPC client index 1 (see `do_get_rpc_client_idx` in message/messaging_service.cc). From local testing with additional logging I found that by the time this client is created (i.e. the first verb in this group is used), we already know the topology. The node is pending at that point - hence the bug would cause us to assume we don't know the topology, leading us to dropping the RPC client later, possibly in the middle of a decommission operation. Fixes: #11780 Closes #11942 * github.com:scylladb/scylladb: message: messaging_service: check for known topology before calling is_same_dc/rack test: reenable test_topology::test_decommission_node_add_column test/pylib: util: configurable period in wait_for message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client message: messaging_service: topology independent connection settings for GOSSIP verbs	2022-11-17 20:14:32 +03:00
Jan Ciolek	42e01cc67f	expr_test: test preparing expr::null Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:05 +01:00
Jan Ciolek	45b3fca71c	expr_test: test preparing expr::cast Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:05 +01:00
Jan Ciolek	498c9bfa0d	expr_test_utils: add make_receiver Add a convenience function which creates receivers. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:04 +01:00
Jan Ciolek	6873a21fbd	expr_test_utils: add make_smallint_raw/const Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:04 +01:00
Jan Ciolek	488056acb7	expr_test: test preparing expr::token Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:04 +01:00
Jan Ciolek	7958f77a40	expr_test: test preparing expr::subscript Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:04 +01:00
Jan Ciolek	569bd61c6c	expr_test: test preparing expr::column_value Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:04 +01:00
Jan Ciolek	26174e29c6	expr_test: test preparing expr::unresolved_identifier It's interesting that prepare_expression for column identifiers doesn't require a receiver. I hope this won't break validation in the future. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:04 +01:00
Jan Ciolek	c719a923bb	expr_test_utils: mock data_dictionary::database Add a function which creates a mock instance of data_dictionary::database. prepare_expression requires a data_dictionary::database as an argument, so unit tests for it need something to pass there. make_data_dictionary_database can be used to create an instance that is sufficient for tests. Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>	2022-11-17 17:30:00 +01:00
Kamil Braun	8e8c32befe	test/pylib: manager_client: print used cluster when starting test case It will be easier to map server IDs to their IP addresses when browsing through the test logs.	2022-11-17 17:14:23 +01:00
Pavel Emelyanov	bc62ca46d4	lister: Make lister::dir_entry_types an enum_set This type is currently an unordered_set, but only consists of at most two elements. Making it an enum_set renders it into a size_t variable and better describes the intention. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-17 19:01:45 +03:00
Pavel Emelyanov	c6021b57a1	database: Avoid useless local variable It's used to run lister::scan_dir() with directory_entry_type::directory only, but for that is copied around on lambda captures. It's simpler just to use the value directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-17 19:00:49 +03:00
Kamil Braun	b83234d8aa	test/pylib: scylla_cluster: use server ID to name workdir and log file, not IP address Since recently the framework uses a separate set of unique IDs to identify servers, but the log file and workdir is still named using the last part of the IP address. This is confusing: the test logs sometimes don't provide the IP addr (only the ID), and even if they do, the reader of the test log may not know that they need to look at the last part of the IP to find the node's log/workdir. Also using ID will be necessary if we want to reuse IP addresses (e.g. during node replace, or simply not to run out of IP addresses during testing).	2022-11-17 16:55:12 +01:00
Anna Stuchlik	f7f03e38ee	doc: update the link to Enabling Experimental Features	2022-11-17 15:44:46 +01:00
Anna Stuchlik	02cea98f55	doc: remove the note referring to the previous ScyllaDB versions and add the relevant limitation to the paragraph	2022-11-17 15:05:00 +01:00
Anna Stuchlik	ce88c61785	doc: update the links to the Enabling Experimental Features section	2022-11-17 14:59:34 +01:00
Avi Kivity	76be6402ed	Merge 'repair: harden effective replication map' from Benny Halevy As described in #11993 per-shard repair_info instances get the effective_replication_map on their own with no centralized synchronization. This series ensures that the effective replication maps used by repair (and other associated structures like the token metadata and topology) are all in sync with the one used to initiate the repair operation. While at it, the series includes other cleanups in this area in repair and view that are not fixes as the calls happen in synchronous functions that do not yield. Fixes #11993 Closes #11994 * github.com:scylladb/scylladb: repair: pass erm down to get_hosts_participating_in_repair and get_neighbors repair: pass effective_replication_map down to repair_info repair: coroutinize sync_data_using_repair repair: futurize do_repair_start effective_replication_map: add global_effective_replication_map shared_token_metadata: get_lock is const repair: sync_data_using_repair: require to run on shard 0 repair: require all node operations to be called on shard 0 repair: repair_info: keep effective_replication_map repair: do_repair_start: use keyspace erm to get keyspace local ranges repair: do_repair_start: use keyspace erm for get_primary_ranges repair: do_repair_start: use keyspace erm for get_primary_ranges_within_dc repair: do_repair_start: check_in_shutdown first repair: get_db().local() where needed repair: get topology from erm/token_metdata_ptr view: get_view_natural_endpoint: get topology from erm	2022-11-17 13:29:02 +02:00
Konstantin Osipov	262566216b	raft: persist the initial raft address map	2022-11-17 14:26:36 +03:00
Konstantin Osipov	b35af73fdf	raft: (upgrade) do not use IP addresses from Raft config Always use raft address map to obtain the IP addresses of upgrade peers. Right now the map is populated from Raft configuration, so it's an equivalent transformation, but in the future raft address map will be populated from other sources: discovery and gossip, hence the logic of upgrade will change as well. Do not proceed with the upgrade if an address is missing from the map, since it means we failed to contact a raft member.	2022-11-17 14:26:31 +03:00
Pavel Emelyanov	2add9ba292	Merge 'Refactor topology out of token_metadata' from Benny Halevy This series moves the topology code from locator/token_metadata.{cc,hh} out to locator/topology.{cc,hh} and introduces a shared header file: locator/types.hh contains shared, low level definitions, in anticipation of https://github.com/scylladb/scylladb/pull/11987 While at it, the token_metadata functions are turned into coroutines and topology copy constructor is deleted. The copy functionality is moved into an async `clone_gently` function that allows yielding while copying the topology. Closes #12001 * github.com:scylladb/scylladb: locator: refactor topology out of token_metadata locator: add types.hh topology: delete copy constructor token_metadata: coroutinize clone functions	2022-11-17 13:55:34 +03:00
Aleksandra Martyniuk	7ead1a7857	compaction: request abort only once in compaction_data::stop compaction_manager::task (and thus compaction_data) can be stopped because of many different reasons. Thus, abort can be requested more than once on compaction_data abort source causing a crash. To prevent this before each request_abort() we check whether an abort was requested before. Closes #12004	2022-11-17 12:44:59 +02:00
Benny Halevy	1e2741d2fe	abstract_replication_strategy: recognized_options: return unordered_set An unordered_set is more efficient and there is no need to return an ordered set for this purpose. This change facilitates a follow-up change of adding topology::get_datacenters(), returning an unordered_set of datacenter names. Refs #11987 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #12003	2022-11-17 11:27:05 +02:00
Botond Dénes	e925c41f02	utils/gs/barrett.hh: aarch64: s/brarett/barrett/ Fix a typo introduced by the recent patch fixing the spelling of Barrett. The patch introduced a typo in the aarch64 version of the code, which wasn't found by promotion, as that only builds on X86_64. Closes #12006	2022-11-17 11:09:59 +02:00
Konstantin Osipov	051dceeaff	raft: (and gossip) begin gossiping raft server ids We plan to use gossip data to educate Raft RPC about IP addresses of raft peers. Add raft server ids to application state, so that when we get a notification about a gossip peer we can identify which raft server id this notification is for, specifically, we can find what IP address stands for this server id, and, whenever the IP address changes, we can update Raft address map with the new address. By the same token, at boot time, we now have to start Gossip before Raft, since Raft won't be able to send any messages without gossip data about IP addresses.	2022-11-17 12:07:31 +03:00
Konstantin Osipov	990c7a209f	raft: change the API of conf change notifications Pass a change diff into the notification callback, rather than add or remove servers one by one, so that if we need to persist the state, we can do it once per configuration change, not for every added or removed server. For now still pass added and removed entries in two separate calls per a single configuration change. This is done mainly to fulfill the library contract that it never sends messages to servers outside the current configuration. The group0 RPC implementation doesn't need the two calls, since it simply marks the removed servers as expired: they are not removed immediately anyway, and messages can still be delivered to them. However, there may be test/mock implementations of RPC which could benefit from this contract, so we decided to keep it.	2022-11-17 12:07:31 +03:00
Benny Halevy	53fdf75cf9	repair: pass erm down to get_hosts_participating_in_repair and get_neighbors Now that it is available in repair_info. Fixes #11993 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 08:07:30 +02:00
Benny Halevy	b69be61f41	repair: pass effective_replication_map down to repair_info And make sure the token_metadata ring version is same as the reference one (from the erm on shard 0), when starting the repair on each shard. Refs #11993 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 08:07:29 +02:00
Benny Halevy	c47d36b53d	repair: coroutinize sync_data_using_repair Prepare for the next patch that will co_await make_global_effective_replication_map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 08:07:04 +02:00
Benny Halevy	58b1c17f5d	repair: futurize do_repair_start Turn it into a coroutine to prepare for the next patch that will co_await make_global_effective_replication_map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 08:07:04 +02:00
Benny Halevy	4b9269b7e2	effective_replication_map: add global_effective_replication_map Class to hold a coherent view of a keyspace effective replication map on all shards. To be used in a following patch to pass the sharded keyspace e_r_m:s to repair. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 08:07:01 +02:00
Avi Kivity	b8b78959fb	build: switch to packaged libdeflate rather than a submodule Now that our toolchain is based on Fedora 37, we can rely on its libdeflate rather than have to carry our own in a submodule. Frozen toolchain is regenerated. As a side effect clang is updated from 15.0.0 to 15.0.4. Closes #12000	2022-11-17 08:01:00 +02:00
Benny Halevy	2c677e294b	shared_token_metadata: get_lock is const The lock is acquired using a function that doesn't modify the shared_token_metadata object. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:58:21 +02:00
Benny Halevy	d6b2124903	repair: sync_data_using_repair: require to run on shard 0 And with that do_sync_data_using_repair can be folded into sync_data_using_repair. This will simplify using the effective_replication_map throughout the operation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:58:21 +02:00
Benny Halevy	0c56c75cf8	repair: require all node operations to be called on shard 0 To simplify use of the effective_replication_map / token_metadata_ptr throughout the operation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:58:21 +02:00
Benny Halevy	64b0756adc	repair: repair_info: keep effective_replication_map Sampled when repair info is constructed. To be used throughout the repair process. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:58:21 +02:00
Benny Halevy	c7d753cd44	repair: do_repair_start: use keyspace erm to get keyspace local ranges Rather than calling db.get_keyspace_local_ranges that looks up the keyspace and its erm again. We want all the information derived from the erm to be based on the same source. The function is synchronous so this change doesn't fix anything, just cleans up the code. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:58:21 +02:00
Benny Halevy	aaf74776c2	repair: do_repair_start: use keyspace erm for get_primary_ranges Ensure that the primary ranges are in sync with the keyspace erm. The function is synchronous so this change doesn't fix anything, it just cleans up the code. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:58:21 +02:00
Benny Halevy	9200e6b005	repair: do_repair_start: use keyspace erm for get_primary_ranges_within_dc Ensure the erm and topology are in sync. The function is synchronous so this change doesn't fix anything, just cleans up the code. Fix mistake in comment while at it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:57:56 +02:00
Benny Halevy	59dc2567fd	repair: do_repair_start: check_in_shutdown first Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:56:34 +02:00
Benny Halevy	881eb0df83	repair: get_db().local() where needed In several places we get the sharded database using get_db() and then we only use db.local(). Simplify the code by keeping reference only to the local database upfront. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:56:34 +02:00
Benny Halevy	c22c4c8527	repair: get topology from erm/token_metdata_ptr We want the topology to be synchronized with the respective effective_replication_map / token_metadata. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:56:34 +02:00
Benny Halevy	94f2e95a2f	view: get_view_natural_endpoint: get topology from erm Get the topology for the effective replication map rather than from the storage_proxy to ensure it's synchronized with the natural endpoints. Since there's no preemption between the two calls currently there is no issue, so this is merely a clean up of the code and not supposed to fix anything. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-17 07:56:34 +02:00
Nadav Har'El	e393639114	test/cql-pytest: reproducer for crash in LWT with null key This patch adds a reproducer for issue #11954: Attempting an "IF NOT EXISTS" (LWT) write with a null key crashes Scylla, instead of producing a simple error message (like happens without the "IF NOT EXISTS" after #7852 was fixed). The test passed on Cassandra, but crashes Scylla. Because of this crash, we can't just mark the test "xfail" and it's temporarily marked "skip" instead. Refs #11954. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11982	2022-11-17 07:31:13 +02:00
Benny Halevy	d0bd305d16	locator: refactor topology out of token_metadata Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-16 21:55:54 +02:00
Benny Halevy	297a4de4e4	locator: add types.hh To export low-level types that are used by other modules for the locator interfaces. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-16 21:53:05 +02:00
Kamil Braun	0c9cb5c5bf	Merge 'raft: wait for the next tick before retrying' from Gusev Petr When `modify_config` or `add_entry` is forwarded to the leader, it may reach the node at an "inappropriate" time and result in an exception. There are two reasons for it - the leader is changing and, in case of `modify_config`, other `modify_config` is currently in progress. In both cases the command is retried, but before this patch there was no delay before retrying, which could lead to a tight loop. The patch adds a new exception type `transient_error`. When the client receives it, it is obliged to retry the request after some delay. Previously leader-side exceptions were converted to `not_a_leader`, which is strange, especially for `conf_change_in_progress`. Fixes: #11564 Closes #11769 * github.com:scylladb/scylladb: raft: rafactor: remove duplicate code on retries delays raft: use wait_for_next_tick in read_barrier raft: wait for the next tick before retrying	2022-11-16 18:20:54 +01:00
Aleksandra Martyniuk	4250bd9458	tasks: do not run tasks that are aborted Currently in start() method a task is run even if it was already aborted. When start() is called on an aborted task, its state is set to task_manager::task_state::failed and it doesn't run.	2022-11-16 18:09:41 +01:00
Aleksandra Martyniuk	ebffca7ea5	tasks: delete unused variable	2022-11-16 18:07:57 +01:00
Aleksandra Martyniuk	752edc2205	tasks: add abort_source to task_manager::task::impl task_manager::task can be aborted with impl's abort_source. By default abort request is propagated to all task's descendants.	2022-11-16 18:07:11 +01:00
Avi Kivity	c4f069c6fc	Update seastar submodule * seastar 153223a188...4f4cc00660 (10): > Merge 'Avoid using namespace internal' from Pavel Emelyanov > Merge 'De-futurize IO class update calls' from Pavel Emelyanov > abort_source: subscribe(): remove noexcept qualifier > Merge 'Add Prometheus filtering capabilities by label' from Amnon Heiman > fsqual: stop causing memory leak error on LeakSanitizer > metrics.cc: Do not merge empty histogram > Update tutorial.md > README-DPDK.md: document --cflags option > build: install liburing.pc using stow > core/polymorphic_temporary_buffer: include <seastar/core/memory.hh> Closes #11991	2022-11-16 17:59:33 +02:00
Avi Kivity	3497891cf9	utils: spell "barrett" correctly As P. T. Barnoom famously said, "write what you like but spell my name correctly". Following that, we correct the spelling of Barrett's name in the source tree. Closes #11989	2022-11-16 16:30:38 +02:00
Benny Halevy	0c94ffcc85	topology: delete copy constructor Topology is copied only from token_metadata_impl::clone_only_token_map which copies the token_metadata_impl with yielding to prevent reactor stalls. This should apply to topology as well, so add a clone_gently function for cloning the topology from token_metadata_impl::clone_only_token_map. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-16 15:27:28 +02:00
Benny Halevy	4f4fc7fe22	token_metadata: coroutinize clone functions Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-16 15:27:28 +02:00
Kamil Braun	a83789160d	message: messaging_service: check for known topology before calling is_same_dc/rack `is_same_dc` and `is_same_rack` assume that the peer's topology is known. If it's unknown, `on_internal_error` will be called inside topology. When these functions are used in `get_rpc_client`, they are already protected by an earlier check for knowing the peer's topology (the `has_topology()` lambda). Another use is in `do_start_listen()`, where we create a filter for RPC module to check if it should accept incoming connections. If cross-dc or cross-rack encryption is enabled, we will reject connection attempts to the regular (non-ssl) port from other dcs/racks using `is_same_dc/rack`. However, it might happen that something (other Scylla node or otherwise) tries to contact us on the regular port and we don't know that thing's topology, which would result in `on_internal_error`. But this is not a fatal error; we simply want to reject that connection. So protect these calls as well. Finally, there's `get_preferred_ip` with an unprotected `is_same_dc` call which, for a given peer, may return a different IP from preferred IP cache if the endpoint resides in the same DC. If there is no entry in the preferred IP cache, we return the original (external) IP of the peer. We can do the same if we don't know the peer's topology. It's interesting that we didn't see this particular place blowing up. Perhaps the preferred IP cache is always populated after we know the topology.	2022-11-16 14:01:50 +01:00
Kamil Braun	9b2449d3ea	test: reenable test_topology::test_decommission_node_add_column Also improve the test to increase the probability of reproducing #11780 by injecting sleeps in appropriate places. Without the fix for #11780 from the earlier commit, the test reproduces the issue in roughly half of all runs in dev build on my laptop.	2022-11-16 14:01:50 +01:00
Kamil Braun	0f49813312	test/pylib: util: configurable period in wait_for	2022-11-16 14:01:50 +01:00
Kamil Braun	1bd2471c19	message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client `get_rpc_client` calculates a `topology_ignored` field when creating a client which says whether the client's endpoint had topology information when this client was created. This is later used to check if that client needs to be dropped and replaced with a new client which uses the correct topology information. The `topology_ignored` field was incorrectly calculated as `true` for pending endpoints even though we had topology information for them. This would lead to unnecessary drops of RPC clients later. Fix this. Remove the default parameter for `with_pending` from `topology::has_endpoint` to avoid similar bugs in the future. Apparently this fixes #11780. The verbs used by decommission operation use RPC client index 1 (see `do_get_rpc_client_idx` in message/messaging_service.cc). From local testing with additional logging I found that by the time this client is created (i.e. the first verb in this group is used), we already know the topology. The node is pending at that point - hence the bug would cause us to assume we don't know the topology, leading us to dropping the RPC client later, possibly in the middle of a decommission operation. Fixes: #11780	2022-11-16 14:01:50 +01:00
Kamil Braun	840be34b5f	message: messaging_service: topology independent connection settings for GOSSIP verbs The gossip verbs are used to learn about topology of other nodes. If inter-dc/rack encryption is enabled, the knowledge of topology is necessary to decide whether it's safe to send unencrypted messages to nodes (i.e., whether the destination lies in the same dc/rack). The logic in `messaging_service::get_rpc_client`, which decided whether a connection must be encrypted, was this (given that encryption is enabled): if the topology of the peer is known, and the peer is in the same dc/rack, don't encrypt. Otherwise encrypt. However, it may happen that node A knows node B's topology, but B doesn't know A's topology. A deduces that B is in the same DC and rack and tries sending B an unencrypted message. As the code currently stands, this would cause B to call `on_internal_error`. This is what I encountered when attempting to fix #11780. To guarantee that it's always possible to deliver gossiper verbs (even if one or both sides don't know each other's topology), and to simplify reasoning about the system in general, choose connection settings that are independent of the topology - for the connection used by gossiper verbs (other connections are still topology-dependent and use complex logic to handle the situation of unknown-and-later-known topology). This connection only contains 'rare' and 'cheap' verbs, so it's not a performance problem to always encrypt it (given that encryption is configured). And this is what already was happening in the past; it was at some point removed during topology knowledge management refactors. We just bring this logic back. Fixes #11992. Inspired by xemul/scylla@45d48f3d02.	2022-11-16 13:58:07 +01:00
Anna Stuchlik	01c9846bb6	doc: add the link to the Enabling Experimental Features section	2022-11-16 13:24:45 +01:00
Anna Stuchlik	f1b2f44aad	doc: move the TTL Alternator feature from the Experimental Features section to the production-ready section	2022-11-16 13:23:07 +01:00
Nadav Har'El	2f2f01b045	materialized views: fix view writes after base table schema change When we write to a materialized view, we need to know some information defined in the base table such as the columns in its schema. We have a "view_info" object that tracks each view and its base. This view_info object has a couple of mutable attributes which are used to lazily-calculate and cache the SELECT statement needed to read from the base table. If the base-table schema ever changes - and the code calls set_base_info() at that point - we need to forget this cached statement. If we don't (as before this patch), the SELECT will use the wrong schema and writes will no longer work. This patch also includes a reproducing test that failed before this patch, and passes afterwards. The test creates a base table with a view that has a non-trivial SELECT (it has a filter on one of the base-regular columns), makes a benign modification to the base table (just a silly addition of a comment), and then tries to write to the view - and before this patch it fails. Fixes #10026 Fixes #11542	2022-11-16 13:58:21 +02:00
Nadav Har'El	7cbb0b98bb	Merge 'doc: document user defined functions (UDFs)' from Anna Stuchlik This PR is V2 of the[ PR created by @psarna.](https://github.com/scylladb/scylladb/pull/11560). I have: - copied the content. - applied the suggestions left by @nyh. - made minor improvements, such as replacing "Scylla" with "ScyllaDB", fixing punctuation, and fixing the RST syntax. Fixes https://github.com/scylladb/scylladb/issues/11378 Closes #11984 * github.com:scylladb/scylladb: doc: label user-defined functions as Experimental doc: restore the note for the Count function (removed by mistatke) doc: document user defined functions (UDFs)	2022-11-16 13:09:47 +02:00
Botond Dénes	cbf9be9715	Merge 'Avoid 0.0.0.0 (and :0) as preferred IP' from Pavel Emelyanov Although the docs discourage using INADDR_ANY as the listen address, this is not disabled in code. Worse -- some snitch drivers may gossip it around as the INTERNAL_IP state. This set prevents this from happening and also adds a sanity check not to use this value if it somehow sneaks in. Closes #11846 * github.com:scylladb/scylladb: messaging_service: Deny putting INADD_ANY as preferred ip messaging_service: Toss preferred ip cache management gossiping_property_file_snitch: Dont gossip INADDR_ANY preferred IP gossiping_property_file_snitch: Make _listen_address optional	2022-11-16 08:30:42 +02:00
Avi Kivity	43d3e91e56	tools: toolchain: prepare: use real bash associative array When we translate from docker/go arch names to the kernel arch names, we use an associative array hack using computed variable names "${!variable_name}". But it turns out bash has real associative arrays, introduced with "declare -A". Use them to make the code a little clearer. Closes #11985	2022-11-16 08:17:47 +02:00
Botond Dénes	e90d0811d0	Merge 'doc: update ScyllaDB requirements - supported CPUs and AWS i4g instances' from Anna Stuchlik Fix https://github.com/scylladb/scylla-docs/issues/4144 Closes #11226 * github.com:scylladb/scylladb: Update docs/getting-started/system-requirements.rst doc: specify the recommended AWS instance types doc: replace the tables with a generic description of support for Im4gn and Is4gen instances doc: add support for AWS i4g instances doc: extend the list of supported CPUs	2022-11-16 08:15:00 +02:00
Botond Dénes	bd1fcbc38f	Merge 'Introduce reverse vector_deserializer.' from Michał Radwański As indicated in #11816, we'd like to enable deserializing vectors in reverse. The forward deserialization is achieved by reading from an input_stream. The input stream internally is a singly linked list with complicated logic. In order to allow for going through it in reverse, instead when creating the reverse vector initializer, we scan the stream and store substreams to all the places that are a starting point for a next element. The iterator itself just deserializes elements from the remembered substreams, this time in reverse. Fixes #11816 Closes #11956 * github.com:scylladb/scylladb: test/boost/serialization_test.cc: add test for reverse vector deserializer serializer_impl.hh: add reverse vector serializer serializer_impl: remove unneeded generic parameter	2022-11-16 07:37:24 +02:00
Anna Stuchlik	cdb6557f23	doc: label user-defined functions as Experimental	2022-11-15 21:22:01 +01:00
Avi Kivity	d85f731478	build: update toolchain to Fedora 37 with clang 15 'cargo' instantiation now overrides internal git client with cli client due to unbounded memory usage [1]. [1] https://github.com/rust-lang/cargo/issues/10583#issuecomment-1129997984	2022-11-15 16:48:09 +00:00
Anna Stuchlik	1f1d88d04e	doc: restore the note for the Count function (removed by mistatke)	2022-11-15 17:41:22 +01:00
Anna Stuchlik	dbb19f55fb	doc: document user defined functions (UDFs)	2022-11-15 17:33:05 +01:00
Nadav Har'El	e4dba6a830	test/cql-pytest: add test for when MV requires IS NOT NULL As noted in issue #11979, Scylla inconsistently (and unlike Cassandra) requires "IS NOT NULL" on some but not all materialized-view key columns. Specifically, Scylla does not require "IS NOT NULL" on the base's partition key, while Cassandra does. This patch is a test which demonstrates this inconsistency. It currently passes on Cassandra and fails on Scylla, so is marked xfail. Refs #11979 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11980	2022-11-15 14:21:48 +01:00
Asias He	16bd9ec8b1	gossip: Improve get_live_token_owners and get_unreachable_token_owners The get_live_token_owners returns the nodes that are part of the ring and live. The get_unreachable_token_owners returns the nodes that are part of the ring and are not alive. The token_metadata::get_all_endpoints returns nodes that are part of the ring. The patch changes both functions to use the more authoritative source to get the nodes that are part of the ring and call is_alive to check if the node is up or down, so that the correctness does not depend on any derived information. This patch fixes a truncate issue in storage_proxy::truncate_blocking where it calls get_live_token_owners and get_unreachable_token_owners to decide the nodes to talk with for truncate operation. The truncate failed because incorrect nodes were returned. Fixes #10296 Fixes #11928 Closes #11952	2022-11-15 14:21:48 +01:00
Botond Dénes	21489c9f9c	Merge 'doc: add the "Scylladb Enterprise" label to the Enterprise-only features' from Anna Stuchlik This PR is a follow-up to https://github.com/scylladb/scylladb/pull/11918. With this PR: - The "ScyllaDB Enterprise" label is added to all the features that are only available in ScyllaDB Enterprise. - The previous Enterprise-only note is removed (it was included in multiple files as _/rst_include/enterprise-only-note.rst_ - this file is removed as it is no longer used anywhere in the docs). - "Scylla Enterprise" was removed from `versionadded `because now it's clear that the feature was added for Enterprise. Closes #11975 * github.com:scylladb/scylladb: doc: remove the enterprise-only-note.rst file, which was replaced by the ScyllaDB Enterprise label and is not used anymore doc: add the ScyllaDB Enterprise label to the descriptions of Enterprise-only features	2022-11-15 14:21:48 +01:00
Botond Dénes	34f29c8d67	Merge 'Use with_sstable_directory() helper in tests' from Pavel Emelyanov The helper is already widely used, one (last) test case can benefit from using it too Closes #11978 * github.com:scylladb/scylladb: test: Indentation fix after previous patch test: Wse with_sstable_directory() helper	2022-11-15 14:21:48 +01:00
Nadav Har'El	8a4ab87e44	Merge 'utils: crc: generate crc barrett fold tables at compile time' from Avi Kivity We use Barrett tables (misspelled in the code unfortunately) to fold crc computations of multiple buffers into a single crc. This is important because it turns out to be faster to compute crc of three different buffers in parallel rather than compute the crc of one large buffer, since the crc instruction has latency 3. Currently, we have a separate code generation step to compute the fold tables. The step generates a new C++ source files with the tables. But modern C++ allows us to do this computation at compile time, avoiding the code generation step. This simplifies the build. This series does that. There is some complication in that the code uses compiler intrinsics for the computation, and these are not constexpr friendly. So we first introduce constexpr-friendly alternatives and use them. To prove the transformation is correct, I compared the generated code from before the series and from just before the last step (where we use constexpr evaluation but still retain the generated file) and saw no difference in the values. Note that constexpr is not strictly needed - we could have run the code in the global variables' initializer. But that would cause a crash if we run on a pre-clmul machine, and is not as fun. Closes #11957 * github.com:scylladb/scylladb: test: crc: add unit tests for constexpr clmul and barrett fold utils: crc combine table: generate at compile time utils: barrett: inline functions in header utils: crc combine table: generate tables at compile time utils: crc combine table: extract table generation into a constexpr function utils: crc combine table: extract "pow table" code into constexpr function utils: crc combine table: store tables std::arrray rather than C array utils: barrett: make the barrett reduction constexpr friendly utils: clmul: add 64-bit constexpr clmul utils: barrett: extract barrett reduction constants utils: barrett: reorder functions utils: make clmul() constexpr	2022-11-15 14:21:48 +01:00
Petr Gusev	ae3e0e3627	raft: rafactor: remove duplicate code on retries delays Introduce a templated function do_on_leader_with_retries, use it in add_entries/modify_config/read_barrier. The function implements the basic logic of retries with aborts and leader changes handling, adds a delay between iterations to protect against tight loops.	2022-11-15 13:18:53 +04:00
Petr Gusev	15cc1667d0	raft: use wait_for_next_tick in read_barrier Replaced the yield on transport_error with wait_for_next_tick. Added delays for retries, similar to add_entry/modify_config: we postpone the next call attempt if we haven't received new information about the current leader.	2022-11-15 12:31:49 +04:00
Petr Gusev	5e15c3c9bd	raft: wait for the next tick before retrying When modify_config or add_entry is forwarded to the leader, it may reach the node at an "inappropriate" time and result in an exception. There are two reasons for it - the leader is changing and, in case of modify_config, another modify_config is currently in progress. In both cases the command is retried, but before this patch there was no delay before retrying, which could lead to a tight loop. The patch adds a new exception type transient_error. When the client node receives it, it is obliged to retry the request, possibly after some delay. Previously, leader-side exceptions were converted to not_a_leader exception, which is strange, especially for conf_change_in_progress. We add a delay before retrying in modify_config and add_entry if the client hasn't received any new information about the leader since the last attempt. This can happen if the server responds with a transient_error with an empty leader and the current node has not yet learned the new leader. We neglect an excessive delay if the newly elected leader is the same as the previous one, as this is supposed to be rare. Fixes: #11564	2022-11-15 11:49:26 +04:00
Pavel Emelyanov	8dcd9d98d6	test: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-14 20:11:01 +03:00
Pavel Emelyanov	c9128e9791	test: Wse with_sstable_directory() helper It's already used everywhere, but one test case wires up the sstable_directory by hand. Fix it too, but keep in mind, that the caller fn stops the directory early. (indentation is deliberately left broken) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-14 20:11:01 +03:00
Michał Radwański	32c60b44c5	test/boost/serialization_test.cc: add test for reverse vector deserializer This test is just a copy-pasted version of forward serializer test.	2022-11-14 16:06:24 +01:00
Michał Radwański	dce67f42f8	serializer_impl.hh: add reverse vector serializer Currently when we want to deserialize mutation in reverse, we unfreeze it and consume from the end. This new reverse vector deserializer goes through input stream remembering substreams that contain a given output range member, and while traversing from the back, deserialize each substream.	2022-11-14 16:06:24 +01:00
Anna Stuchlik	e36bd208cc	doc: remove the enterprise-only-note.rst file, which was replaced by the ScyllaDB Enterprise label and is not used anymore	2022-11-14 15:20:51 +01:00
Anna Stuchlik	36324fe748	doc: add the ScyllaDB Enterprise label to the descriptions of Enterprise-only features	2022-11-14 15:16:51 +01:00
Takuya ASADA	da6c472db9	install.sh: Skip systemd existance check when --without-systemd When --without-systemd is specified, install.sh should skip the systemd existence check. Fixes #11898 Closes #11934	2022-11-14 14:07:46 +02:00
Benny Halevy	ff5527deb1	topology: copy _sort_by_proximity in copy constructor Fixes #11962 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11965	2022-11-14 13:59:56 +03:00
Pavel Emelyanov	bd48fdaad5	Merge 'handle_state_normal: do not update topology of removed endpoint' from Benny Halevy Currently, when replacing a node ip, keeping the old host, we might end up with the old endpoint in system.peers if it is inserted back into the topology by `handle_state_normal` when on_join is called with the old endpoint. Then, later on, on_change sees that: ``` if (get_token_metadata().is_member(endpoint)) { co_await do_update_system_peers_table(endpoint, state, value); ``` As described in #11925. Fixes #11925 Closes #11930 * github.com:scylladb/scylladb: storage_service, system_keyspace: add debugging around system.peers update storage_service: handle_state_normal: update topology and notify_joined endpoint only if not removed	2022-11-14 13:58:28 +03:00
Botond Dénes	8e38551d93	Merge 'Allow each compaction group to have its own compaction backlog tracker' from Raphael "Raph" Carvalho Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction_group will be allowed to create its own tracker and manage it by itself. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Closes #11762 * github.com:scylladb/scylladb: replica: Allow one compaction_backlog_tracker for each compaction_group compaction: Make compaction_state available for compaction tasks being stopped compaction: Implement move assignment for compaction_backlog_tracker compaction: Fix compaction_backlog_tracker move ctor compaction: Use table_state's backlog tracker in compaction_read_monitor_generator compaction: kill undefined get_unimplemented_backlog_tracker() replica: Refactor table::set_compaction_strategy for multiple groups Fix exception safety when transferring ongoing charges to new backlog tracker replica: move_sstables_from_staging: Use tracker from group owning the SSTable replica: Move table::backlog_tracker_adjust_charges() to compaction_group replica: table::discard_sstables: Use compaction_group's backlog tracker replica: Disable backlog tracker in compaction_group::stop() replica: database_sstable_write_monitor: use compaction_group's backlog tracker replica: Move table::do_add_sstable() to compaction_group test/sstable_compaction_test: Switch to table_state::get_backlog_tracker() compaction/table_state: Introduce get_backlog_tracker()	2022-11-14 07:05:28 +02:00
Avi Kivity	b8cb34b928	test: crc: add unit tests for constexpr clmul and barrett fold Check that the constexpr variants indeed match the runtime variants. I verified manually that exactly one computation in each test is executed at run time (and is compared against a constant).	2022-11-13 16:22:29 +02:00
Avi Kivity	70217b5109	utils: crc combine table: generate at compile time By now the crc combine tables are generated at compile time, but still in a separate code generation step. We now eliminate the code generation step and instead link the global variables directly into the main executable. The global variables have been conveniently named exactly as the code generation step names them, so we don't need to touch any users.	2022-11-12 17:26:45 +02:00
Avi Kivity	164e991181	utils: barrett: inline functions in header Avoid duplicate definitions if the same header is used from more than one place, as it will soon be.	2022-11-12 17:26:08 +02:00
Avi Kivity	a4f06773da	utils: crc combine table: generate tables at compile time Move the tables into global constinit variables that are generated at compile time. Note the code that creates the generated crc32_combine_table.cc is still called; it transforms compile-time generated tables into a C++ source that contains the same values, as literals. If we generate a diff between gen/utils/gz/crc_combine_table.cc before this series and after this patch, we see the only change in the file is the type of the variable (which changed to std::array), proving our constexpr code is correct.	2022-11-12 17:16:59 +02:00
Avi Kivity	a229fdc41e	utils: crc combine table: extract table generation into a constexpr function Move the code to a constexpr function, so we can later generate the tables at compile time. Note that although the function is constexpr, it is still evaluated at runtime, since the calling function (main()) isn't constexpr itself.	2022-11-12 17:13:52 +02:00
Avi Kivity	d42bec59bb	utils: crc combine table: extract "pow table" code into constexpr function A "pow table" is used to generate the Barrett fold tables. Extract its code into a constexpr function so we can later generate the fold tables at compile time.	2022-11-12 17:11:44 +02:00
Avi Kivity	6e34014b64	utils: crc combine table: store tables std::arrray rather than C array C arrays cannot be returned from functions and therefore aren't suitable for constexpr processing. std::array<> is a regular value and so is constexpr friendly.	2022-11-12 17:09:02 +02:00
Avi Kivity	1e9252f79a	utils: barrett: make the barrett reduction constexpr friendly Dispatch to intrinsics or constexpr based on evaluation context.	2022-11-12 17:04:44 +02:00
Avi Kivity	0bd90b5465	utils: clmul: add 64-bit constexpr clmul This is used when generating the Barrett reduction tables, and also when applying the Barrett reduction at runtime, so we need it to be constexpr friendly.	2022-11-12 17:04:05 +02:00
Avi Kivity	c376c539b8	utils: barrett: extract barrett reduction constants The constants are repeated across x86_64 and aarch64, so extract them into a common definition.	2022-11-12 17:00:17 +02:00
Avi Kivity	2fdf81af7b	utils: barrett: reorder functions Reorder functions in dependency order rather than forward declaring them. This makes them more constexpr-friendly.	2022-11-12 16:52:41 +02:00
Avi Kivity	8aa59a897e	utils: make clmul() constexpr clmul() is a pure function and so should already be constexpr, but it uses intrinsics that aren't defined as constexpr and so the compiler can't really compute it at compile time. Fix by defining a constexpr variant and dispatching based on whether we're being constant-evaluated or not. The implementation is simple, but in any case proof that it is correct will be provided later on.	2022-11-12 16:49:43 +02:00
Raphael S. Carvalho	b88acffd66	replica: Allow one compaction_backlog_tracker for each compaction_group Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction group will be allowed to have its own tracker, which will be managed by compaction manager. On compaction strategy change, table will update each group with the new tracker, which is created using the previously introduced ompaction_group_sstable_set_updater. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:51 -03:00
Raphael S. Carvalho	d862dd815c	compaction: Make compaction_state available for compaction tasks being stopped compaction_backlog_tracker will be managed by compaction_manager, in the per table state. As compaction tasks can access the tracker throughout its lifetime, remove() can only deregister the state once we're done stopping all tasks which map to that state. remove() extracted the state upfront, then performed the stop, to prevent new tasks from being registered and left behind. But we can avoid the leak of new tasks by only closing the gate, which waits for all tasks (which are stopped a step earlier) and once closed, prevents new tasks from being registered. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:51 -03:00
Raphael S. Carvalho	0a152a2670	compaction: Implement move assignment for compaction_backlog_tracker That's needed for std::optional to work on its behalf. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:49 -03:00
Raphael S. Carvalho	fe305cefd0	compaction: Fix compaction_backlog_tracker move ctor Luckily it's not used anywhere. Default move ctor was picked but it won't clear _manager of old object, meaning that its destructor will incorrectly deregister the tracker from compaction_backlog_manager. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	8e1e30842d	compaction: Use table_state's backlog tracker in compaction_read_monitor_generator A step closer towards a separate backlog tracker for each compaction group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	fedafd76eb	compaction: kill undefined get_unimplemented_backlog_tracker() Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	90991bda69	replica: Refactor table::set_compaction_strategy for multiple groups Refactoring the function for it to accommodate multiple compaction groups. To still provide strong exception guarantees, preparation and execution of changes will be separated. Once multiple groups are supported, each group will be prepared first, and the noexcept execution will be done as a last step. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	244efddb22	Fix exception safety when transferring ongoing charges to new backlog tracker When setting a new strategy, the charges of the old tracker are transferred to the new one. The problem is that we're not reverting changes if an exception is triggered before the new strategy is successfully set. To fix this exception safety issue, let's copy the charges instead of moving them. If an exception is triggered, the old tracker is still the one used and remains intact. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	d1e2dbc592	replica: move_sstables_from_staging: Use tracker from group owning the SSTable When moving SSTables from staging directory, we'll conditionally add them to backlog tracker. As each group has its own tracker, a given sstable will be added to the tracker of the group that owns it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:37 -03:00
Raphael S. Carvalho	9031dc3199	replica: Move table::backlog_tracker_adjust_charges() to compaction_group Procedures that call this function happen to be in compaction_group, so let's move it to group. Simplifies the change where the procedure retrieves tracker from the group itself. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	116459b69e	replica: table::discard_sstables: Use compaction_group's backlog tracker Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	b2d8545b15	replica: Disable backlog tracker in compaction_group::stop() As we're moving backlog tracker to compaction group, we need to stop the tracker there too. We're moving it a step earlier in table::stop(), before sstables are cleared, but that's okay because it's still done after the group was deregistered from compaction manager, meaning no compactions are running. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	91b0d772e2	replica: database_sstable_write_monitor: use compaction_group's backlog tracker Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	f37a05b559	replica: Move table::do_add_sstable() to compaction_group All callers of do_add_sstable() live in compaction_group, so it should be moved into compaction_group too. It also makes easier for the function to retrieve the backlog tracker from the group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	835927a2ad	test/sstable_compaction_test: Switch to table_state::get_backlog_tracker() Important for decoupling backlog tracker from table's compaction strategy. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Raphael S. Carvalho	1ec0ef18a5	compaction/table_state: Introduce get_backlog_tracker() This interface will be helpful for allowing replica::table, unit tests and sstables::compaction to access the compaction group's tracker which will be managed by the compaction manager, once we complete the decoupling work. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:17:36 -03:00
Nadav Har'El	ff87624fb4	test/cql-pytest: add another regression test for reversed-type bug In commit `544ef2caf3` we fixed a bug where a reversed clustering-key order caused problems using a secondary index because of incorrect type comparison. That commit also included a regression test for this fix. However, that fix was incomplete, and was improved later in commit `c8653d1321`. That later fix was labeled "better safe than sorry", and did not include a test demonstrating any actual bug, so unsurprisingly we never backported that second fix to any older branches. Recently we discovered that missing the second patch does cause real problems, and this patch includes a test which fails when the first patch is in, but the second patch isn't (and passes when both patches are in, and also passes on Cassandra). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11943	2022-11-11 11:01:22 +02:00
Botond Dénes	302917f63d	mutation_compactor: add validator The mutation compactor is used on most read-paths we have, so adding a validator to it gives us a good coverage, in particular it gives us full coverage of queries and compaction. The validator validates mutation token (and mutation fragment kind) monotonicity as that is quite cheap, while it is enough to catch the most common problems. As we already have a validator on the compaction path (in the sstable writer), the validator is disabled when the mutation compactor is instantiated for compaction. We should probably make this configurable at some point. The addition of this validator should prevent the worst of the fragment reordering bugs from affecting reads.	2022-11-11 10:26:05 +02:00
Botond Dénes	5c245b4a5e	mutation_fragment_stream_validator: add a 'none' validation level Which, as its name suggests, makes the validating filter not validate anything at all. This validation level can be used effectively to make it so as if the validator was not there at all.	2022-11-11 09:58:44 +02:00
Botond Dénes	a4b58f5261	test/boost/mutation_query_test: test_partition_limit: sort input data The test's input data is currently out-of-order, violating a fundamental invariant of data always being sorted. This doesn't cause any problems right now, but soon it will. Sort it to avoid it.	2022-11-11 09:58:44 +02:00
Botond Dénes	2c551bb7ce	querier: consume_page(): use partition_start as the sentinel value Said method calls `compact_mutation_state::start_new_page()` which requires the kind of the next fragment in the reader. When there is no fragment (reader is at EOS), we use partition-end. This was a poor choice: if the reader is at EOS, partition-end was the last fragment kind, so if the stream were to continue, the next fragment would be a partition-start.	2022-11-11 09:58:18 +02:00
Botond Dénes	0bcfc9d522	treewide: use ::for_partition_end() instead of ::end_of_partition_tag_t{} We just added a convenience static factory method for partition end, change the present users of the clunky constructor+tag to use it instead.	2022-11-11 09:58:18 +02:00
Botond Dénes	f1a039fc2b	treewide: use ::for_partition_start() instead of ::partition_start_tag_t{} We just added a convenience static factory method for partition start, change the present users of the clunky constructor+tag to use it instead.	2022-11-11 09:58:18 +02:00
Botond Dénes	6a002953e9	position_in_partition: add for_partition_{start,end}()	2022-11-11 09:58:18 +02:00
Kamil Braun	4a2ec888d5	Merge 'test.py: use internal id to manage servers' from Alecco Instead of using assigned IP addresses, use a local integer ID for managing servers. IP address can be reused by a different server. While there, get host ID (UUID). This can also be reused with `node replace` so it's not good enough for tracking. Closes #11747 * github.com:scylladb/scylladb: test.py: use internal id to manage servers test.py: rename hostname to ip_addr test.py: get host id test.py: use REST api client in ScyllaCluster test.py: remove unnecessary reference to web app test.py: requests without aiohttp ClientSession	2022-11-10 17:12:16 +01:00
Kamil Braun	1cc68b262e	docs: describe the Raft upgrade and recovery procedures In the 5.1 -> 5.2 upgrade doc, include additional steps for enabling Raft using the `consistent_cluster_management` flag. Note that we don't have this flag yet but it's planned to replace the experimental flag in 5.2. In the "Raft in ScyllaDB" document, add sections about: - enabling Raft in existing clusters in Scylla 5.2, - verifying that the internal Raft upgrade procedure finishes successfully, - recovering from a stuck Raft upgrade procedure or from a majority loss situation. Fix some problems in the documentation, e.g. it is not possible to enable Raft in an existing cluster in 5.0, but the documentation claimed that it is. Follow-up items: - if we decide for a different name for `consistent_cluster_management`, use that name in the docs instead - update the warnings in Scylla to link to the Raft doc - mention Enterprise versions once we know the numbers - update the appropriate upgrade docs for Enterprise versions once they exist	2022-11-10 17:08:57 +01:00
Kamil Braun	3dab07ec11	docs: add upgrade guide 5.1 -> 5.2 It's a copy-paste from the 5.0 -> 5.1 guide with substitutions: s/5.1/5.2, s/5.0/5.1 The metric update guide is not written, I left a TODO. Also I didn't include the guide in docs/upgrade/upgrade-opensource/index.rst, since 5.2 is not released yet. The guide can be accessed by manually following the link: /upgrade/upgrade-opensource/upgrade-guide-from-5.1-to-5.2/	2022-11-10 16:49:14 +01:00
Alejo Sanchez	700054abee	test.py: use internal id to manage servers Instead of using assigned IP addresses, use an internal server id. Define types to distinguish local server id, host ID (UUID), and IP address. This is needed to test servers changing IP address and for node replace (host UUID). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	1e38f5478c	test.py: rename hostname to ip_addr The code explicitly manages an IP as string, make it explicit in the variable name. Define its type and test for set in the instance instead of using an empty string as placeholder. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	f478eb52a3	test.py: get host id When initializing a ScyllaServer, try to get the host id instead of only checking the REST API is up. Use the existing aiohttp session from ScyllaCluster. In case of HTTP error check the status was not an internal error (500+). Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	78663dda72	test.py: use REST api client in ScyllaCluster Move the REST api client to ScyllaCluster. This will allow the cluster to query its own servers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	75ea345611	test.py: remove unnecessary reference to web app The aiohttp.web.Application only needs to be passed, so don't store a reference in ScyllaCluster object. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Alejo Sanchez	a5316b0c6b	test.py: requests without aiohttp ClientSession Simplify REST helper by doing requests without a session. Reusing an aiohttp.ClientSession causes knock-on effects on `rest_api/test_task_manager` due to handling exceptions outside of an async with block. Requests for cluster management and Scylla REST API don't need session, anyway. Raise HTTPError with status code, text reason, params, and json. In ScyllaCluster.install_and_start() instead of adding one more custom exception, just catch all exceptions as they will be re-raised later. While there avoid code duplication and improve sanity, type checking, and lint score. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2022-11-10 09:14:37 +01:00
Botond Dénes	21bc37603a	Merge 'utils: config_src: add set_value_on_all_shards functions' from Benny Halevy Currently when we set a single value we need to call broadcast_to_all_shards to let observers on all shards get notified of the new value. However, the latter broadcasts all values to all shards so it's terribly inefficient. Instead, add async set_value_on_all_shards functions to broadcast a value to all shards. Use those in system_keyspace for db_config_table virtual table and in task_manager_test to update the task_manager ttl. Refs #7316 Closes #11893 * github.com:scylladb/scylladb: tests: check ttl on different shards utils: config_src: add set_value_on_all_shards functions utils: config_file: add config_source::API	2022-11-10 07:16:39 +02:00
Botond Dénes	3aff59f189	Merge 'staging sstables: filter tokens for view update generation' from Benny Halevy This mini-series introduces dht::tokens_filter and uses it for consuming staging sstable in the view_update_generator. The tokens_filter uses the token ranges owned by the current node, as retrieved by get_keyspace_local_ranges. Refs #9559 Closes #11932 * github.com:scylladb/scylladb: db: view_update_generator: always clean up staging sstables compaction: extract incremental_owned_ranges_checker out to dht	2022-11-10 07:00:51 +02:00
Avi Kivity	9b6ab5db4a	Update seastar submodule * seastar e0dabb361f...153223a188 (8): > build: compile dpdk with -fpie (position independent executable) > Merge 'io_request: remove ctor overloads of io_request and s/io_request/const io_request/' from Kefu Chai > iostream: remove unused function > smp: destroy_smp_service_group: verify smp_service_group id > core/circular_buffer: refactor loop in circular_buffer::erase() > Merge 'Outline reactor::add_task() and sanitize reactor::shuffle() methods' from Pavel Emelyanov > Add NOLINT for cert-err58-cpp > tests: Fix false-positive use-after-free detection Closes #11940	2022-11-09 23:36:50 +02:00
Aleksandra Martyniuk	b0ed4d1f0f	tests: check ttl on different shards Test checking if ttl is properly set is extended to check whether the ttl value is changed on non-zero shard.	2022-11-09 16:58:46 +02:00
Botond Dénes	725e5b119d	Revert "replica: Pick new generation for SSTables being moved from staging dir" This reverts commit `ba6186a47f`. Said commit violates the widely held assumption that sstable generations can be used as sstable identity. One known problem caused by this is a potential OOO partition emitted when reading from sstables (#11843). We now also have a better fix for #11789 (the bug this commit was meant to fix): `4aa0b16852`. So we can revert without regressions. Fixes: #11843 Closes #11886	2022-11-09 16:35:31 +02:00
Eliran Sinvani	ab7429b77d	cql: Fix crash upon use of the word empty for service level name Wrong access to an uninitialized token instead of the actual generated string caused the parser to crash. This wasn't detected by the ANTLR3 compiler because all the temporary variables defined in the ANTLR3 statements are global in the generated code. This essentially caused a null dereference. Tests: 1. The fixed issue scenario from github. 2. Unit tests in release mode. Fixes #11774 Signed-off-by: Eliran Sinvani <eliransin@scylladb.com> Message-Id: <20190612133151.20609-1-eliransin@scylladb.com> Closes #11777	2022-11-09 15:58:57 +02:00
Anna Stuchlik	d2e54f7097	Merge branch 'master' into anna-requirements-arm-aws	2022-11-09 14:39:00 +01:00
Anna Stuchlik	8375304d9b	Update docs/getting-started/system-requirements.rst Co-authored-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2022-11-09 14:37:34 +01:00
Benny Halevy	38d8777d42	storage_service, system_keyspace: add debugging around system.peers update Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 14:45:47 +02:00
Benny Halevy	5401b6055c	storage_service: handle_state_normal: update topology and notify_joined endpoint only if not removed Currently, when replacing a node ip, keeping the old host, we might end up with the old endpoint in system.peers if it is inserted back into the topology by `handle_state_normal` when on_join is called with the old endpoint. Then, later on, on_change sees that: ``` if (get_token_metadata().is_member(endpoint)) { co_await do_update_system_peers_table(endpoint, state, value); ``` As described in #11925. Fixes #11925 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 14:45:22 +02:00
Benny Halevy	1a183047c0	utils: config_src: add set_value_on_all_shards functions Currently when we set a single value we need to call broadcast_to_all_shards to let observers on all shards get notified of the new value. However, the latter broadcasts all values to all shards so it's terribly inefficient. Instead, add async set_value_on_all_shards functions to broadcast a value to all shards. Use those in system_keyspace for db_config_table virtual table and in task_manager_test to update the task_manager ttl. Refs #7316 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 11:55:14 +02:00
Benny Halevy	e83f42ec70	utils: config_file: add config_source::API For task_manager test api. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 11:53:20 +02:00
Botond Dénes	94db2123b9	Update tools/java submodule * tools/java 583261fc0e...caf754f243 (1): > build: remove JavaScript snippets in ant build file	2022-11-09 07:59:04 +02:00
Benny Halevy	10f8f13b90	db: view_update_generator: always clean up staging sstables Since staging sstables are currently not cleaned up by cleanup compaction, filter their tokens, processing only tokens owned by the current node (based on the keyspace replication strategy). Refs #9559 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 07:38:22 +02:00
Benny Halevy	fd3e66b0cc	compaction: extract incremental_owned_ranges_checker out to dht It is currently used by cleanup_compaction partition filter. Factor it out so it can be used to filter staging sstables in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-09 07:32:56 +02:00
Gleb Natapov' via ScyllaDB development	2100a8f4ca	service: raft: demote configuration change error to warning since it is retried anyway Message-Id: <Y2ohbFtljmd5MNw0@scylladb.com>	2022-11-09 00:09:39 +01:00
Avi Kivity	04ecf4ee18	Update tools/java submodule (cassandra-stress fails with node down) * tools/java 87672be28e...583261fc0e (1): > cassandra-stress: pass all hosts stright to the driver	2022-11-08 14:58:14 +02:00
Botond Dénes	7f69cccbdf	scylla-gdb.py: $downcast_vptr(): add multiple inheritance support When a class inherits from multiple virtual base classes, pointers to instances of this class via one of its base classes, might point to somewhere into the object, not at its beginning. Therefore, the simple method employed currently by $downcast_vptr() of casting the provided pointer to the type extracted from the vtable name fails. Instead when this situation is detected (detectable by observing that the symbol name of the partial vtable is not to an offset of +16, but larger), $downcast_vptr() will iterate over the base classes, adjusting the pointer with their offsets, hoping to find the true start of the object. In the one instance I tested this with, this method worked well. At the very least, the method will now yield a null pointer when it fails, instead of a badly casted object with corrupt content (which the developer might or might not attribute to the bad cast). Closes #11892	2022-11-08 14:51:26 +02:00
Michał Chojnowski	3e0c7a6e9f	test: sstable_datafile_test: eliminate a use of std::regex to prevent stack overflow This usage of std::regex overflows the seastar::thread stack size (128 KiB), causing memory corruption. Fix that. Closes #11911	2022-11-08 14:41:34 +02:00
Botond Dénes	2037d7f9cd	Merge 'doc: add the "ScyllaDB Enterprise" label to highlight the Enterprise-only features' from Anna Stuchlik This PR adds the "ScyllaDB Enterprise" label to highlight the Enterprise-only features on the following pages: - Encryption at Rest - the label indicates that the entire page is about an Enterprise-only feature. - Compaction - the labels indicate the sections that are Enterprise-only. There are more occurrences across the docs that require a similar update. I'll update them in another PR if this PR is approved. Closes #11918 * github.com:scylladb/scylladb: doc: fix the links to resolve the warnings doc: add the Enterprise label on the Compaction page (to a subheading and on a list of strategies) to replace the info box doc: add the Enterprise label to the Encryption at Rest page (the entire page) to replace the info box	2022-11-08 09:53:48 +02:00
Raphael S. Carvalho	a57724e711	Make off-strategy compaction wait for view building completion Prior to off-strategy compaction, streaming / repair would place staging files into main sstable set, and wait for view building completion before they could be selected for regular compaction. The reason for that is that view building relies on table providing a mutation source without data in staging files. Had regular compaction mixed staging data with non-staging one, table would have a hard time providing the required mutation source. After off-strategy compaction, staging files can be compacted in parallel to view building. If off-strategy completes first, it will place the output into the main sstable set. So a parallel view building (on sstables used for off-strategy) may potentially get a mutation source containing staging data from the off-strategy output. That will mislead view builder as it won't be able to detect changes to data in main directory. To fix it, we'll do what we did before. Filter out staging files from compaction, and trigger the operation only after we're done with view building. We're piggybacking on off-strategy timer for still allowing the off-strategy to only run at the end of the node operation, to reduce the amount of compaction rounds on the data introduced by repair / streaming. Fixes #11882. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #11919	2022-11-08 08:53:58 +02:00
Botond Dénes	243fcb96f0	Update tools/python3 submodule * tools/python3 bf6e892...773070e (1): > create-relocatable-package: harden against missing files	2022-11-08 08:43:30 +02:00
Avi Kivity	46690bcb32	build: harden create-relocatable-package.py against changes in libthread-db.so name create-relocatable-package.py collects shared libraries used by executables for packaging. It also adds libthread-db.so to make debugging possible. However, the name it uses has changed in glibc, so packaging fails in Fedora 37. Switch to the version-agnostic name, libthread-db.so. This happens to be a symlink, so resolve it. Closes #11917	2022-11-08 08:41:22 +02:00
Takuya ASADA	acc408c976	scylla_setup: fix incorrect type definition on --online-discard option The --online-discard option is defined as a string parameter since it doesn't specify "action=", but it has a boolean default value (default=True). This breaks "provisioning in a similar environment", since the code expects a boolean value, which would require "action='store_true'", but that is not specified. We should change the type of the option to int, and also specify "choices=[0, 1]" just like --io-setup does. Fixes #11700 Closes #11831	2022-11-08 08:40:44 +02:00
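The fix proposed in the commit above can be sketched with Python's standard `argparse` module. This is an illustration only: the parser name, the help text, and the integer default of `1` (standing in for the original `default=True`) are assumptions, not the actual scylla_setup code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --online-discard option is an int restricted to 0 or 1."""
    parser = argparse.ArgumentParser(prog="scylla_setup_sketch")  # hypothetical name
    parser.add_argument(
        "--online-discard",
        type=int,          # parse the value as an integer, not a string
        choices=[0, 1],    # reject anything other than 0 or 1
        default=1,         # assumed integer equivalent of default=True
        help="enable (1) or disable (0) online discard",
    )
    return parser


# Without type=int, a value passed on the command line would arrive as a
# string, and any non-empty string (including "0") is truthy in Python.
args = build_parser().parse_args(["--online-discard", "0"])
online_discard_enabled = bool(args.online_discard)
```

With `type=int` and `choices=[0, 1]`, the value given on the command line and the default have the same type, so code that tests the option sees a consistent value either way.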
Avi Kivity	3d345609d8	config: disable "mc" format sstables for new data "md" format was introduced in 4.3, in `3530e80ce1`, two years ago. Disable the option to create new sstables with the "mc" format. Closes #11265	2022-11-08 08:36:27 +02:00
Anna Stuchlik	0eaafced9d	doc: fix the links to resolve the warnings	2022-11-07 19:15:21 +01:00
Anna Stuchlik	b57e0cfb7c	doc: add the Enterprise label on the Compaction page (to a subheading and on a list of strategies) to replace the info box	2022-11-07 18:54:35 +01:00
Anna Stuchlik	9f3fcb3fa0	doc: add the Enterprise label to the Encryption at Rest page (the entire page) to replace the info box	2022-11-07 18:48:37 +01:00
Tomasz Grabiec	a9063f9582	Merge 'service/raft: failure detector: ping `raft::server_id`s, not `gms::inet_address`es' from Kamil Braun Whenever a Raft configuration change is performed, `raft::server` calls `raft_rpc::add_server`/`raft_rpc::remove_server`. Our `raft_rpc` implementation has a function, `_on_server_update`, passed in the constructor, which it called in `add_server`/`remove_server`; that function would update the set of endpoints detected by the direct failure detector. `_on_server_update` was passed an IP address and that address was added to / removed from the failure detector set (there's another translation layer between the IP addresses and internal failure detector 'endpoint ID's; but we can ignore it for the purposes of this commit). Therefore: the failure detector was pinging a certain set of IP addresses. These IP addresses were updated during Raft configuration changes. To implement the `is_alive(raft::server_id)` function (required by `raft::failure_detector` interface), we would translate the ID using the Raft address map, which is currently also updated during configuration changes, to an IP address, and check if that IP address is alive according to the direct failure detector (which maintained an `_alive_set` of type `unordered_set<gms::inet_address>`). This all works well but it assumes that servers can be identified using IP addresses - it doesn't play well with the fact that servers may change their IP addresses. The only immutable identifier we have for a server is `raft::server_id`. In the future, Raft configurations will not associate IP addresses with Raft servers; instead we will assume that IP addresses can change at any time, and there will be a different mechanism that eventually updates the Raft address map with the latest IP address for each `raft::server_id`. To prepare us for that future, in this commit we no longer operate in terms of IP addresses in the failure detector, but in terms of `raft::server_id`s. 
Most of the commit is boilerplate, changing `gms::inet_address` to `raft::server_id` and function/variable names. The interesting changes are: - in `is_alive`, we no longer need to translate the `raft::server_id` to an IP address, because now the stored `_alive_set` already contains `raft::server_id`s instead of `gms::inet_address`es. - the `ping` function now takes a `raft::server_id` instead of `gms::inet_address`. To send the ping message, we need to translate this to IP address; we do it by the `raft_address_map` pointer introduced in an earlier commit. Thus, there is still a point where we have to translate between `raft::server_id` and `gms::inet_address`; but observe we now do it at the last possible moment - just before sending the message. If we have no translation, we consider the `ping` to have failed - it's equivalent to a network failure where no route to a given address was found. Closes #11759 * github.com:scylladb/scylladb: direct_failure_detector: get rid of complex `endpoint_id` translations service/raft: ping `raft::server_id`s, not `gms::inet_address`es service/raft: store `raft_address_map` reference in `direct_fd_pinger` gms: gossiper: move `direct_fd_pinger` out to a separate service gms: gossiper: direct_fd_pinger: extract generation number caching to a separate class	2022-11-07 16:42:35 +01:00
Botond Dénes	2b572d94f5	Merge 'doc: improve the documentation landing page ' from Anna Stuchlik This PR introduces the following changes to the documentation landing page: - The " New to ScyllaDB? Start here!" box is added. - The "Connect your application to Scylla" box is removed. - Some wording has been improved. - "Scylla" has been replaced with "ScyllaDB". Closes #11896 * github.com:scylladb/scylladb: Update docs/index.rst doc: replace Scylla with ScyllaDB on the landing page doc: improve the wording on the landing page doc: add the link to the ScyllaDB Basics page to the documentation landing page	2022-11-07 16:18:59 +02:00
Avi Kivity	91f2cd5ac4	test: lib: exception_predicate: use boost::regex instead of std::regex std::regex was observed to overflow stack on aarch64 in debug mode. Use boost::regex until the libstdc++ bug[1] is fixed. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61582 Closes #11888	2022-11-07 14:03:25 +02:00
Kamil Braun	0c7ff0d2cb	docs: a single 5.0 -> 5.1 upgrade guide There were 4 different pages for upgrading Scylla 5.0 to 5.1 (and the same is true for other version pairs, but I digress) for different environments: - "ScyllaDB Image for EC2, GCP, and Azure" - Ubuntu - Debian - RHEL/CentOS The Ubuntu and Debian pages used a common template: ``` .. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p1.rst .. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p2.rst ``` with different variable substitutions. The "Image" page used a similar template, with some extra content in the middle: ``` .. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p1.rst .. include:: /upgrade/_common/upgrade-image-opensource.rst .. include:: /upgrade/_common/upgrade-guide-v5-ubuntu-and-debian-p2.rst ``` The RHEL/CentOS page used a different template: ``` .. include:: /upgrade/_common/upgrade-guide-v4-rpm.rst ``` This was an unmaintainable mess. Most of the content was "the same" for each of these options. The only content that must actually be different is the part with package installation instructions (e.g. calls to `yum` vs `apt-get`). The rest of the content was logically the same - the differences were mistakes, typos, and updates/fixes to the text that were made in some of these docs but not others. In this commit I prepare a single page that covers the upgrade and rollback procedures for each of these options. The section dependent on the system was implemented using Sphinx Tabs. I also fixed and changed some parts: - In the "Gracefully stop the node" section: Ubuntu/Debian/Images pages had: ```rst .. code:: sh sudo service scylla-server stop ``` RHEL/CentOS pages had: ```rst .. code:: sh .. include:: /rst_include/scylla-commands-stop-index.rst ``` the stop-index file contained this: ```rst .. tabs:: .. group-tab:: Supported OS .. code-block:: shell sudo systemctl stop scylla-server .. group-tab:: Docker .. 
code-block:: shell docker exec -it some-scylla supervisorctl stop scylla (without stopping some-scylla container) ``` So the RHEL/CentOS version had two tabs: one for Scylla installed directly on the system, one for Scylla running in Docker - which is interesting, because nothing anywhere else in the upgrade documents mentions Docker. Furthermore, the RHEL/CentOS version used `systemctl` while the ubuntu/debian/images version used `service` to stop/start scylla-server. Both work on modern systems. The Docker option is completely out of place - the rest of the upgrade procedure does not mention Docker. So I decided it doesn't make sense to include it. Docker documentation could be added later if we actually decide to write upgrade documentation when using Docker... Between `systemctl` and `service` I went with `service` as it's a bit higher-level. - Similar change for "Start the node" section, and corresponding stop/start sections in the Rollback procedure. - To reuse text for Ubuntu and Debian, when referencing "ScyllaDB deb repo" in the Debian/Ubuntu tabs, I provide two separate links: to Debian and Ubuntu repos. - the link to rollback procedure in the RPM guide (in 'Download and install the new release' section) pointed to rollback procedure from 3.0 to 3.1 guide... Fixed to point to the current page's rollback procedure. - in the rollback procedure steps summary, the RPM version missed the "Restore system tables" step. - in the rollback procedure, the repository links were pointing to the new versions, while they should point to the old versions. There are some other pre-existing problems I noticed that need fixing: - EC2/GCP/Azure option has no corresponding coverage in the rollback section (Download and install the old release) as it has in the upgrade section. There is no guide for rolling back 3rd party and OS packages, only Scylla. I left a TODO in a comment. 
- the repository links assume certain Debian and Ubuntu versions (Debian 10 and Ubuntu 20), but there are more available options (e.g. Ubuntu 22). Not sure how to deal with this problem. Maybe a separate section with links? Or just a generic link without choice of platform/version? Closes #11891	2022-11-07 14:02:08 +02:00
Avi Kivity	9fa1783892	Merge 'cleanup compaction: flush memtable' from Benny Halevy Flush the memtable before cleaning up the table so not to leave any disowned tokens in the memtable as they might be resurrected if left in the memtable. Fixes #1239 Closes #11902 * github.com:scylladb/scylladb: table: perform_cleanup_compaction: flush memtable table: add perform_cleanup_compaction api: storage_service: add logging for compaction operations et al	2022-11-07 13:18:12 +02:00
Anna Stuchlik	c8455abb71	Update docs/index.rst Co-authored-by: Tzach Livyatan <tzach.livyatan@gmail.com>	2022-11-07 10:25:24 +01:00
AdamStawarz	6bc455ebea	Update tombstones-flush.rst change syntax: nodetool compact <keyspace>.<mytable>; to nodetool compact <keyspace> <mytable>; Closes #11904	2022-11-07 11:19:26 +02:00
Avi Kivity	224a2877b9	build: disable -Og in debug mode to avoid coroutine asan breakage Coroutines and asan don't mix well on aarch64. This was seen in `22f13e7ca3` (" Revert "Merge 'cql3: select_statement: coroutinize indexed_table_select_statement::do_execute_base_query()' from Avi Kivity"") where a routine coroutinization was reverted due to failures on aarch64 debug mode. In clang 15 this is even worse, the existing code starts failing. However, if we disable optimization (-O0 rather than -Og), things begin to work again. In fact we can reinstate the patch reverted above even with clang 12. Fix (or rather workaround) the problem by avoiding -Og on aarch64 debug mode. There's the lingering fear that release mode is miscompiled too, but all the tests pass on clang 15 in release mode so it appears related to asan. Closes #11894	2022-11-07 10:55:13 +02:00
Benny Halevy	eb3a94e2bc	table: perform_cleanup_compaction: flush memtable We don't explicitly cleanup the memtable, while it might hold tokens disowned by the current node. Flush the memtable before performing cleanup compaction to make sure all tokens in the memtable are cleaned up. Note that non-owned ranges are invalidated in the cache in compaction_group::update_main_sstable_list_on_compaction_completion using desc.ranges_for_cache_invalidation. Fixes #1239 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-06 19:41:40 +02:00
Benny Halevy	fc278be6c4	table: add perform_cleanup_compaction Move the integration with compaction_manager from the api layer to the table class so it can also make sure the memtable is cleaned up in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-06 19:41:33 +02:00
Benny Halevy	85523c45c0	api: storage_service: add logging for compaction operations et al Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-11-06 19:41:31 +02:00
Petr Gusev	44f48bea0f	raft: test_remove_node_with_concurrent_ddl The test runs remove_node command with background ddl workload. It was written in an attempt to reproduce scylladb#11228 but seems to have value on its own. The if_exists parameter has been added to the add_table and drop_table functions, since the driver could retry the request sent to a removed node, but that request might have already been completed. Function wait_for_host_known waits until the information about the node reaches the destination node. Since we add new nodes at each iteration in main, this can take some time. A number of abort-related options were added to SCYLLA_CMDLINE_OPTIONS as it simplifies nailing down problems. Closes #11734	2022-11-04 17:16:35 +01:00
David Garcia	26bc53771c	docs: automatic previews configuration Closes #11591	2022-11-04 15:44:22 +02:00
Kamil Braun	e086521c1a	direct_failure_detector: get rid of complex `endpoint_id` translations The direct failure detector operates on abstract `endpoint_id`s for pinging. The `pinger` interface is responsible for translating these IDs to 'real' addresses. Earlier we used two types of addresses: IP addresses in 'production' code (`gms::gossiper::direct_fd_pinger`) and `raft::server_id`s in test code (in `randomized_nemesis_test`). For each of these use cases we would maintain mappings between `endpoint_id`s and the address type. In recent commits we switched the 'production' code to also operate on Raft server IDs, which are UUIDs underneath. In this commit we switch `endpoint_id`s from `unsigned` type to `utils::UUID`. Because each use case operates on Raft server IDs, we can perform a simple translation: `raft_id.uuid()` to get an `endpoint_id` from a Raft ID, `raft::server_id{ep_id}` to obtain a Raft ID from an `endpoint_id`. We no longer have to maintain complex sharded data structures to store the mappings.	2022-11-04 09:38:08 +01:00
Kamil Braun	bdeef77f20	service/raft: ping `raft::server_id`s, not `gms::inet_address`es Whenever a Raft configuration change is performed, `raft::server` calls `raft_rpc::add_server`/`raft_rpc::remove_server`. Our `raft_rpc` implementation has a function, `_on_server_update`, passed in the constructor, which it called in `add_server`/`remove_server`; that function would update the set of endpoints detected by the direct failure detector. `_on_server_update` was passed an IP address and that address was added to / removed from the failure detector set (there's another translation layer between the IP addresses and internal failure detector 'endpoint ID's; but we can ignore it for the purposes of this commit). Therefore: the failure detector was pinging a certain set of IP addresses. These IP addresses were updated during Raft configuration changes. To implement the `is_alive(raft::server_id)` function (required by `raft::failure_detector` interface), we would translate the ID using the Raft address map, which is currently also updated during configuration changes, to an IP address, and check if that IP address is alive according to the direct failure detector (which maintained an `_alive_set` of type `unordered_set<gms::inet_address>`). This all works well but it assumes that servers can be identified using IP addresses - it doesn't play well with the fact that servers may change their IP addresses. The only immutable identifier we have for a server is `raft::server_id`. In the future, Raft configurations will not associate IP addresses with Raft servers; instead we will assume that IP addresses can change at any time, and there will be a different mechanism that eventually updates the Raft address map with the latest IP address for each `raft::server_id`. To prepare us for that future, in this commit we no longer operate in terms of IP addresses in the failure detector, but in terms of `raft::server_id`s. 
Most of the commit is boilerplate, changing `gms::inet_address` to `raft::server_id` and function/variable names. The interesting changes are: - in `is_alive`, we no longer need to translate the `raft::server_id` to an IP address, because now the stored `_alive_set` already contains `raft::server_id`s instead of `gms::inet_address`es. - the `ping` function now takes a `raft::server_id` instead of `gms::inet_address`. To send the ping message, we need to translate this to IP address; we do it by the `raft_address_map` pointer introduced in an earlier commit. Thus, there is still a point where we have to translate between `raft::server_id` and `gms::inet_address`; but observe we now do it at the last possible moment - just before sending the message. If we have no translation, we consider the `ping` to have failed - it's equivalent to a network failure where no route to a given address was found.	2022-11-04 09:38:08 +01:00
Kamil Braun	ac70a05c7e	service/raft: store `raft_address_map` reference in `direct_fd_pinger` The pinger will use the map to translate `raft::server_id`s to `gms::inet_address`es when pinging.	2022-11-04 09:38:08 +01:00
Kamil Braun	2c20f2ab9d	gms: gossiper: move `direct_fd_pinger` out to a separate service In later commit `direct_fd_pinger` will operate in terms of `raft::server_id`s. Decouple it from `gossiper` since we don't want to entangle `gossiper` with Raft-specific stuff.	2022-11-04 09:38:08 +01:00
Kamil Braun	e9a4263e14	gms: gossiper: direct_fd_pinger: extract generation number caching to a separate class `gms::gossiper::direct_fd_pinger` serves multiple purposes: one of them is to maintain a mapping between `gms::inet_address`es and `direct_failure_detector::pinger::endpoint_id`s, another is to cache the last known gossiper's generation number to use it for sending gossip echo messages. The latter is the only gossiper-specific thing in this class. We want to move `direct_fd_pinger` outside `gossiper`. To do that, split the gossiper-specific thing -- the generation number management -- to a smaller class, `echo_pinger`. `echo_pinger` is a top-level class (not a nested one like `direct_fd_pinger` was) so we can forward-declare it and pass references to it without including gms/gossiper.hh header.	2022-11-04 09:38:08 +01:00
Avi Kivity	768d77d31b	Update seastar submodule * seastar f32ed00954...e0dabb361f (12): > sstring: define formatter > file: Dont violate API layering > Add compile_commands.json to gitignore > Merge 'Add an allocation failure metric' from Travis Downs > Use const test objects > Ragel chunk parser: compilation err, unused var > build: do not expose Valgrind in SeastarTargets.cmake > defer: mark deferred_* with [[nodiscard]] > Log selected reactor backend during startup > http: mark str with [[maybe_unused]] > Merge 'reactor: open fd without O_NONBLOCK when using io_uring backend' from Kefu Chai > reactor: add accept and connect to io_uring backend Closes #11895	2022-11-04 09:27:56 +04:00
Anna Stuchlik	fb01565a15	doc: replace Scylla with ScyllaDB on the landing page	2022-11-03 17:42:49 +01:00
Anna Stuchlik	7410ab0132	doc: improve the wording on the landing page	2022-11-03 17:38:14 +01:00
Anna Stuchlik	ab5e48261b	doc: add the link to the ScyllaDB Basics page to the documentation landing page	2022-11-03 17:31:03 +01:00
Pavel Emelyanov	efbfcdb97e	Merge 'Replicate `raft_address_map` non-expiring entries to other shards' from Kamil Braun Replicating `raft_address_map` entries is needed for the following use cases: - the direct failure detector - currently it assumes a static mapping of `raft::server_id`s to `gms::inet_address`es, which is obtained on Raft group 0 configuration changes. To handle dynamic mappings we need to modify the failure detector so it pings `raft::server_id`s and obtains the `gms::inet_address` before sending the message from `raft_address_map`. The failure detector is sharded, so we need the mappings to be available on all shards. - in the future we'll have multiple Raft groups running on different shards. To send messages they'll need `raft_address_map`. Initially I tried to replicate all entries - expiring and non-expiring. The implementation turned out to be very complex - we need to handle dropping expired entries and refreshing expiring entries' timestamps across shards, and doing this correctly while accounting for possible races is quite problematic. Eventually I arrived at the conclusion that replicating only non-expiring entries, and furthermore allowing non-expiring entries to be added only on shard 0, is good enough for our use cases: - The direct failure detector is pinging group 0 members only; group 0 members correspond exactly to the non-expiring entries. - Group 0 configuration changes are handled on shard 0, so non-expiring entries are added/removed on shard 0. - When we have multiple Raft groups, we can reuse a single Raft server ID for all Raft servers running on a single node belonging to different groups; they are 'namespaced' by the group IDs. Furthermore, every node has a server that belongs to group 0. 
Thus for every Raft server in every group, it has a corresponding server in group 0 with the same ID, which has a non-expiring entry in `raft_address_map`, which is replicated to all shards; so every group will be able to deliver its messages. With these assumptions the implementation is short and simple. We can always complicate it in the future if we find that the assumptions are too strong. Closes #11791 * github.com:scylladb/scylladb: test/raft: raft_address_map_test: add replication test service/raft: raft_address_map: replicate non-expiring entries to other shards service/raft: raft_address_map: assert when entry is missing in drop_expired_entries service/raft: turn raft_address_map into a service	2022-11-03 18:34:42 +03:00
Avi Kivity	ca2010144e	test: loading_cache_test: fix use-after-free in test_loading_cache_remove_leaves_no_old_entries_behind We capture `key` by reference, but it is in another continuation. Capture it by value, and avoid the default capture specification. Found by clang 15 + asan + aarch64. Closes #11884	2022-11-03 17:23:40 +02:00
Avi Kivity	0c3967cf5e	Merge 'scylla-gdb.py: improve scylla-fiber' from Botond Dénes The main theme of this patchset is improving `scylla-fiber`, with some assorted unrelated improvements tagging along. In lieu of explicit support for mapping up continuation chains in memory from seastar (there is one but it uses function calls), scylla fiber uses a quite crude method to do this: it scans task objects for outbound references to other task objects to find waiter tasks and scans inbound references from other tasks to find waited-on tasks. This works well for most objects, but there are some problematic ones: * `seastar::thread_context`: the waited-on task (`seastar::(anonymous namespace)::thread_wake_task`) is allocated on the thread's stack which is not in the object itself. Scylla fiber now scans the stack bottom-up to find this task. * `seastar::smp_message_queue::async_work_item`: the waited-on task lives on another shard. Scylla fiber now digs out the remote shard from the work item and continues the search on the remote shard. * `seastar::when_all_state`: the waited-on task is a member in the same object tripping loop detection and terminating the search. Scylla fiber now uses the `_continuation` member explicitly to look for the next links. Other minor improvements were also done, like including the shard of the task in the printout. Example demonstrating all the new additions: ``` (gdb) scylla fiber 0x000060002d650200 Stopping because loop is detected: task 0x000061c00385fb60 was seen before. 
[shard 28] #-13 (task) 0x000061c00385fba0 0x00000000003b5b00 vtable for seastar::internal::when_all_state_component<seastar::future<void> > + 16 [shard 28] #-12 (task) 0x000061c00385fb60 0x0000000000417010 vtable for seastar::internal::when_all_state<seastar::internal::identity_futures_tuple<seastar::future<void>, seastar::future<void> >, seastar::future<void>, seastar::future<void> > + 16 [shard 28] #-11 (task) 0x000061c009f16420 0x0000000000419830 _ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureISt5tupleIJNS4_IvEES6_EEE14discard_resultEvEUlDpOT_E_ZNS8_14then_impl_nrvoISC_S6_EET0_OT_EUlOS3_RSC_ONS_12future_stateIS7_EEE_S7_EE + 16 [shard 28] #-10 (task) 0x000061c0098e9e00 0x0000000000447440 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}>::run_and_dispose()::{lambda(auto:1)#1}, seastar::future<void>::then_wrapped_nrvo<void, seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}> >(seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #-9 (task) 0x000060000858dcd0 0x0000000000449d68 vtable for seastar::smp_message_queue::async_work_item<seastar::sharded<cql_transport::cql_server>::stop()::{lambda(unsigned int)#1}::operator()(unsigned int)::{lambda()#1}> + 16 [shard 0] #-8 (task) 0x0000600050c39f60 0x00000000007abe98 vtable for 
seastar::parallel_for_each_state + 16 [shard 0] #-7 (task) 0x000060000a59c1c0 0x0000000000449f60 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::sharded<cql_transport::cql_server>::stop()::{lambda(seastar::future<void>)#2}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda(seastar::future<void>)#2}>({lambda(seastar::future<void>)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(seastar::future<void>)#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #-6 (task) 0x000060000a59c400 0x0000000000449ea0 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, cql_transport::controller::do_stop_server()::{lambda(std::unique_ptr<seastar::sharded<cql_transport::cql_server>, std::default_delete<seastar::sharded<cql_transport::cql_server> > >&)#1}::operator()(std::unique_ptr<seastar::sharded<cql_transport::cql_server>, std::default_delete<seastar::sharded<cql_transport::cql_server> > >&) const::{lambda()#1}::operator()() const::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, {lambda()#1}>({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #-5 (task) 0x0000600009d86cc0 0x0000000000449c00 vtable for seastar::internal::do_with_state<std::tuple<std::unique_ptr<seastar::sharded<cql_transport::cql_server>, std::default_delete<seastar::sharded<cql_transport::cql_server> > > >, seastar::future<void> > + 16 [shard 0] #-4 (task) 0x00006000019ffe20 0x00000000007ab368 vtable for seastar::(anonymous namespace)::thread_wake_task + 16 [shard 0] #-3 (task) 0x00006000085ad080 0x0000000000809e18 vtable for seastar::thread_context + 16 [shard 0] #-2 (task) 0x0000600009c04100 0x00000000006067f8 
_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_5asyncIZZN7service15storage_service5drainEvENKUlRS6_E_clES7_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayIT_E4typeEDpNSC_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSD_DpOSG_EUlvE0_ZNS_6futureIvE14then_impl_nrvoIST_SV_EET0_SQ_EUlOS3_RST_ONS_12future_stateINS1_9monostateEEEE_vEE + 16 [shard 0] #-1 (task) 0x000060000a59c080 0x0000000000606ae8 _ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_5asyncIZZN7service15storage_service5drainEvENKUlRS9_E_clESA_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayIT_E4typeEDpNSF_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSG_DpOSJ_EUlvE1_Lb0EEEZNS5_17then_wrapped_nrvoIS5_SX_EENSD_ISG_E4typeEOT0_EUlOS3_RSX_ONS_12future_stateINS1_9monostateEEEE_vEE + 16 [shard 0] #0 (task) 0x000060002d650200 0x0000000000606378 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<service::storage_service::run_with_api_lock<service::storage_service::drain()::{lambda(service::storage_service&)#1}>(seastar::basic_sstring<char, unsigned int, 15u, true>, service::storage_service::drain()::{lambda(service::storage_service&)#1}&&)::{lambda(service::storage_service&)#1}::operator()(service::storage_service&)::{lambda()#1}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda(service::storage_service&)#1}>({lambda(service::storage_service&)#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(service::storage_service&)#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #1 (task) 0x000060000bc40540 0x0000000000606d48 
_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_3smp9submit_toIZNS_7shardedIN7service15storage_serviceEE9invoke_onIZNSB_17run_with_api_lockIZNSB_5drainEvEUlRSB_E_EEDaNS_13basic_sstringIcjLj15ELb1EEEOT_EUlSF_E_JES5_EET1_jNS_21smp_submit_to_optionsESK_DpOT0_EUlvE_EENS_8futurizeINSt9result_ofIFSJ_vEE4typeEE4typeEjSN_SK_EUlvE_Lb0EEEZNS5_17then_wrapped_nrvoIS5_S10_EENSS_ISJ_E4typeEOT0_EUlOS3_RS10_ONS_12future_stateINS1_9monostateEEEE_vEE + 16 [shard 0] #2 (task) 0x000060000332afc0 0x00000000006cb1c8 vtable for seastar::continuation<seastar::internal::promise_base_with_type<seastar::json::json_return_type>, api::set_storage_service(api::http_context&, seastar::httpd::routes&)::{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}::operator()(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >) const::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}, {lambda()#1}<seastar::json::json_return_type> >({lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}&&)::{lambda(seastar::internal::promise_base_with_type<seastar::json::json_return_type>&&, {lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)#38}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #3 (task) 0x000060000a1af700 0x0000000000812208 vtable for seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, seastar::httpd::function_handler::function_handler(std::function<seastar::future<seastar::json::json_return_type> (std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)> const&)::{lambda(std::unique_ptr<seastar::httpd::request, 
std::default_delete<seastar::httpd::request> >, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}::operator()(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >) const::{lambda(seastar::json::json_return_type&&)#1}, seastar::future<seastar::json::json_return_type>::then_impl_nrvo<seastar::json::json_return_type&&, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > >(seastar::json::json_return_type&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&, seastar::json::json_return_type&, seastar::future_state<seastar::json::json_return_type>&&)#1}, seastar::json::json_return_type> + 16 [shard 0] #4 (task) 0x0000600009d86440 0x0000000000812228 vtable for seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, seastar::httpd::function_handler::handle(seastar::basic_sstring<char, unsigned int, 15u, true> const&, std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)::{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, seastar::future>({lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&, 
{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&, seastar::future_state<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&)#1}, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > + 16 [shard 0] #5 (task) 0x0000600009dba0c0 0x0000000000812f48 vtable for seastar::continuation<seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::handle_exception<std::function<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > (std::__exception_ptr::exception_ptr)>&>(std::function<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > (std::__exception_ptr::exception_ptr)>&)::{lambda(auto:1&&)#1}, seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::then_wrapped_nrvo<seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >, {lambda(auto:1&&)#1}>({lambda(auto:1&&)#1}&&)::{lambda(seastar::internal::promise_base_with_type<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&, {lambda(auto:1&&)#1}&, seastar::future_state<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&)#1}, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > + 16 [shard 0] #6 (task) 0x0000600026783ae0 0x00000000008118b0 vtable for seastar::continuation<seastar::internal::promise_base_with_type<bool>, seastar::httpd::connection::generate_reply(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)::{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, 
seastar::future<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}, seastar::httpd::connection::generate_reply(std::unique_ptr<seastar::httpd::request, std::default_delete<seastar::httpd::request> >)::{lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}<bool> >({lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&&)::{lambda(seastar::internal::promise_base_with_type<bool>&&, {lambda(std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> >)#1}&, seastar::future_state<std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > >&&)#1}, std::unique_ptr<seastar::httpd::reply, std::default_delete<seastar::httpd::reply> > > + 16 [shard 0] #7 (task) 0x000060000a4089c0 0x0000000000811790 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::httpd::connection::read_one()::{lambda()#1}::operator()()::{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}::operator()(std::default_delete<std::unique_ptr>) const::{lambda(std::default_delete<std::unique_ptr>)#1}::operator()(std::default_delete<std::unique_ptr>) const::{lambda(bool)#2}, seastar::future<bool>::then_impl_nrvo<{lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}, {lambda(std::default_delete<std::unique_ptr>)#1}<void> >({lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(std::unique_ptr<seastar::httpd::request, std::default_delete<std::unique_ptr> >)#2}&, seastar::future_state<bool>&&)#1}, bool> + 16 [shard 0] #8 (task) 0x000060000a5b16e0 0x0000000000811430 vtable for 
seastar::internal::do_until_state<seastar::httpd::connection::read()::{lambda()#1}, seastar::httpd::connection::read()::{lambda()#2}> + 16 [shard 0] #9 (task) 0x000060000aec1080 0x00000000008116d0 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::httpd::connection::read()::{lambda(seastar::future<void>)#3}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda(seastar::future<void>)#3}>({lambda(seastar::future<void>)#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(seastar::future<void>)#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 [shard 0] #10 (task) 0x000060000b7d2900 0x0000000000811950 vtable for seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::httpd::connection::read()::{lambda()#4}, true>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, seastar::httpd::connection::read()::{lambda()#4}>(seastar::httpd::connection::read()::{lambda()#4}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::httpd::connection::read()::{lambda()#4}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void> + 16 Found no further pointers to task objects. If you think there should be more, run `scylla fiber 0x000060002d650200 --verbose` to learn more. Note that continuation across user-created seastar::promise<> objects are not detected by scylla-fiber. 
``` Closes #11822 * github.com:scylladb/scylladb: scylla-gdb.py: collection_element: add support for boost::intrusive::list scylla-gdb.py: optional_printer: eliminate infinite loop scylla-gdb.py: scylla-fiber: add note about user-instantiated promise objects scylla-gdb.py: scylla-fiber: reject self-references when probing pointers scylla-gdb.py: scylla-fiber: add starting task to known tasks scylla-gdb.py: scylla-fiber: add support for walking over when_all scylla-gdb.py: add when_all_state to task type whitelist scylla-gdb.py: scylla-fiber: also print shard of tasks scylla-gdb.py: scylla-fiber: unify task printing scylla-gdb.py: scylla fiber: add support for walking over shards scylla-gdb.py: scylla fiber: add support for walking over seastar threads scylla-gdb.py: scylla-ptr: keep current thread context scylla-gdb.py: improve scylla column_families scylla-gdb.py: scylla_sstables.filename(): fix generation formatting scylla-gdb.py: improve schema_ptr scylla-gdb.py: scylla memory: restore compatibility with <= 5.1	2022-11-03 13:52:31 +02:00
Kamil Braun	2049962e11	Fix version numbers in upgrade page title Closes #11878	2022-11-03 10:06:25 +02:00
Takuya ASADA	45789004a3	install-dependencies.sh: update node_exporter to 1.4.0 To fix CVE-2022-24675, we need a binary compiled with golang >= 1.18.1. The only released version compiled with golang >= 1.18.1 is node_exporter 1.4.0, so we need to update to it. See scylladb/scylla-enterprise#2317 Closes #11400 [avi: regenerated frozen toolchain] Closes #11879	2022-11-03 10:15:22 +04:00
Yaron Kaikov	20110bdab4	configure.py: remove un-used tar files creation Starting from https://github.com/scylladb/scylla-pkg/pull/3035 we removed all old tar.gz prefix from uploading to S3 or being used by downstream jobs. Hence, there is no point in building those tar.gz files anymore Closes #11865	2022-11-02 17:44:09 +02:00
Anna Stuchlik	d1f7cc99bc	doc: fix the external links to the ScyllaDB University lesson about TTL Closes #11876	2022-11-02 15:05:43 +02:00
Nadav Har'El	59fa8fe903	Merge 'doc: add the information about AArch64 support to Requirements' from Anna Stuchlik Fix https://github.com/scylladb/scylla-doc-issues/issues/864 This PR: - updates the introduction to add information about AArch64 and rewrites the content. - replaces "Scylla" with "ScyllaDB". Closes #11778 * github.com:scylladb/scylladb: Update docs/getting-started/system-requirements.rst doc: fix the link to the OS Support page doc: replace Scylla with ScyllaDB doc: update the info about supported architecture and rewrite the introduction	2022-11-02 11:18:20 +02:00
Anna Stuchlik	ea799ad8fd	Update docs/getting-started/system-requirements.rst Co-authored-by: Tzach Livyatan <tzach.livyatan@gmail.com>	2022-11-02 09:56:56 +01:00
guy9	097a65df9f	adding top banner to the Docs website with a link to the ScyllaDB University fall LIVE event Closes #11873	2022-11-02 10:20:40 +02:00
Nadav Har'El	b9d88a3601	cql/pytest: add reproducer for timestamp column validation issue This patch adds a reproducing test for issue #11588, which is still open so the test is expected to fail on Scylla ("xfail"), and passes on Cassandra. The test shows that Scylla allows an out-of-range value to be written to a timestamp column, but then it can't be read back. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #11864	2022-11-01 08:11:01 +02:00
Botond Dénes	dc46bfa783	Merge 'Prepare repair for task manager integration' from Aleksandra Martyniuk The PR prepares repair for task manager integration: - Creates repair_module - Keeps repair_module in repair_service - Moves tracker methods to repair_module - Changes UUID to task_id in repair module Closes #11851 * github.com:scylladb/scylladb: repair: check shutdown with abort source in repair module repair: use generic module gate for repair module operations repair: move tracker to repair module repair: move next_repair_command to repair_module repair: generate repair id in repair module repair: keep shard number in repair_uniq_id repair: change UUID to task_id repair: add task_manager::module to repair_service repair: create repair module and task	2022-11-01 08:05:14 +02:00
Aleksandra Martyniuk	f2fe586f03	repair: check shutdown with abort source in repair module In repair module the shutdown can be checked using abort_source. Thus, we can get rid of shutdown flag.	2022-10-31 10:57:29 +01:00
Aleksandra Martyniuk	2d878cc9b5	repair: use generic module gate for repair module operations Repair module uses a gate to prevent starting new tasks on shutdown. Generic module's gate serves the same purpose, thus we can use it also in repair specific context.	2022-10-31 10:56:36 +01:00
Aleksandra Martyniuk	4aae7e9026	repair: move tracker to repair module Since both tracker and repair_module serve a similar purpose, it is confusing where to look for the methods connected to them. Thus, to make it more transparent, the tracker class is deleted and all its attributes and methods are moved to repair_module.	2022-10-31 10:55:36 +01:00
Aleksandra Martyniuk	a5c05dcb60	repair: move next_repair_command to repair_module The number of the repair operation was counted both with next_repair_command from tracker and with the sequence number from task_manager::module. To get rid of the redundancy, next_repair_command was deleted and all methods using its value were moved to repair_module.	2022-10-31 10:54:39 +01:00
Aleksandra Martyniuk	c81260fb8b	repair: generate repair id in repair module repair_uniq_id for repair task can be generated in repair module and accessed from the task.	2022-10-31 10:54:24 +01:00
Aleksandra Martyniuk	6432a26ccf	repair: keep shard number in repair_uniq_id Execution shard is one of the traits specific to repair tasks. Child task should freely access shard id of its parent. Thus, the shard number is kept in a repair_uniq_id struct.	2022-10-31 10:41:17 +01:00
guy9	276ec377c0	removed broken roadmap link Closes #11854	2022-10-31 11:33:03 +02:00
Aleksandra Martyniuk	e2c7c1495d	repair: change UUID to task_id Change type of repair id from utils::UUID to task_id to distinguish them from ids of other entities.	2022-10-31 10:07:08 +01:00
Aleksandra Martyniuk	dc80af33bc	repair: add task_manager::module to repair_service repair_service keeps a shared pointer to repair_module.	2022-10-31 10:04:50 +01:00
Aleksandra Martyniuk	576277384a	repair: create repair module and task Create repair_task_impl and repair_module inheriting from respectively task manager task_impl and module to integrate repair operations with task manager.	2022-10-31 10:04:48 +01:00
Takuya ASADA	159bc7c7ea	install-dependencies.sh: use binary distributions of PIP package We currently avoid compiling C code in "pip3 install scylla-driver", but we actually provide portable binary distributions of the package, so we should use them via "pip3 install --only-binary=:all: scylla-driver". The binary distribution contains the dependency libraries, so we won't have a problem loading it on relocatable python3. Closes #11852	2022-10-31 10:38:36 +02:00
Kamil Braun	db6cc035ed	test/raft: raft_address_map_test: add replication test	2022-10-31 09:17:12 +01:00
Kamil Braun	7d84007fd5	service/raft: raft_address_map: replicate non-expiring entries to other shards Replicating `raft_address_map` entries is needed for the following use cases: - the direct failure detector - currently it assumes a static mapping of `raft::server_id`s to `gms::inet_address`es, which is obtained on Raft group 0 configuration changes. To handle dynamic mappings we need to modify the failure detector so it pings `raft::server_id`s and obtains the `gms::inet_address` before sending the message from `raft_address_map`. The failure detector is sharded, so we need the mappings to be available on all shards. - in the future we'll have multiple Raft groups running on different shards. To send messages they'll need `raft_address_map`. Initially I tried to replicate all entries - expiring and non-expiring. The implementation turned out to be very complex - we need to handle dropping expired entries and refreshing expiring entries' timestamps across shards, and doing this correctly while accounting for possible races is quite problematic. Eventually I arrived at the conclusion that replicating only non-expiring entries, and furthermore allowing non-expiring entries to be added only on shard 0, is good enough for our use cases: - The direct failure detector is pinging group 0 members only; group 0 members correspond exactly to the non-expiring entries. - Group 0 configuration changes are handled on shard 0, so non-expiring entries are added/removed on shard 0. - When we have multiple Raft groups, we can reuse a single Raft server ID for all Raft servers running on a single node belonging to different groups; they are 'namespaced' by the group IDs. Furthermore, every node has a server that belongs to group 0. Thus for every Raft server in every group, it has a corresponding server in group 0 with the same ID, which has a non-expiring entry in `raft_address_map`, which is replicated to all shards; so every group will be able to deliver its messages. 
With these assumptions the implementation is short and simple. We can always complicate it in the future if we find that the assumptions are too strong.	2022-10-31 09:17:12 +01:00
Kamil Braun	acacbad465	service/raft: raft_address_map: assert when entry is missing in drop_expired_entries	2022-10-31 09:17:12 +01:00
Kamil Braun	159bb32309	service/raft: turn raft_address_map into a service	2022-10-31 09:17:10 +01:00
Botond Dénes	139fbb466e	Merge 'Task manager extension' from Aleksandra Martyniuk The PR adds changes to task manager that allow more convenient integration with modules. Introduced changes: - adds internal flag in task::impl that allows user to filter too specific tasks - renames `parent_data` to more appropriate name `task_info` - creates `tasks/types.hh` which allows using some types connected with task manager without the necessity to include whole task manager - adds more flexible version of `make_task` method Closes #11821 * github.com:scylladb/scylladb: tasks: add alternative make_task method tasks: rename parent_data to task_info and move it tasks: move task_id to tasks/types.hh tasks: add internal flag for task_manager::task::impl	2022-10-31 09:57:10 +02:00
Botond Dénes	2c021affd1	Merge 'storage_service, repair: use per-shard abort_source' from Benny Halevy Prevent copying shared_ptr across shards in do_sync_data_using_repair by allocating a shared_ptr<abort_source> per shard in node_ops_meta_data and respectively in node_ops_info. Fixes #11826 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11827 * github.com:scylladb/scylladb: repair: use sharded abort_source to abort repair_info repair: node_ops_info: add start and stop methods storage_service: node_ops_abort_thread: abort all node ops on shutdown storage_service: node_ops_abort_thread: co_return only after printing log message storage_service: node_ops_meta_data: add start and stop methods repair: node_ops_info: prevent accidental copy	2022-10-31 09:43:34 +02:00
Botond Dénes	63a90cfb6c	scylla-gdb.py: collection_element: add support for boost::intrusive::list	2022-10-31 08:18:20 +02:00
Botond Dénes	2fa1864174	scylla-gdb.py: optional_printer: eliminate infinite loop Currently, to_string() recursively calls itself for engaged optionals. Eliminate it. Also, use the std_optional wrapper instead of accessing std::optional internals directly.	2022-10-31 08:18:20 +02:00
Botond Dénes	77b2555a04	scylla-gdb.py: scylla-fiber: add note about user-instantiated promise objects Scylla fiber uses a crude method of scanning inbound and outbound references to/from other task objects of recognized type. This method cannot detect user-instantiated promise<> objects. Add a note about this to the printout, so users are aware of this.	2022-10-31 08:18:20 +02:00
Botond Dénes	2276565a2e	scylla-gdb.py: scylla-fiber: reject self-references when probing pointers A self-reference is never the pointer we are looking for when looking for other tasks referencing us. Reject such references when scanning outright.	2022-10-31 08:18:20 +02:00
Botond Dénes	f4365dd7f5	scylla-gdb.py: scylla-fiber: add starting task to known tasks We collect already seen tasks in a set to be able to detect perceived task loops and stop when one is seen. Initialize this set with the starting task, so if it forms a loop, we won't repeat it in the trace before cutting the loop.	2022-10-31 08:18:20 +02:00
Botond Dénes	48bbf2e467	scylla-gdb.py: scylla-fiber: add support for walking over when_all	2022-10-31 08:18:20 +02:00
Botond Dénes	cb8f02e24b	scylla-gdb.py: add when_all_state to task type whitelist	2022-10-31 08:18:20 +02:00
Botond Dénes	62621abc44	scylla-gdb.py: scylla-fiber: also print shard of tasks Now that scylla-fiber can cross shards, it is important to display the shard each task in the chain lives on.	2022-10-31 08:18:19 +02:00
Botond Dénes	c21c80f711	scylla-gdb.py: scylla-fiber: unify task printing Currently there are two loops and a separate line printing the starting task, all duplicating the formatting logic. Define a method for it and use it in all 3 places instead.	2022-10-31 08:18:19 +02:00
Botond Dénes	c103280bfd	scylla-gdb.py: scylla fiber: add support for walking over shards Shard boundaries can be crossed in one direction currently: when looking for waiters on a task, but not in the other direction (looking for waited-on tasks). This patch fixes that.	2022-10-31 08:18:19 +02:00
Botond Dénes	437f888ba0	scylla-gdb.py: scylla fiber: add support for walking over seastar threads Currently seastar threads end any attempt to follow waited-on futures. Seastar threads need special handling because they allocate the wake-up task on their stack. This patch adds this special handling.	2022-10-31 08:18:19 +02:00
Botond Dénes	fcc63965ed	scylla-gdb.py: scylla-ptr: keep current thread context scylla_ptr.analyze() switches to the thread the analyzed object lives on, but forgets to switch back. This was very annoying as any commands using it (which is a bunch of them) were prone to suddenly and unexpectedly switching threads. This patch makes sure that the original thread context is switched back to after analyzing the pointer.	2022-10-31 08:18:19 +02:00
Botond Dénes	91516c1d68	scylla-gdb.py: improve scylla column_families Rename to scylla tables. Less typing and more up-to-date. By default it now only lists tables from local shard. Added flag -a which brings back old behaviour (lists on all shards). Added -u (only list user tables) and -k (list tables of provided keyspace only) filtering options.	2022-10-31 08:18:19 +02:00
Botond Dénes	1d3d613b76	scylla-gdb.py: scylla_sstables.filename(): fix generation formatting Generation was recently converted from an integer to an object. Update the filename formatting, while keeping backward compatibility.	2022-10-31 08:18:19 +02:00
Botond Dénes	c869f54742	scylla-gdb.py: improve schema_ptr Add __getitem__(), so members can be accessed. Strip " from ks_name and cf_name. Add is_system().	2022-10-31 08:18:19 +02:00
Botond Dénes	66832af233	scylla-gdb.py: scylla memory: restore compatibility with <= 5.1 Recent reworks around dirty memory manager broke backward compatibility of the scylla memory command (and possibly others). This patch restores it.	2022-10-31 08:18:19 +02:00
Tenghuan He	e0948ba199	Add directory change instruction Add directory change instruction while building scylla Closes #11717	2022-10-30 23:53:02 +02:00
Pavel Emelyanov	477e0c967a	scylla-gdb: Evaluate LSA object sizes dynamically The lsa-segment command tries to walk LSA segment objects by decoding their descriptors and (!) object sizes as well. Some objects in LSA have dynamic sizes, i.e. those depending on the object contents. The script tries to drill down the object internals to get this size, but bad news is that nowadays there are many dynamic objects that are not covered. Once stepped upon unsupported object, scylla-gdb likely stops because the "next" descriptor happens to be in the middle of the object and its parsing throws. This patch fixes this by taking advantage of the virtual size() call of the migrate_fn_type all LSA objects are linked with (indirectly). It gets the migrator object, the LSA object itself and calls ((migrate_fn_type)<migrator_ptr>)->size((const void)<object_ptr>) with gdb. The evaluated value is the live dynamic size of the object. fixes: #11792 refs: #2455 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11847	2022-10-28 14:11:30 +03:00
Botond Dénes	74c9aa3a3f	Merge 'removenode: allow specifying nodes to ignore using host_id' from Benny Halevy Currently, when specifying nodes to ignore for replace or removenode, we support specifying them only using their ip address. As discussed in https://github.com/scylladb/scylladb/issues/11839 for removenode, we intentionally require the host uuid for specifying the node to remove, so the nodes to ignore (that are also done, otherwise we need not ignore them), should be consistent with that and be specified using their host_id. The series extends the apis and allows either the nodes ip address or their host_id to be specified, for backward compatibility. We should deprecate the ip address method over time and convert the tests and management software to use the ignored nodes' host_id:s instead. Closes #11841 * github.com:scylladb/scylladb: api: doc: remove_node: improve summary api, service: storage_service: removenode: allow passing ignore_nodes as uuid:s storage_service: get_ignore_dead_nodes_for_replace: use tm.parse_host_id_and_endpoint locator: token_metadata: add parse_host_id_and_endpoint api: storage_service: remove_node: validate host_id	2022-10-28 13:35:04 +03:00
Benny Halevy	335a8cc362	api: doc: remove_node: improve summary The current summary of the operation is obscure. It refers to a token in the ring and the endpoint associated with it, while the operation uses a host_id to identify a whole node. Instead, clarify the summary to refer to a node in the cluster, consistent with the description for the host_id parameter. Also, describe the effect the call has on the data the removed node logically owned. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-28 07:52:37 +03:00
Benny Halevy	9ef2631ec2	api, service: storage_service: removenode: allow passing ignore_nodes as uuid:s Currently the api is inconsistent: requiring a uuid for the host_id of the node to be removed, while the ignored nodes list is given as comma-separated ip addresses. Instead, support identifying the ignored_nodes either by their host_id (uuid) or ip address. Also, require all ignore_nodes to be of the same kind: either UUIDs or ip addresses, as a mix of the 2 is likely indicating a user error. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-28 07:49:03 +03:00
Benny Halevy	40cd685371	storage_service: get_ignore_dead_nodes_for_replace: use tm.parse_host_id_and_endpoint Allow specifying the dead node to ignore either as host_id or ip address. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-28 07:38:13 +03:00
Benny Halevy	b74807cb8a	locator: token_metadata: add parse_host_id_and_endpoint To be used for specifying nodes either by their host_id or ip address and using the token_metadata to resolve the mapping. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-28 07:38:13 +03:00
Benny Halevy	340a5a0c94	api: storage_service: remove_node: validate host_id The node to be removed must be identified by its host_id. Validate that at the api layer and pass the parsed host_id down to storage_service::removenode. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-28 07:38:13 +03:00
Takuya ASADA	464b5de99b	scylla_setup: allow symlink to --disks option Currently, the --disks option does not allow symlinks such as /dev/disk/by-uuid/* or /dev/disk/azure/*. To allow using them, is_unused_disk() should resolve the symlink to its real path before evaluating the disk path. Fixes #11634 Closes #11646	2022-10-28 07:24:11 +03:00
Botond Dénes	b744036840	Merge 'scylla_util.py: on sysconfig_parser, don't use double quote when it's possible' from Takuya ASADA It seems that the distributions' original sysconfig files do not use double quotes to set a parameter when the value does not contain a space. Add a function to detect spaces in the value, and don't use double quotes when none are detected. Fixes #9149 Closes #9153 * github.com:scylladb/scylladb: scylla_util.py: adding unescape for sysconfig_parser scylla_util.py: on sysconfig_parser, don't use double quote when it's possible	2022-10-28 07:19:13 +03:00
Benny Halevy	44e1058f63	docs: nodetool/removenode: fix host_id in examples removenode host_id must specify the host ID as a UUID, not an ip address. Fixes #11839 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11840	2022-10-27 14:29:36 +03:00
Pavel Emelyanov	7b193ab0a5	messaging_service: Deny putting INADDR_ANY as preferred ip Even though the previous patch makes scylla not gossip this as internal_ip, an extra sanity check may still be useful. E.g. older versions of scylla may still do it, or this address can be loaded from system_keyspace. refs: #11502 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-27 14:25:43 +03:00
Pavel Emelyanov	aa7a759ac9	messaging_service: Toss preferred ip cache management Make it call cache_preferred_ip() even when the cache is loaded from system_keyspace and move the connection reset there. This is mainly to prepare for the next patch, but also makes the code a bit shorter Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-27 14:25:43 +03:00
Pavel Emelyanov	91b460f1c4	gossiping_property_file_snitch: Dont gossip INADDR_ANY preferred IP Gossiping 0.0.0.0 as preferred IP may break the peer as it will "interpret" this address as <myself> which is not what peer expects. However, g.p.f.s. uses --listen-address argument as the internal IP and it's not prohibited to configure it to be 0.0.0.0 It's better not to gossip the INTERNAL_IP property at all if the listen address is such. fixes: #11502 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-27 14:25:43 +03:00
Pavel Emelyanov	99579bd186	gossiping_property_file_snitch: Make _listen_address optional As the preparation for the next patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-10-27 14:15:26 +03:00
Benny Halevy	0ea8250e83	repair: use sharded abort_source to abort repair_info Currently we use a single shared_ptr<abort_source> that can't be copied across shards. Instead, use a sharded<abort_source> in node_ops_info so that each repair_info instance will use an (optional) abort_source* on its own shard. Added respective start and stop methods, plus a local_abort_source getter to get the shard-local abort_source (if available). Fixes #11826 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-27 12:18:30 +03:00
Benny Halevy	88f993e5ed	repair: node_ops_info: add start and stop methods Prepare for adding a sharded<abort_source> member. Wire start/stop in storage_service::node_ops_meta_data. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-27 12:18:30 +03:00
Benny Halevy	c2f384093d	storage_service: node_ops_abort_thread: abort all node ops on shutdown A later patch adds a sharded<abort_source> to node_ops_info. On shutdown, we must orderly stop it, so use the node_ops_abort_thread shutdown path (where node_ops_singal_abort is called with a nullopt) to abort (and stop) all outstanding node_ops by passing a null_uuid to node_ops_abort, and let it iterate over all node ops to abort and stop them. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-27 12:14:06 +03:00
Benny Halevy	0efd290378	storage_service: node_ops_abort_thread: co_return only after printing log message Currently the function co_returns if (!uuid_opt) so the log info message indicating it's stopped is not printed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-27 12:14:03 +03:00
Benny Halevy	47e4761b4e	storage_service: node_ops_meta_data: add start and stop methods Prepare for starting and stopping repair node_ops_info Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-27 12:14:03 +03:00
Benny Halevy	5c25066ea7	repair: node_ops_info: prevent accidental copy Delete node_ops_info copy and move constructors before we add a sharded<abort_source> member for the per-shard repairs in the next patch. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-10-27 12:14:03 +03:00
Takuya ASADA	cd6030d5df	scylla_util.py: adding unescape for sysconfig_parser Even though we have __escape() for escaping " in the middle of a value when writing the sysconfig file, we didn't unescape when reading from the sysconfig file. So add __unescape() and call it in get().	2022-10-27 16:39:47 +09:00
Takuya ASADA	de57433bcf	scylla_util.py: on sysconfig_parser, don't use double quote when it's possible It seems that the distributions' original sysconfig files do not use double quotes to set a parameter when the value does not contain a space. Add a function to detect spaces in the value, and don't use double quotes when none are detected. Fixes #9149	2022-10-27 16:36:27 +09:00
Aleksandra Martyniuk	6494de9bb0	tasks: add alternative make_task method Task manager tasks should be created with make_task method since it properly sets information about child-parent relationship between tasks. Though, sometimes we may want to keep additional task data in classes inheriting from task_manager::task::impl. Doing it with existing make_task method makes it impossible since implementation objects are created internally. The commit adds a new make_task that allows to provide a task implementation pointer created by caller. All the fields except for the one connected with children and parent should be set before.	2022-10-26 14:01:05 +02:00
Aleksandra Martyniuk	10d11a7baf	tasks: rename parent_data to task_info and move it parent_data struct contains info that is common for each task, not only in parent-child relationship context. To use it this way without confusion, its name is changed to task_info. In order to be able to widely and comfortably use task_info, it is moved from tasks/task_manager.hh to tasks/types.hh and slightly extended.	2022-10-26 14:01:05 +02:00
Aleksandra Martyniuk	9ecc2047ac	tasks: move task_id to tasks/types.hh	2022-10-26 14:01:05 +02:00
Aleksandra Martyniuk	e2e8a286cc	tasks: add internal flag for task_manager::task::impl It is convenient to create many different tasks implementations representing more and more specific parts of the operation in a module. Presenting all of them through the api makes it cumbersome for user to navigate and track, though. Flag internal is added to task_manager::task::impl so that the tasks could be filtered before they are sent to user.	2022-10-26 14:01:05 +02:00
Pavel Emelyanov	e245780d56	gossiper: Request topology states in shadow round When doing shadow round for replacement the bootstrapping node needs to know the dc/rack info about the node it replaces to configure it on topology. This topology info is later used by e.g. repair service. fixes: #11829 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11838	2022-10-25 13:21:20 +03:00
Pavel Emelyanov	64c9359443	storage_proxy: Don't use default-initialized endpoint in get_read_executor() After calling filter_for_query() the extra_replica to speculate to may be left default-initialized which is :0 ipv6 address. Later below this address is used as-is to check if it belongs to the same DC or not which is not nice, as :0 is not an address of any existing endpoint. Recent move of dc/rack data onto topology made this place reveal itself by emitting the internal error due to :0 not being present on the topology's collection of endpoints. Prior to this move the dc filter would count :0 as belonging to "default_dc" datacenter which may or may not match with the dc of the local node. The fix is to explicitly tell set extra_replica from unset one. fixes: #11825 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11833	2022-10-25 09:16:50 +03:00
Takuya ASADA	1a11a38add	unified: move unified package contents to sub-directory On most of the software distribution tar.gz, it has sub-directory to contain everything, to prevent extract contents to current directory. We should follow this style on our unified package too. To do this we need to increment relocatable package version to '3.0'. Fixes #8349 Closes #8867	2022-10-25 08:58:15 +03:00
Takuya ASADA	a938b009ca	scylla_raid_setup: run uuidpath existence check only after mount failed We added a UUID device file existence check in #11399; we expect the UUID device file to be created before checking, and we wait for its creation by running "udevadm settle" after "mkfs.xfs". However, we are actually getting an error saying the UUID device file is missing, which probably means "udevadm settle" doesn't guarantee that the device file is created under some conditions. To avoid the error, use var-lib-scylla.mount to wait until the UUID device file is ready, and run the file existence check when the service has failed. Fixes #11617 Closes #11666	2022-10-25 08:54:21 +03:00
Yaniv Kaul	cec21d10ed	docs: Fix typo (patch -> batch) See subject. Closes #11837	2022-10-25 08:50:44 +03:00
Michał Radwański	36508bf5e9	serializer_impl: remove unneeded generic parameter Input stream used in vector_deserializer doesn't need to be generic, as there is only one implementation used.	2022-10-24 17:21:38 +02:00
Tomasz Grabiec	687df05e28	db: make_forwardable::reader: Do not emit range_tombstone_change with position past the range Since the end bound is exclusive, the end position should be before_key(), not after_key(). Affects only tests, as far as I know, only there we can get an end bound which is a clustering row position. Would cause failures once row cache is switched to v2 representation because of violated assumptions about positions. Introduced in `76ee3f029c` Closes #11823	2022-10-24 17:06:52 +03:00
Anna Stuchlik	9f7536d549	doc: fix the link to the OS Support page	2022-10-13 15:36:51 +02:00
Anna Stuchlik	1fd1ce042a	doc: replace Scylla with ScyllaDB	2022-10-13 15:21:46 +02:00
Anna Stuchlik	81ce7a88de	doc: update the info about supported architecture and rewrite the introduction	2022-10-13 15:18:29 +02:00
Anna Stuchlik	3950a1cac8	doc: apply the feedback to improve clarity	2022-10-03 11:14:51 +02:00
Anna Stuchlik	46f0e99884	doc: add the link to the new Troubleshooting section and replace Scylla with ScyllaDB	2022-09-23 11:46:15 +02:00
Anna Stuchlik	af2a85b191	doc: add the new page to the toctree	2022-09-23 11:37:38 +02:00
Anna Stuchlik	b034e2856e	doc: add a troubleshooting article about the missing configuration files	2022-09-23 11:17:18 +02:00
Anna Stuchlik	260f85643d	doc: specify the recommended AWS instance types	2022-08-08 14:35:54 +02:00
Anna Stuchlik	2c69a8f458	doc: replace the tables with a generic description of support for Im4gn and Is4gen instances	2022-08-08 14:17:59 +02:00
Anna Stuchlik	ceaf0c41bd	doc: add support for AWS i4g instances	2022-08-05 17:18:44 +02:00
Anna Stuchlik	7711436577	doc: extend the list of supported CPUs	2022-08-05 16:55:40 +02:00
Anna Stuchlik	844c875f15	doc: add info about the time-consuming step due to resharding	2022-07-26 14:52:11 +02:00
Anna Stuchlik	ff5c4a33f5	doc: add the new KB to the toctree	2022-07-25 14:29:33 +02:00
Anna Stuchlik	f1daef4b1b	doc: doc: add a KB about updating the mode in perftune.yaml after upgrade	2022-07-25 14:22:02 +02:00
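
The symlink handling described in commit 464b5de99b (scylla_setup: allow symlink to --disks option) can be illustrated with a short Python sketch. This is not the code from scylla_setup: the helper names `resolve_disk_path`, `is_same_device` and `demo` are hypothetical, and the only assumption taken from the commit message is that a symlink is resolved to its real path before the disk path is evaluated.

```python
import os
import tempfile


def resolve_disk_path(path):
    """Follow symlinks (e.g. /dev/disk/by-uuid/*) and return the real path."""
    return os.path.realpath(path)


def is_same_device(path_a, path_b):
    """Compare two disk paths after resolving any symlinks (hypothetical helper)."""
    return resolve_disk_path(path_a) == resolve_disk_path(path_b)


def demo():
    """Create a regular file standing in for a device node, plus a symlink to it."""
    with tempfile.TemporaryDirectory() as tmp:
        device = os.path.join(tmp, "sdb")
        link = os.path.join(tmp, "by-uuid-link")
        open(device, "w").close()
        os.symlink(device, link)
        # The symlink and the target resolve to the same real path.
        return is_same_device(link, device)


if __name__ == "__main__":
    print(demo())
```

Any check that follows (whether the disk is mounted, partitioned, and so on) would then operate on the resolved path rather than on the symlink itself.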

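The rule stated in commit 9ef2631ec2 (ignore_nodes may be given as host IDs or as IP addresses, but not as a mix of both) can be sketched as follows. This is an illustrative Python model only, not the C++ implementation in storage_service or token_metadata; the function names `classify_node` and `parse_ignore_nodes` are hypothetical.

```python
import ipaddress
import uuid


def classify_node(token):
    """Return 'host_id' if token parses as a UUID, 'ip' if it parses as an IP address."""
    try:
        uuid.UUID(token)
        return "host_id"
    except ValueError:
        pass
    try:
        ipaddress.ip_address(token)
        return "ip"
    except ValueError:
        raise ValueError(f"'{token}' is neither a host ID (UUID) nor an IP address")


def parse_ignore_nodes(param):
    """Split a comma-separated list and require all entries to be of the same kind."""
    nodes = [n.strip() for n in param.split(",") if n.strip()]
    kinds = {classify_node(n) for n in nodes}
    if len(kinds) > 1:
        # A mix of UUIDs and IP addresses most likely indicates a user error.
        raise ValueError("ignore_nodes must be all host IDs or all IP addresses")
    kind = kinds.pop() if kinds else None
    return kind, nodes


if __name__ == "__main__":
    print(parse_ignore_nodes("127.0.0.1, 127.0.0.2"))
```

Rejecting mixed lists up front keeps the error close to the user input instead of surfacing later, when the entries are mapped to cluster members.
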
1493 changed files with 74409 additions and 35958 deletions

24

.github/CODEOWNERS vendored

@@ -12,7 +12,7 @@ test/cql/cdc_* @kbr- @elcallio @piodul @jul-stas
 test/boost/cdc_* @kbr- @elcallio @piodul @jul-stas
 # COMMITLOG / BATCHLOG
 db/commitlog/* @elcallio
 db/commitlog/* @elcallio @eliransin
 db/batch* @elcallio
 # COORDINATOR
@@ -25,7 +25,7 @@ compaction/* @raphaelsc @nyh
 transport/*
 # CQL QUERY LANGUAGE
 cql3/* @tgrabiec @psarna @cvybhu
 cql3/* @tgrabiec @cvybhu @nyh
 # COUNTERS
 counters* @jul-stas
@@ -33,7 +33,7 @@ tests/counter_test* @jul-stas
 # DOCS
 docs/* @annastuchlik @tzach
 docs/alternator @annastuchlik @tzach @nyh @psarna
 docs/alternator @annastuchlik @tzach @nyh @havaker @nuivall
 # GOSSIP
 gms/* @tgrabiec @asias
@@ -45,9 +45,9 @@ dist/docker/*
 utils/logalloc* @tgrabiec
 # MATERIALIZED VIEWS
 db/view/* @nyh @psarna
 cql3/statements/*view* @nyh @psarna
 test/boost/view_* @nyh @psarna
 db/view/* @nyh @cvybhu @piodul
 cql3/statements/*view* @nyh @cvybhu @piodul
 test/boost/view_* @nyh @cvybhu @piodul
 # PACKAGING
 dist/* @syuu1228
@@ -62,9 +62,9 @@ service/migration* @tgrabiec @nyh
 schema* @tgrabiec @nyh
 # SECONDARY INDEXES
 db/index/* @nyh @psarna
 cql3/statements/*index* @nyh @psarna
 test/boost/*index* @nyh @psarna
 index/* @nyh @cvybhu @piodul
 cql3/statements/*index* @nyh @cvybhu @piodul
 test/boost/*index* @nyh @cvybhu @piodul
 # SSTABLES
 sstables/* @tgrabiec @raphaelsc @nyh
@@ -74,11 +74,11 @@ streaming/* @tgrabiec @asias
 service/storage_service.* @tgrabiec @asias
 # ALTERNATOR
 alternator/* @nyh @psarna
 test/alternator/* @nyh @psarna
 alternator/* @nyh @havaker @nuivall
 test/alternator/* @nyh @havaker @nuivall
 # HINTED HANDOFF
 db/hints/* @piodul @vladzcloudius
 db/hints/* @piodul @vladzcloudius @eliransin
 # REDIS
 redis/* @nyh @syuu1228

									
										17

.github/workflows/docs-amplify-enhanced.yaml
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,17 @@

				name: "Docs / Amplify enhanced"

				on: issue_comment

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    if: ${{ github.event.issue.pull_request }}

				    steps:

				      - name: Checkout

				        uses: actions/checkout@v3

				        with:

				          fetch-depth: 0

				      - name: Amplify enhanced

				        env:

				          TOKEN: ${{ secrets.GITHUB_TOKEN }}

				        uses: scylladb/sphinx-scylladb-theme/.github/actions/amplify-enhanced@master

									
										13

.github/workflows/docs-pages.yaml
									
										vendored
									
				@@ -2,10 +2,14 @@ name: "Docs / Publish"

				# For more information,

				# see https://sphinx-theme.scylladb.com/stable/deployment/production.html#available-workflows

				env:

				  FLAG: ${{ github.repository == 'scylladb/scylla-enterprise' && 'enterprise' || 'opensource' }}

				on:

				  push:

				    branches:

				      - master

				      - 'master'

				      - 'enterprise'

				    paths:

				      - "docs/**"

				  workflow_dispatch:

				@@ -24,12 +28,13 @@ jobs:

				        with:

				          python-version: 3.7

				      - name: Set up env

				        run: make -C docs setupenv

				        run: make -C docs FLAG="${{ env.FLAG }}" setupenv

				      - name: Build docs

				        run: make -C docs multiversion

				        run: make -C docs FLAG="${{ env.FLAG }}" multiversion

				      - name: Build redirects

				        run: make -C docs redirects

				        run: make -C docs FLAG="${{ env.FLAG }}" redirects

				      - name: Deploy docs to GitHub Pages

				        run: ./docs/_utils/deploy.sh

				        if: (github.ref_name == 'master' && env.FLAG == 'opensource') || (github.ref_name == 'enterprise' && env.FLAG == 'enterprise')

				        env:

				          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

									
										8

.github/workflows/docs-pr.yaml
									
										vendored
									
				@@ -2,10 +2,14 @@ name: "Docs / Build PR"

				# For more information,

				# see https://sphinx-theme.scylladb.com/stable/deployment/production.html#available-workflows

				env:

				  FLAG: ${{ github.repository == 'scylladb/scylla-enterprise' && 'enterprise' || 'opensource' }}

				on:

				  pull_request:

				    branches:

				      - master

				      - enterprise

				    paths:

				      - "docs/**"

				@@ -23,6 +27,6 @@ jobs:

				        with:

				          python-version: 3.7

				      - name: Set up env

				        run: make -C docs setupenv

				        run: make -C docs FLAG="${{ env.FLAG }}" setupenv

				      - name: Build docs

				        run: make -C docs test

				        run: make -C docs FLAG="${{ env.FLAG }}" test

1

.gitignore vendored

@@ -32,4 +32,3 @@ compile_commands.json
 .ccls-cache/
 .mypy_cache
 .envrc
 rust/Cargo.lock

9

.gitmodules vendored

@@ -6,12 +6,6 @@
 	path = swagger-ui
 	url = ../scylla-swagger-ui
 	ignore = dirty
 [submodule "libdeflate"]
 	path = libdeflate
 	url = ../libdeflate
 [submodule "abseil"]
 	path = abseil
 	url = ../abseil-cpp
 [submodule "scylla-jmx"]
 	path = tools/jmx
 	url = ../scylla-jmx
@@ -21,3 +15,6 @@
 [submodule "scylla-python3"]
 	path = tools/python3
 	url = ../scylla-python3
 [submodule "tools/cqlsh"]
 	path = tools/cqlsh
 	url = ../scylla-cqlsh

									
										883

CMakeLists.txt
									
				@@ -2,803 +2,200 @@ cmake_minimum_required(VERSION 3.18)

				project(scylla)

				if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)

				  message(STATUS "Setting build type to 'Release' as none was specified.")

				  set(CMAKE_BUILD_TYPE "Release" CACHE

				      STRING "Choose the type of build." FORCE)

				  # Set the possible values of build type for cmake-gui

				  set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS

				    "Debug" "Release" "Dev" "Sanitize")

				endif()

				include(CTest)

				if(CMAKE_BUILD_TYPE)

				    string(TOLOWER "${CMAKE_BUILD_TYPE}" BUILD_TYPE)

				else()

				    set(BUILD_TYPE "release")

				endif()

				function(default_target_arch arch)

				    set(x86_instruction_sets i386 i686 x86_64)

				    if(CMAKE_SYSTEM_PROCESSOR IN_LIST x86_instruction_sets)

				        set(${arch} "westmere" PARENT_SCOPE)

				    elseif(CMAKE_SYSTEM_PROCESSOR EQUAL "aarch64")

				        set(${arch} "armv8-a+crc+crypto" PARENT_SCOPE)

				    else()

				        set(${arch} "" PARENT_SCOPE)

				    endif()

				endfunction()

				default_target_arch(target_arch)

				if(target_arch)

				    set(target_arch_flag "-march=${target_arch}")

				endif()

				set(cxx_coro_flag)

				if (CMAKE_CXX_COMPILER_ID MATCHES GNU)

				    set(cxx_coro_flag -fcoroutines)

				endif()

				list(APPEND CMAKE_MODULE_PATH

				  ${CMAKE_CURRENT_SOURCE_DIR}/cmake

				  ${CMAKE_CURRENT_SOURCE_DIR}/seastar/cmake)

				set(CMAKE_BUILD_TYPE "${CMAKE_BUILD_TYPE}" CACHE

				    STRING "Choose the type of build." FORCE)

				# Set the possible values of build type for cmake-gui

				set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS

				  "Debug" "Release" "Dev" "Sanitize")

				string(TOUPPER "${CMAKE_BUILD_TYPE}" build_mode)

				include(mode.${build_mode})

				include(mode.common)

				add_compile_definitions(

				    ${Seastar_DEFINITIONS_${build_mode}}

				    FMT_DEPRECATED_OSTREAM)

				include(limit_jobs)

				# Configure Seastar compile options to align with Scylla

				set(Seastar_CXX_FLAGS ${cxx_coro_flag} ${target_arch_flag} CACHE INTERNAL "" FORCE)

				set(Seastar_CXX_DIALECT gnu++20 CACHE INTERNAL "" FORCE)

				set(CMAKE_CXX_STANDARD "20" CACHE INTERNAL "")

				set(CMAKE_CXX_EXTENSIONS ON CACHE INTERNAL "")

				set(CMAKE_CXX_VISIBILITY_PRESET hidden)

				set(Seastar_TESTING ON CACHE BOOL "" FORCE)

				add_subdirectory(seastar)

				add_subdirectory(abseil)

				# Exclude absl::strerror from the default "all" target since it's not

				# used in Scylla build and, moreover, makes use of deprecated glibc APIs,

				# such as sys_nerr, which are not exposed from "stdio.h" since glibc 2.32,

				# which happens to be the case for recent Fedora distribution versions.

				#

				# Need to use the internal "absl_strerror" target name instead of namespaced

				# variant because `set_target_properties` does not understand the latter form,

				# unfortunately.

				set_target_properties(absl_strerror PROPERTIES EXCLUDE_FROM_ALL TRUE)

				# System libraries dependencies

				find_package(Boost COMPONENTS filesystem program_options system thread regex REQUIRED)

				find_package(Boost REQUIRED

				    COMPONENTS filesystem program_options system thread regex unit_test_framework)

				find_package(Lua REQUIRED)

				find_package(ZLIB REQUIRED)

				find_package(ICU COMPONENTS uc REQUIRED)

				find_package(ICU COMPONENTS uc i18n REQUIRED)

				find_package(absl COMPONENTS hash raw_hash_set REQUIRED)

				find_package(libdeflate REQUIRED)

				find_package(libxcrypt REQUIRED)

				find_package(Snappy REQUIRED)

				find_package(RapidJSON REQUIRED)

				find_package(Thrift REQUIRED)

				find_package(xxHash REQUIRED)

				set(scylla_build_dir "${CMAKE_BINARY_DIR}/build/${BUILD_TYPE}")

				set(scylla_gen_build_dir "${scylla_build_dir}/gen")

				file(MAKE_DIRECTORY "${scylla_build_dir}" "${scylla_gen_build_dir}")

				set(scylla_gen_build_dir "${CMAKE_BINARY_DIR}/gen")

				file(MAKE_DIRECTORY "${scylla_gen_build_dir}")

				# Place libraries, executables and archives in ${buildroot}/build/${mode}/

				foreach(mode RUNTIME LIBRARY ARCHIVE)

				    set(CMAKE_${mode}_OUTPUT_DIRECTORY "${scylla_build_dir}")

				endforeach()

				# Generate C++ source files from thrift definitions

				function(scylla_generate_thrift)

				    set(one_value_args TARGET VAR IN_FILE OUT_DIR SERVICE)

				    cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})

				    get_filename_component(in_file_name ${args_IN_FILE} NAME_WE)

				    set(aux_out_file_name ${args_OUT_DIR}/${in_file_name})

				    set(outputs

				        ${aux_out_file_name}_types.cpp

				        ${aux_out_file_name}_types.h

				        ${aux_out_file_name}_constants.cpp

				        ${aux_out_file_name}_constants.h

				        ${args_OUT_DIR}/${args_SERVICE}.cpp

				        ${args_OUT_DIR}/${args_SERVICE}.h)

				    add_custom_command(

				        DEPENDS

				            ${args_IN_FILE}

				            thrift

				        OUTPUT ${outputs}

				        COMMAND ${CMAKE_COMMAND} -E make_directory ${args_OUT_DIR}

				        COMMAND thrift -gen cpp:cob_style,no_skeleton -out "${args_OUT_DIR}" "${args_IN_FILE}")

				    add_custom_target(${args_TARGET}

				        DEPENDS ${outputs})

				    set(${args_VAR} ${outputs} PARENT_SCOPE)

				endfunction()

				scylla_generate_thrift(

				    TARGET scylla_thrift_gen_cassandra

				    VAR scylla_thrift_gen_cassandra_files

				    IN_FILE "${CMAKE_SOURCE_DIR}/interface/cassandra.thrift"

				    OUT_DIR ${scylla_gen_build_dir}

				    SERVICE Cassandra)

				# Parse antlr3 grammar files and generate C++ sources

				function(scylla_generate_antlr3)

				    set(one_value_args TARGET VAR IN_FILE OUT_DIR)

				    cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})

				    get_filename_component(in_file_pure_name ${args_IN_FILE} NAME)

				    get_filename_component(stem ${in_file_pure_name} NAME_WE)

				    set(outputs

				        "${args_OUT_DIR}/${stem}Lexer.hpp"

				        "${args_OUT_DIR}/${stem}Lexer.cpp"

				        "${args_OUT_DIR}/${stem}Parser.hpp"

				        "${args_OUT_DIR}/${stem}Parser.cpp")

				    add_custom_command(

				        DEPENDS

				            ${args_IN_FILE}

				        OUTPUT ${outputs}

				        # Remove #ifdef'ed code from the grammar source code

				        COMMAND sed -e "/^#if 0/,/^#endif/d" "${args_IN_FILE}" > "${args_OUT_DIR}/${in_file_pure_name}"

				        COMMAND antlr3 "${args_OUT_DIR}/${in_file_pure_name}"

				        # We replace many local `ExceptionBaseType* ex` variables with a single function-scope one.

				        # Because we add such a variable to every function, and because `ExceptionBaseType` is not a global

				        # name, we also add a global typedef to avoid compilation errors.

				        COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Lexer.hpp"

				        COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Lexer.cpp"

				        COMMAND sed -i -e "/^.*On :.*$/d" "${args_OUT_DIR}/${stem}Parser.hpp"

				        COMMAND sed -i

				            -e "s/^\\( *\\)\\(ImplTraits::CommonTokenType\\* [a-zA-Z0-9_]* = NULL;\\)$/\\1const \\2/"

				            -e "/^.*On :.*$/d"

				            -e "1i using ExceptionBaseType = int;"

				            -e "s/^{/{ ExceptionBaseType\\* ex = nullptr;/; s/ExceptionBaseType\\* ex = new/ex = new/; s/exceptions::syntax_exception e/exceptions::syntax_exception\\& e/"

				            "${args_OUT_DIR}/${stem}Parser.cpp"

				        VERBATIM)

				    add_custom_target(${args_TARGET}

				        DEPENDS ${outputs})

				    set(${args_VAR} ${outputs} PARENT_SCOPE)

				endfunction()

				set(antlr3_grammar_files

				    cql3/Cql.g

				    alternator/expressions.g)

				set(antlr3_gen_files)

				foreach(f ${antlr3_grammar_files})

				    get_filename_component(grammar_file_name "${f}" NAME_WE)

				    get_filename_component(f_dir "${f}" DIRECTORY)

				    scylla_generate_antlr3(

				        TARGET scylla_antlr3_gen_${grammar_file_name}

				        VAR scylla_antlr3_gen_${grammar_file_name}_files

				        IN_FILE "${CMAKE_SOURCE_DIR}/${f}"

				        OUT_DIR ${scylla_gen_build_dir}/${f_dir})

				    list(APPEND antlr3_gen_files "${scylla_antlr3_gen_${grammar_file_name}_files}")

				endforeach()

				# Generate C++ sources from ragel grammar files

				seastar_generate_ragel(

				    TARGET scylla_ragel_gen_protocol_parser

				    VAR scylla_ragel_gen_protocol_parser_file

				    IN_FILE "${CMAKE_SOURCE_DIR}/redis/protocol_parser.rl"

				    OUT_FILE ${scylla_gen_build_dir}/redis/protocol_parser.hh)

				# Generate C++ sources from Swagger definitions

				set(swagger_files

				    api/api-doc/cache_service.json

				    api/api-doc/collectd.json

				    api/api-doc/column_family.json

				    api/api-doc/commitlog.json

				    api/api-doc/compaction_manager.json

				    api/api-doc/config.json

				    api/api-doc/endpoint_snitch_info.json

				    api/api-doc/error_injection.json

				    api/api-doc/failure_detector.json

				    api/api-doc/gossiper.json

				    api/api-doc/hinted_handoff.json

				    api/api-doc/lsa.json

				    api/api-doc/messaging_service.json

				    api/api-doc/storage_proxy.json

				    api/api-doc/storage_service.json

				    api/api-doc/stream_manager.json

				    api/api-doc/system.json

				    api/api-doc/task_manager.json

				    api/api-doc/task_manager_test.json

				    api/api-doc/utils.json)

				set(swagger_gen_files)

				foreach(f ${swagger_files})

				    get_filename_component(fname "${f}" NAME_WE)

				    get_filename_component(dir "${f}" DIRECTORY)

				    seastar_generate_swagger(

				        TARGET scylla_swagger_gen_${fname}

				        VAR scylla_swagger_gen_${fname}_files

				        IN_FILE "${CMAKE_SOURCE_DIR}/${f}"

				        OUT_DIR "${scylla_gen_build_dir}/${dir}")

				    list(APPEND swagger_gen_files "${scylla_swagger_gen_${fname}_files}")

				endforeach()

				# Create C++ bindings for IDL serializers

				function(scylla_generate_idl_serializer)

				    set(one_value_args TARGET VAR IN_FILE OUT_FILE)

				    cmake_parse_arguments(args "" "${one_value_args}" "" ${ARGN})

				    get_filename_component(out_dir ${args_OUT_FILE} DIRECTORY)

				    set(idl_compiler "${CMAKE_SOURCE_DIR}/idl-compiler.py")

				    find_package(Python3 COMPONENTS Interpreter)

				    add_custom_command(

				        DEPENDS

				            ${args_IN_FILE}

				            ${idl_compiler}

				        OUTPUT ${args_OUT_FILE}

				        COMMAND ${CMAKE_COMMAND} -E make_directory ${out_dir}

				        COMMAND Python3::Interpreter ${idl_compiler} --ns ser -f ${args_IN_FILE} -o ${args_OUT_FILE})

				    add_custom_target(${args_TARGET}

				        DEPENDS ${args_OUT_FILE})

				    set(${args_VAR} ${args_OUT_FILE} PARENT_SCOPE)

				endfunction()

				set(idl_serializers

				    idl/cache_temperature.idl.hh

				    idl/commitlog.idl.hh

				    idl/consistency_level.idl.hh

				    idl/frozen_mutation.idl.hh

				    idl/frozen_schema.idl.hh

				    idl/gossip_digest.idl.hh

				    idl/hinted_handoff.idl.hh

				    idl/idl_test.idl.hh

				    idl/keys.idl.hh

				    idl/messaging_service.idl.hh

				    idl/mutation.idl.hh

				    idl/paging_state.idl.hh

				    idl/partition_checksum.idl.hh

				    idl/paxos.idl.hh

				    idl/query.idl.hh

				    idl/raft.idl.hh

				    idl/range.idl.hh

				    idl/read_command.idl.hh

				    idl/reconcilable_result.idl.hh

				    idl/replay_position.idl.hh

				    idl/result.idl.hh

				    idl/ring_position.idl.hh

				    idl/streaming.idl.hh

				    idl/token.idl.hh

				    idl/tracing.idl.hh

				    idl/truncation_record.idl.hh

				    idl/uuid.idl.hh

				    idl/view.idl.hh)

				set(idl_gen_files)

				foreach(f ${idl_serializers})

				    get_filename_component(idl_name "${f}" NAME)

				    get_filename_component(idl_target "${idl_name}" NAME_WE)

				    get_filename_component(idl_dir "${f}" DIRECTORY)

				    string(REPLACE ".idl.hh" ".dist.hh" idl_out_hdr_name "${idl_name}")

				    scylla_generate_idl_serializer(

				        TARGET scylla_idl_gen_${idl_target}

				        VAR scylla_idl_gen_${idl_target}_files

				        IN_FILE "${CMAKE_SOURCE_DIR}/${f}"

				        OUT_FILE ${scylla_gen_build_dir}/${idl_dir}/${idl_out_hdr_name})

				    list(APPEND idl_gen_files "${scylla_idl_gen_${idl_target}_files}")

				endforeach()

				set(scylla_sources

				add_library(scylla-main STATIC)

				target_sources(scylla-main

				  PRIVATE

				    absl-flat_hash_map.cc

				    alternator/auth.cc

				    alternator/conditions.cc

				    alternator/controller.cc

				    alternator/executor.cc

				    alternator/expressions.cc

				    alternator/serialization.cc

				    alternator/server.cc

				    alternator/stats.cc

				    alternator/streams.cc

				    api/api.cc

				    api/cache_service.cc

				    api/collectd.cc

				    api/column_family.cc

				    api/commitlog.cc

				    api/compaction_manager.cc

				    api/config.cc

				    api/endpoint_snitch.cc

				    api/error_injection.cc

				    api/failure_detector.cc

				    api/gossiper.cc

				    api/hinted_handoff.cc

				    api/lsa.cc

				    api/messaging_service.cc

				    api/storage_proxy.cc

				    api/storage_service.cc

				    api/stream_manager.cc

				    api/system.cc

				    api/task_manager.cc

				    api/task_manager_test.cc

				    atomic_cell.cc

				    auth/allow_all_authenticator.cc

				    auth/allow_all_authorizer.cc

				    auth/authenticated_user.cc

				    auth/authentication_options.cc

				    auth/authenticator.cc

				    auth/common.cc

				    auth/default_authorizer.cc

				    auth/password_authenticator.cc

				    auth/passwords.cc

				    auth/permission.cc

				    auth/permissions_cache.cc

				    auth/resource.cc

				    auth/role_or_anonymous.cc

				    auth/roles-metadata.cc

				    auth/sasl_challenge.cc

				    auth/service.cc

				    auth/standard_role_manager.cc

				    auth/transitional.cc

				    bytes.cc

				    caching_options.cc

				    canonical_mutation.cc

				    cdc/cdc_partitioner.cc

				    cdc/generation.cc

				    cdc/log.cc

				    cdc/metadata.cc

				    cdc/split.cc

				    client_data.cc

				    clocks-impl.cc

				    collection_mutation.cc

				    compaction/compaction.cc

				    compaction/compaction_manager.cc

				    compaction/compaction_strategy.cc

				    compaction/leveled_compaction_strategy.cc

				    compaction/size_tiered_compaction_strategy.cc

				    compaction/time_window_compaction_strategy.cc

				    compress.cc

				    converting_mutation_partition_applier.cc

				    counters.cc

				    cql3/abstract_marker.cc

				    cql3/attributes.cc

				    cql3/cf_name.cc

				    cql3/column_condition.cc

				    cql3/column_identifier.cc

				    cql3/column_specification.cc

				    cql3/constants.cc

				    cql3/cql3_type.cc

				    cql3/expr/expression.cc

				    cql3/expr/prepare_expr.cc

				    cql3/expr/restrictions.cc

				    cql3/functions/aggregate_fcts.cc

				    cql3/functions/castas_fcts.cc

				    cql3/functions/error_injection_fcts.cc

				    cql3/functions/functions.cc

				    cql3/functions/user_function.cc

				    cql3/index_name.cc

				    cql3/keyspace_element_name.cc

				    cql3/lists.cc

				    cql3/maps.cc

				    cql3/operation.cc

				    cql3/prepare_context.cc

				    cql3/query_options.cc

				    cql3/query_processor.cc

				    cql3/restrictions/statement_restrictions.cc

				    cql3/result_set.cc

				    cql3/role_name.cc

				    cql3/selection/abstract_function_selector.cc

				    cql3/selection/selectable.cc

				    cql3/selection/selection.cc

				    cql3/selection/selector.cc

				    cql3/selection/selector_factories.cc

				    cql3/selection/simple_selector.cc

				    cql3/sets.cc

				    cql3/statements/alter_keyspace_statement.cc

				    cql3/statements/alter_service_level_statement.cc

				    cql3/statements/alter_table_statement.cc

				    cql3/statements/alter_type_statement.cc

				    cql3/statements/alter_view_statement.cc

				    cql3/statements/attach_service_level_statement.cc

				    cql3/statements/authentication_statement.cc

				    cql3/statements/authorization_statement.cc

				    cql3/statements/batch_statement.cc

				    cql3/statements/cas_request.cc

				    cql3/statements/cf_prop_defs.cc

				    cql3/statements/cf_statement.cc

				    cql3/statements/create_aggregate_statement.cc

				    cql3/statements/create_function_statement.cc

				    cql3/statements/create_index_statement.cc

				    cql3/statements/create_keyspace_statement.cc

				    cql3/statements/create_service_level_statement.cc

				    cql3/statements/create_table_statement.cc

				    cql3/statements/create_type_statement.cc

				    cql3/statements/create_view_statement.cc

				    cql3/statements/delete_statement.cc

				    cql3/statements/detach_service_level_statement.cc

				    cql3/statements/drop_aggregate_statement.cc

				    cql3/statements/drop_function_statement.cc

				    cql3/statements/drop_index_statement.cc

				    cql3/statements/drop_keyspace_statement.cc

				    cql3/statements/drop_service_level_statement.cc

				    cql3/statements/drop_table_statement.cc

				    cql3/statements/drop_type_statement.cc

				    cql3/statements/drop_view_statement.cc

				    cql3/statements/function_statement.cc

				    cql3/statements/grant_statement.cc

				    cql3/statements/index_prop_defs.cc

				    cql3/statements/index_target.cc

				    cql3/statements/ks_prop_defs.cc

				    cql3/statements/list_permissions_statement.cc

				    cql3/statements/list_service_level_attachments_statement.cc

				    cql3/statements/list_service_level_statement.cc

				    cql3/statements/list_users_statement.cc

				    cql3/statements/modification_statement.cc

				    cql3/statements/permission_altering_statement.cc

				    cql3/statements/property_definitions.cc

				    cql3/statements/raw/parsed_statement.cc

				    cql3/statements/revoke_statement.cc

				    cql3/statements/role-management-statements.cc

				    cql3/statements/schema_altering_statement.cc

				    cql3/statements/select_statement.cc

				    cql3/statements/service_level_statement.cc

				    cql3/statements/sl_prop_defs.cc

				    cql3/statements/truncate_statement.cc

				    cql3/statements/update_statement.cc

				    cql3/statements/strongly_consistent_modification_statement.cc

				    cql3/statements/strongly_consistent_select_statement.cc

				    cql3/statements/use_statement.cc

				    cql3/type_json.cc

				    cql3/untyped_result_set.cc

				    cql3/update_parameters.cc

				    cql3/user_types.cc

				    cql3/util.cc

				    cql3/ut_name.cc

				    cql3/values.cc

				    data_dictionary/data_dictionary.cc

				    db/batchlog_manager.cc

				    db/commitlog/commitlog.cc

				    db/commitlog/commitlog_entry.cc

				    db/commitlog/commitlog_replayer.cc

				    db/config.cc

				    db/consistency_level.cc

				    db/cql_type_parser.cc

				    db/data_listeners.cc

				    db/extensions.cc

				    db/heat_load_balance.cc

				    db/hints/host_filter.cc

				    db/hints/manager.cc

				    db/hints/resource_manager.cc

				    db/hints/sync_point.cc

				    db/large_data_handler.cc

				    db/legacy_schema_migrator.cc

				    db/marshal/type_parser.cc

				    db/rate_limiter.cc

				    db/schema_tables.cc

				    db/size_estimates_virtual_reader.cc

				    db/snapshot-ctl.cc

				    db/sstables-format-selector.cc

				    db/system_distributed_keyspace.cc

				    db/system_keyspace.cc

				    db/view/row_locking.cc

				    db/view/view.cc

				    db/view/view_update_generator.cc

				    db/virtual_table.cc

				    dht/boot_strapper.cc

				    dht/i_partitioner.cc

				    dht/murmur3_partitioner.cc

				    dht/range_streamer.cc

				    dht/token.cc

				    replica/distributed_loader.cc

				    direct_failure_detector/failure_detector.cc

				    duration.cc

				    exceptions/exceptions.cc

				    readers/mutation_readers.cc

				    frozen_mutation.cc

				    frozen_schema.cc

				    generic_server.cc

				    gms/application_state.cc

				    gms/endpoint_state.cc

				    gms/failure_detector.cc

				    gms/feature_service.cc

				    gms/gossip_digest_ack2.cc

				    gms/gossip_digest_ack.cc

				    gms/gossip_digest_syn.cc

				    gms/gossiper.cc

				    gms/inet_address.cc

				    gms/versioned_value.cc

				    gms/version_generator.cc

				    hashers.cc

				    index/secondary_index.cc

				    index/secondary_index_manager.cc

				    debug.cc

				    init.cc

				    keys.cc

				    utils/lister.cc

				    locator/abstract_replication_strategy.cc

				    locator/azure_snitch.cc

				    locator/ec2_multi_region_snitch.cc

				    locator/ec2_snitch.cc

				    locator/everywhere_replication_strategy.cc

				    locator/gce_snitch.cc

				    locator/gossiping_property_file_snitch.cc

				    locator/local_strategy.cc

				    locator/network_topology_strategy.cc

				    locator/production_snitch_base.cc

				    locator/rack_inferring_snitch.cc

				    locator/simple_snitch.cc

				    locator/simple_strategy.cc

				    locator/snitch_base.cc

				    locator/token_metadata.cc

				    lang/lua.cc

				    main.cc

				    replica/memtable.cc

				    message/messaging_service.cc

				    multishard_mutation_query.cc

				    mutation.cc

				    mutation_fragment.cc

				    mutation_partition.cc

				    mutation_partition_serializer.cc

				    mutation_partition_view.cc

				    mutation_query.cc

				    readers/mutation_reader.cc

				    mutation_writer/feed_writers.cc

				    mutation_writer/multishard_writer.cc

				    mutation_writer/partition_based_splitting_writer.cc

				    mutation_writer/shard_based_splitting_writer.cc

				    mutation_writer/timestamp_based_splitting_writer.cc

				    partition_slice_builder.cc

				    partition_version.cc

				    querier.cc

				    query.cc

				    query_ranges_to_vnodes.cc

				    query-result-set.cc

				    raft/fsm.cc

				    raft/log.cc

				    raft/raft.cc

				    raft/server.cc

				    raft/tracker.cc

				    service/broadcast_tables/experimental/lang.cc

				    range_tombstone.cc

				    range_tombstone_list.cc

				    tombstone_gc_options.cc

				    tombstone_gc.cc

				    reader_concurrency_semaphore.cc

				    redis/abstract_command.cc

				    redis/command_factory.cc

				    redis/commands.cc

				    redis/keyspace_utils.cc

				    redis/lolwut.cc

				    redis/mutation_utils.cc

				    redis/options.cc

				    redis/query_processor.cc

				    redis/query_utils.cc

				    redis/server.cc

				    redis/service.cc

				    redis/stats.cc

				    release.cc

				    repair/repair.cc

				    repair/row_level.cc

				    replica/database.cc

				    replica/table.cc

				    row_cache.cc

				    schema.cc

				    schema_mutations.cc

				    schema_registry.cc

				    serializer.cc

				    service/client_state.cc

				    service/forward_service.cc

				    service/migration_manager.cc

				    service/misc_services.cc

				    service/pager/paging_state.cc

				    service/pager/query_pagers.cc

				    service/paxos/paxos_state.cc

				    service/paxos/prepare_response.cc

				    service/paxos/prepare_summary.cc

				    service/paxos/proposal.cc

				    service/priority_manager.cc

				    service/qos/qos_common.cc

				    service/qos/service_level_controller.cc

				    service/qos/standard_service_level_distributed_data_accessor.cc

				    service/raft/raft_group_registry.cc

				    service/raft/raft_rpc.cc

				    service/raft/raft_sys_table_storage.cc

				    service/raft/group0_state_machine.cc

				    service/storage_proxy.cc

				    service/storage_service.cc

				    sstables/compress.cc

				    sstables/integrity_checked_file_impl.cc

				    sstables/kl/reader.cc

				    sstables/metadata_collector.cc

				    sstables/m_format_read_helpers.cc

				    sstables/mx/reader.cc

				    sstables/mx/writer.cc

				    sstables/prepended_input_stream.cc

				    sstables/random_access_reader.cc

				    sstables/sstable_directory.cc

				    sstables/sstable_mutation_reader.cc

				    sstables/sstables.cc

				    sstables/sstable_set.cc

				    sstables/sstables_manager.cc

				    sstables/sstable_version.cc

				    sstables/writer.cc

				    streaming/consumer.cc

				    streaming/progress_info.cc

				    streaming/session_info.cc

				    streaming/stream_coordinator.cc

				    streaming/stream_manager.cc

				    streaming/stream_plan.cc

				    streaming/stream_reason.cc

				    streaming/stream_receive_task.cc

				    streaming/stream_request.cc

				    streaming/stream_result_future.cc

				    streaming/stream_session.cc

				    streaming/stream_session_state.cc

				    streaming/stream_summary.cc

				    streaming/stream_task.cc

				    streaming/stream_transfer_task.cc

				    sstables_loader.cc

				    table_helper.cc

				    tasks/task_manager.cc

				    thrift/controller.cc

				    thrift/handler.cc

				    thrift/server.cc

				    thrift/thrift_validation.cc

				    timeout_config.cc

				    tools/scylla-sstable-index.cc

				    tools/scylla-types.cc

				    tracing/traced_file.cc

				    tracing/trace_keyspace_helper.cc

				    tracing/trace_state.cc

				    tracing/tracing_backend_registry.cc

				    tracing/tracing.cc

				    transport/controller.cc

				    transport/cql_protocol_extension.cc

				    transport/event.cc

				    transport/event_notifier.cc

				    transport/messages/result_message.cc

				    transport/server.cc

				    types.cc

				    unimplemented.cc

				    utils/arch/powerpc/crc32-vpmsum/crc32_wrapper.cc

				    utils/array-search.cc

				    utils/ascii.cc

				    utils/base64.cc

				    utils/big_decimal.cc

				    utils/bloom_calculations.cc

				    utils/bloom_filter.cc

				    utils/buffer_input_stream.cc

				    utils/build_id.cc

				    utils/config_file.cc

				    utils/directories.cc

				    utils/disk-error-handler.cc

				    utils/dynamic_bitset.cc

				    utils/error_injection.cc

				    utils/exceptions.cc

				    utils/file_lock.cc

				    utils/generation-number.cc

				    utils/gz/crc_combine.cc

				    utils/gz/gen_crc_combine_table.cc

				    utils/human_readable.cc

				    utils/i_filter.cc

				    utils/large_bitset.cc

				    utils/like_matcher.cc

				    utils/limiting_data_source.cc

				    utils/logalloc.cc

				    utils/managed_bytes.cc

				    utils/multiprecision_int.cc

				    utils/murmur_hash.cc

				    utils/rate_limiter.cc

				    utils/rjson.cc

				    utils/runtime.cc

				    utils/updateable_value.cc

				    utils/utf8.cc

				    utils/uuid.cc

				    utils/UUID_gen.cc

				    validation.cc

				    vint-serialization.cc

				    zstd.cc)

				set(scylla_gen_sources

				    "${scylla_thrift_gen_cassandra_files}"

				    "${scylla_ragel_gen_protocol_parser_file}"

				    "${swagger_gen_files}"

				    "${idl_gen_files}"

				    "${antlr3_gen_files}")

				target_link_libraries(scylla-main

				  PRIVATE

				    db

				    absl::hash

				    absl::raw_hash_set

				    Seastar::seastar

				    Snappy::snappy

				    systemd

				    ZLIB::ZLIB)

				add_subdirectory(api)

				add_subdirectory(alternator)

				add_subdirectory(db)

				add_subdirectory(auth)

				add_subdirectory(cdc)

				add_subdirectory(compaction)

				add_subdirectory(cql3)

				add_subdirectory(data_dictionary)

				add_subdirectory(dht)

				add_subdirectory(gms)

				add_subdirectory(idl)

				add_subdirectory(index)

				add_subdirectory(interface)

				add_subdirectory(lang)

				add_subdirectory(locator)

				add_subdirectory(mutation)

				add_subdirectory(mutation_writer)

				add_subdirectory(readers)

				add_subdirectory(redis)

				add_subdirectory(replica)

				add_subdirectory(raft)

				add_subdirectory(repair)

				add_subdirectory(rust)

				add_subdirectory(schema)

				add_subdirectory(service)

				add_subdirectory(sstables)

				add_subdirectory(streaming)

				add_subdirectory(test)

				add_subdirectory(thrift)

				add_subdirectory(tools)

				add_subdirectory(tracing)

				add_subdirectory(transport)

				add_subdirectory(types)

				add_subdirectory(utils)

				include(add_version_library)

				add_version_library(scylla_version

				    release.cc)

				add_executable(scylla

				    ${scylla_sources}

				    ${scylla_gen_sources})

				  main.cc)

				target_link_libraries(scylla PRIVATE

				    scylla-main

				    api

				    auth

				    alternator

				    db

				    cdc

				    compaction

				    cql3

				    data_dictionary

				    dht

				    gms

				    idl

				    index

				    lang

				    locator

				    mutation

				    mutation_writer

				    raft

				    readers

				    redis

				    repair

				    replica

				    schema

				    scylla_version

				    service

				    sstables

				    streaming

				    test-perf

				    thrift

				    tools

				    tracing

				    transport

				    types

				    utils)

				target_link_libraries(Boost::regex

				  INTERFACE

				    ICU::i18n

				    ICU::uc)

				target_link_libraries(scylla PRIVATE

				    seastar

				    # Boost dependencies

				    Boost::filesystem

				    Boost::program_options

				    Boost::system

				    Boost::thread

				    Boost::regex

				    Boost::headers

				    # Abseil libs

				    absl::hashtablez_sampler

				    absl::raw_hash_set

				    absl::synchronization

				    absl::graphcycles_internal

				    absl::stacktrace

				    absl::symbolize

				    absl::debugging_internal

				    absl::demangle_internal

				    absl::time

				    absl::time_zone

				    absl::int128

				    absl::city

				    absl::hash

				    absl::malloc_internal

				    absl::spinlock_wait

				    absl::base

				    absl::dynamic_annotations

				    absl::raw_logging_internal

				    absl::exponential_biased

				    absl::throw_delegate

				    # System libs

				    ZLIB::ZLIB

				    ICU::uc

				    systemd

				    zstd

				    snappy

				    ${LUA_LIBRARIES}

				    thrift

				    crypt)

				    Boost::program_options)

				# Force SHA1 build-id generation

				set(default_linker_flags "-Wl,--build-id=sha1")

				include(CheckLinkerFlag)

				foreach(linker "lld" "gold")

				    set(linker_flag "-fuse-ld=${linker}")

				    check_linker_flag(CXX ${linker_flag} "CXX_LINKER_HAVE_${linker}")

				    if(CXX_LINKER_HAVE_${linker})

				        string(APPEND default_linker_flags " ${linker_flag}")

				        break()

				    endif()

				endforeach()

				set(CMAKE_EXE_LINKER_FLAGS "${default_linker_flags}" CACHE INTERNAL "")

				target_link_libraries(scylla PRIVATE

				    -Wl,--build-id=sha1 # Force SHA1 build-id generation

				    # TODO: Use lld linker if it's available, otherwise gold, else bfd

				    -fuse-ld=lld)

				# TODO: patch dynamic linker to match configure.py behavior

				target_compile_options(scylla PRIVATE

				    -std=gnu++20

				    ${cxx_coro_flag}

				    ${target_arch_flag})

				# Hacks needed to expose internal APIs for xxhash dependencies

				target_compile_definitions(scylla PRIVATE XXH_PRIVATE_API HAVE_LZ4_COMPRESS_DEFAULT)

				target_include_directories(scylla PRIVATE

				    "${CMAKE_CURRENT_SOURCE_DIR}"

				    libdeflate

				    abseil

				    "${scylla_gen_build_dir}")

				###

				### Create crc_combine_table helper executable.

				### Use it to generate crc_combine_table.cc to be used in scylla at build time.

				###

				add_executable(crc_combine_table utils/gz/gen_crc_combine_table.cc)

				target_link_libraries(crc_combine_table PRIVATE seastar)

				target_include_directories(crc_combine_table PRIVATE "${CMAKE_CURRENT_SOURCE_DIR}")

				target_compile_options(crc_combine_table PRIVATE

				    -std=gnu++20

				    ${cxx_coro_flag}

				    ${target_arch_flag})

				add_dependencies(scylla crc_combine_table)

				# Generate an additional source file at build time that is needed for Scylla compilation

				add_custom_command(OUTPUT "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc"

				    COMMAND $<TARGET_FILE:crc_combine_table> > "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc"

				    DEPENDS crc_combine_table)

				target_sources(scylla PRIVATE "${scylla_gen_build_dir}/utils/gz/crc_combine_table.cc")

				###

				### Generate version file and supply appropriate compile definitions for release.cc

				###

				execute_process(COMMAND ${CMAKE_SOURCE_DIR}/SCYLLA-VERSION-GEN --output-dir "${CMAKE_BINARY_DIR}/gen" RESULT_VARIABLE scylla_version_gen_res)

				if(scylla_version_gen_res)

				    message(SEND_ERROR "Version file generation failed. Return code: ${scylla_version_gen_res}")

				endif()

				file(READ "${CMAKE_BINARY_DIR}/gen/SCYLLA-VERSION-FILE" scylla_version)

				string(STRIP "${scylla_version}" scylla_version)

				file(READ "${CMAKE_BINARY_DIR}/gen/SCYLLA-RELEASE-FILE" scylla_release)

				string(STRIP "${scylla_release}" scylla_release)

				get_property(release_cdefs SOURCE "${CMAKE_SOURCE_DIR}/release.cc" PROPERTY COMPILE_DEFINITIONS)

				list(APPEND release_cdefs "SCYLLA_VERSION=\"${scylla_version}\"" "SCYLLA_RELEASE=\"${scylla_release}\"")

				set_source_files_properties("${CMAKE_SOURCE_DIR}/release.cc" PROPERTIES COMPILE_DEFINITIONS "${release_cdefs}")

				###

				### Custom command for building libdeflate. Link the library to scylla.

				###

				set(libdeflate_lib "${scylla_build_dir}/libdeflate/libdeflate.a")

				add_custom_command(OUTPUT "${libdeflate_lib}"

				    COMMAND make -C "${CMAKE_SOURCE_DIR}/libdeflate"

				        BUILD_DIR=../build/${BUILD_TYPE}/libdeflate/

				        CC=${CMAKE_C_COMPILER}

				        "CFLAGS=${target_arch_flag}"

				        ../build/${BUILD_TYPE}/libdeflate//libdeflate.a) # Two backslashes are important!

				# Hack to force generating custom command to produce libdeflate.a

				add_custom_target(libdeflate DEPENDS "${libdeflate_lib}")

				target_link_libraries(scylla PRIVATE "${libdeflate_lib}")

				# TODO: create cmake/ directory and move utilities (generate functions etc) there

				# TODO: Build tests if BUILD_TESTING=on (using CTest module)

									
										2

CONTRIBUTING.md
									
												
				@@ -2,7 +2,7 @@

				## Asking questions or requesting help

				Use the [Scylla Users mailing list](https://groups.google.com/g/scylladb-users) or the [Slack workspace](http://slack.scylladb.com) for general questions and help.

				Use the [ScyllaDB Community Forum](https://forum.scylladb.com) or the [Slack workspace](http://slack.scylladb.com) for general questions and help.

				Join the [Scylla Developers mailing list](https://groups.google.com/g/scylladb-dev) for deeper technical discussions and to discuss your ideas for contributions.

									
										2

HACKING.md
									
												
				@@ -195,7 +195,7 @@ $ # Edit configuration options as appropriate

				$ SCYLLA_HOME=$HOME/scylla build/release/scylla

				```

				The `scylla.yaml` file in the repository by default writes all database data to `/var/lib/scylla`, which likely requires root access. Change the `data_file_directories` and `commitlog_directory` fields as appropriate.

				The `scylla.yaml` file in the repository by default writes all database data to `/var/lib/scylla`, which likely requires root access. Change the `data_file_directories`, `commitlog_directory` and `schema_commitlog_directory` fields as appropriate.

				Scylla has a number of requirements for the file-system and operating system to operate ideally and at peak performance. However, during development, these requirements can be relaxed with the `--developer-mode` flag.
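For a development setup, the three directory fields named in this hunk can be pointed at a location the current user owns. A minimal sketch in Python, assuming a hypothetical layout under `$HOME/scylla`; the subdirectory names (`data`, `commitlog`, `schema_commitlog`) and both helper functions are illustrative and are not taken from the repository:

```python
from pathlib import Path


def dev_directory_overrides(scylla_home: Path) -> dict:
    """Return overrides for the three directory fields mentioned in HACKING.md.

    The subdirectory names are assumptions chosen for illustration.
    """
    return {
        "data_file_directories": [str(scylla_home / "data")],
        "commitlog_directory": str(scylla_home / "commitlog"),
        "schema_commitlog_directory": str(scylla_home / "schema_commitlog"),
    }


def render_yaml(overrides: dict) -> str:
    """Render the overrides as a YAML fragment (lists become block sequences)."""
    lines = []
    for key, value in overrides.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"    - {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print a fragment that could be merged into a local copy of scylla.yaml.
    print(render_yaml(dev_directory_overrides(Path.home() / "scylla")))
```

The printed fragment is meant to be merged by hand into a local copy of `scylla.yaml`; it does not modify any file itself.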

									
										12

README.md
									
												
				@@ -30,9 +30,9 @@ requirements - you just need to meet the frozen toolchain's prerequisites

				Building Scylla with the frozen toolchain `dbuild` is as easy as:

				```bash

				$ git submodule update --init --force --recursive

				$ ./tools/toolchain/dbuild ./configure.py

				$ ./tools/toolchain/dbuild ninja build/release/scylla

				$ git submodule update --init --force --recursive

				$ ./tools/toolchain/dbuild ./configure.py

				$ ./tools/toolchain/dbuild ninja build/release/scylla

				```

				For further information, please see:

				@@ -60,7 +60,7 @@ Please note that you need to run Scylla with `dbuild` if you built it with the f

				For more run options, run:

				```bash

				$ ./tools/toolchain/dbuild ./build/release/scylla --help

				$ ./tools/toolchain/dbuild ./build/release/scylla --help

				```

				## Testing

				@@ -100,10 +100,10 @@ If you are a developer working on Scylla, please read the [developer guidelines]

				## Contact

				* The [users mailing list] and [Slack channel] are for users to discuss configuration, management, and operations of the ScyllaDB open source.

				* The [community forum] and [Slack channel] are for users to discuss configuration, management, and operations of the ScyllaDB open source.

				* The [developers mailing list] is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

				[Users mailing list]: https://groups.google.com/forum/#!forum/scylladb-users

				[Community forum]: https://forum.scylladb.com/

				[Slack channel]: http://slack.scylladb.com/

4

SCYLLA-VERSION-GEN


@@ -34,7 +34,7 @@ END
 DATE=""
 while [[ $# -gt 0 ]]; do
 while [ $# -gt 0 ]; do
 	opt="$1"
 	case $opt in
 		-h|--help)
@@ -72,7 +72,7 @@ fi
 # Default scylla product/version tags
 PRODUCT=scylla
 VERSION=5.2.0-dev
 VERSION=5.3.0-dev
 if test -f version
 then

1

abseil

Submodule abseil deleted from 7f3c0d7811

									
										30

alternator/CMakeLists.txt
									
										Normal file
									
												
				@@ -0,0 +1,30 @@

				include(generate_cql_grammar)

				generate_cql_grammar(

				  GRAMMAR expressions.g

				  SOURCES cql_grammar_srcs)

				add_library(alternator STATIC)

				target_sources(alternator

				  PRIVATE

				    controller.cc

				    server.cc

				    executor.cc

				    stats.cc

				    serialization.cc

				    expressions.cc

				    conditions.cc

				    auth.cc

				    streams.cc

				    ttl.cc

				    ${cql_grammar_srcs})

				target_include_directories(alternator

				  PUBLIC

				    ${CMAKE_SOURCE_DIR}

				    ${CMAKE_BINARY_DIR}

				  PRIVATE

				    ${RAPIDJSON_INCLUDE_DIRS})

				target_link_libraries(alternator

				  cql3

				  idl

				  Seastar::seastar

				  xxHash::xxhash)

									
										97

alternator/auth.cc
									
												
				@@ -10,8 +10,6 @@

				#include "log.hh"

				#include <string>

				#include <string_view>

				#include <gnutls/crypto.h>

				#include "hashers.hh"

				#include "bytes.hh"

				#include "alternator/auth.hh"

				#include <fmt/format.h>

				@@ -29,99 +27,6 @@ namespace alternator {

				static logging::logger alogger("alternator-auth");

				static hmac_sha256_digest hmac_sha256(std::string_view key, std::string_view msg) {

				    hmac_sha256_digest digest;

				    int ret = gnutls_hmac_fast(GNUTLS_MAC_SHA256, key.data(), key.size(), msg.data(), msg.size(), digest.data());

				    if (ret) {

				        throw std::runtime_error(fmt::format("Computing HMAC failed ({}): {}", ret, gnutls_strerror(ret)));

				    }

				    return digest;

				}

				static hmac_sha256_digest get_signature_key(std::string_view key, std::string_view date_stamp, std::string_view region_name, std::string_view service_name) {

				    auto date = hmac_sha256("AWS4" + std::string(key), date_stamp);

				    auto region = hmac_sha256(std::string_view(date.data(), date.size()), region_name);

				    auto service = hmac_sha256(std::string_view(region.data(), region.size()), service_name);

				    auto signing = hmac_sha256(std::string_view(service.data(), service.size()), "aws4_request");

				    return signing;

				}
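Review note: the chain in `get_signature_key` is the AWS Signature Version 4 key derivation, where each HMAC-SHA256 output becomes the key for the next step (date, then region, then service, then the literal `aws4_request`). A rough Python equivalent, using the standard-library `hmac` module in place of `gnutls_hmac_fast`; the function names mirror the C++ ones only for readability, and the sample inputs are made up:

```python
import hashlib
import hmac


def hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw 32-byte HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def get_signature_key(secret_key: str, date_stamp: str,
                      region_name: str, service_name: str) -> bytes:
    """Derive a SigV4 signing key by chaining four HMAC-SHA256 steps."""
    date = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    region = hmac_sha256(date, region_name)
    service = hmac_sha256(region, service_name)
    return hmac_sha256(service, "aws4_request")


if __name__ == "__main__":
    # Illustrative inputs only; not real credentials.
    key = get_signature_key("example-secret", "20230424", "us-east-1", "dynamodb")
    print(key.hex())
```

Because the date stamp and region are mixed into the key, a signing key derived for one day or region cannot be reused for another.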

				static std::string apply_sha256(std::string_view msg) {

				    sha256_hasher hasher;

				    hasher.update(msg.data(), msg.size());

				    return to_hex(hasher.finalize());

				}

				static std::string apply_sha256(const std::vector<temporary_buffer<char>>& msg) {

				    sha256_hasher hasher;

				    for (const temporary_buffer<char>& buf : msg) {

				        hasher.update(buf.get(), buf.size());

				    }

				    return to_hex(hasher.finalize());

				}

				static std::string format_time_point(db_clock::time_point tp) {

				    time_t time_point_repr = db_clock::to_time_t(tp);

				    std::string time_point_str;

				    time_point_str.resize(17);

				    ::tm time_buf;

				    // strftime prints the terminating null character as well

				    std::strftime(time_point_str.data(), time_point_str.size(), "%Y%m%dT%H%M%SZ", ::gmtime_r(&time_point_repr, &time_buf));

				    time_point_str.resize(16);

				    return time_point_str;

				}

				void check_expiry(std::string_view signature_date) {

				    //FIXME: The default 15min can be changed with X-Amz-Expires header - we should honor it

				    std::string expiration_str = format_time_point(db_clock::now() - 15min);

				    std::string validity_str = format_time_point(db_clock::now() + 15min);

				    if (signature_date < expiration_str) {

				        throw api_error::invalid_signature(

				                fmt::format("Signature expired: {} is now earlier than {} (current time - 15 min.)",

				                signature_date, expiration_str));

				    }

				    if (signature_date > validity_str) {

				        throw api_error::invalid_signature(

				                fmt::format("Signature not yet current: {} is still later than {} (current time + 15 min.)",

				                signature_date, validity_str));

				    }

				}

				std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,

				        std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,

				        const std::vector<temporary_buffer<char>>& body_content, std::string_view region, std::string_view service, std::string_view query_string) {

				    auto amz_date_it = signed_headers_map.find("x-amz-date");

				    if (amz_date_it == signed_headers_map.end()) {

				        throw api_error::invalid_signature("X-Amz-Date header is mandatory for signature verification");

				    }

				    std::string_view amz_date = amz_date_it->second;

				    check_expiry(amz_date);

				    std::string_view datestamp = amz_date.substr(0, 8);

				    if (datestamp != orig_datestamp) {

				        throw api_error::invalid_signature(

				                format("X-Amz-Date date does not match the provided datestamp. Expected {}, got {}",

				                        orig_datestamp, datestamp));

				    }

				    std::string_view canonical_uri = "/";

				    std::stringstream canonical_headers;

				    for (const auto& header : signed_headers_map) {

				        canonical_headers << fmt::format("{}:{}", header.first, header.second) << '\n';

				    }

				    std::string payload_hash = apply_sha256(body_content);

				    std::string canonical_request = fmt::format("{}\n{}\n{}\n{}\n{}\n{}", method, canonical_uri, query_string, canonical_headers.str(), signed_headers_str, payload_hash);

				    std::string_view algorithm = "AWS4-HMAC-SHA256";

				    std::string credential_scope = fmt::format("{}/{}/{}/aws4_request", datestamp, region, service);

				    std::string string_to_sign = fmt::format("{}\n{}\n{}\n{}", algorithm, amz_date, credential_scope,  apply_sha256(canonical_request));

				    hmac_sha256_digest signing_key = get_signature_key(secret_access_key, datestamp, region, service);

				    hmac_sha256_digest signature = hmac_sha256(std::string_view(signing_key.data(), signing_key.size()), string_to_sign);

				    return to_hex(bytes_view(reinterpret_cast<const int8_t*>(signature.data()), signature.size()));

				}

				future<std::string> get_key_from_roles(service::storage_proxy& proxy, std::string username) {

				    schema_ptr schema = proxy.data_dictionary().find_schema("system_auth", "roles");

				    partition_key pk = partition_key::from_single_value(*schema, utf8_type->decompose(username));

				@@ -141,7 +46,7 @@ future<std::string> get_key_from_roles(service::storage_proxy& proxy, std::strin

				    service::storage_proxy::coordinator_query_result qr = co_await proxy.query(schema, std::move(command), std::move(partition_ranges), cl,

				            service::storage_proxy::coordinator_query_options(executor::default_timeout(), empty_service_permit(), client_state));

				    cql3::selection::result_set_builder builder(*selection, gc_clock::now(), cql_serialization_format::latest());

				    cql3::selection::result_set_builder builder(*selection, gc_clock::now());

				    query::result_view::consume(*qr.query_result, partition_slice, cql3::selection::result_set_builder::visitor(builder, *schema, *selection));

				    auto result_set = builder.build();
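
Note on the removed `format_time_point()` / `check_expiry()` helpers: they compare `X-Amz-Date` values as plain strings. That works because the `YYYYMMDDTHHMMSSZ` format is fixed-width and ordered from most to least significant field, so lexicographic order matches chronological order. The following is a minimal standalone sketch of that idea, not the code from this commit: it assumes `std::chrono::system_clock` in place of `db_clock`, relies on POSIX `gmtime_r`, and the names `format_amz_date`, `date_check` and `check_window` are hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <ctime>
#include <string>
#include <string_view>

// Format a time point as the 16-character UTC string "YYYYMMDDTHHMMSSZ".
std::string format_amz_date(std::chrono::system_clock::time_point tp) {
    std::time_t t = std::chrono::system_clock::to_time_t(tp);
    std::tm tm_buf{};
    ::gmtime_r(&t, &tm_buf);  // POSIX; thread-safe variant of std::gmtime
    char buf[17];             // 16 characters plus the terminating null
    std::size_t n = std::strftime(buf, sizeof(buf), "%Y%m%dT%H%M%SZ", &tm_buf);
    return std::string(buf, n);
}

enum class date_check { ok, expired, not_yet_valid };

// Accept a signature date only if it lies within [now - slack, now + slack].
// The comparison is lexicographic, which is valid for this fixed-width format.
date_check check_window(std::string_view signature_date,
                        std::chrono::system_clock::time_point now,
                        std::chrono::minutes slack = std::chrono::minutes(15)) {
    const std::string earliest = format_amz_date(now - slack);
    const std::string latest = format_amz_date(now + slack);
    if (signature_date < std::string_view(earliest)) {
        return date_check::expired;
    }
    if (signature_date > std::string_view(latest)) {
        return date_check::not_yet_valid;
    }
    return date_check::ok;
}
```

Returning an enum instead of throwing keeps the sketch free of Alternator's `api_error` type; the original helper throws `api_error::invalid_signature` in both failure cases.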

									
										6

alternator/auth.hh
									
												
				@@ -20,14 +20,8 @@ class storage_proxy;

				namespace alternator {

				using hmac_sha256_digest = std::array<char, 32>;

				using key_cache = utils::loading_cache<std::string, std::string, 1>;

				std::string get_signature(std::string_view access_key_id, std::string_view secret_access_key, std::string_view host, std::string_view method,

				        std::string_view orig_datestamp, std::string_view signed_headers_str, const std::map<std::string_view, std::string_view>& signed_headers_map,

				        const std::vector<temporary_buffer<char>>& body_content, std::string_view region, std::string_view service, std::string_view query_string);

				future<std::string> get_key_from_roles(service::storage_proxy& proxy, std::string username);

				}

									
										41

alternator/conditions.cc
									
												
				@@ -232,7 +232,14 @@ bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2,

				    if (it2->name == "S") {

				        return rjson::to_string_view(it1->value).starts_with(rjson::to_string_view(it2->value));

				    } else /* it2->name == "B" */ {

				        return base64_begins_with(rjson::to_string_view(it1->value), rjson::to_string_view(it2->value));

				        try {

				            return base64_begins_with(rjson::to_string_view(it1->value), rjson::to_string_view(it2->value));

				        } catch(std::invalid_argument&) {

				            // determine if any of the malformed values is from query and raise an exception if so

				            unwrap_bytes(it1->value, v1_from_query);

				            unwrap_bytes(it2->value, v2_from_query);

				            return false;

				        }

				    }

				}

				@@ -241,7 +248,7 @@ static bool is_set_of(const rjson::value& type1, const rjson::value& type2) {

				}

				// Check if two JSON-encoded values match with the CONTAINS relation

				bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2) {

				bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2, bool v1_from_query, bool v2_from_query) {

				    if (!v1) {

				        return false;

				    }

				@@ -250,7 +257,12 @@ bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2) {

				    if (kv1.name == "S" && kv2.name == "S") {

				        return rjson::to_string_view(kv1.value).find(rjson::to_string_view(kv2.value)) != std::string_view::npos;

				    } else if (kv1.name == "B" && kv2.name == "B") {

				        return rjson::base64_decode(kv1.value).find(rjson::base64_decode(kv2.value)) != bytes::npos;

				        auto d_kv1 = unwrap_bytes(kv1.value, v1_from_query);

				        auto d_kv2 = unwrap_bytes(kv2.value, v2_from_query);

				        if (!d_kv1 || !d_kv2) {

				            return false;

				        }

				        return d_kv1->find(*d_kv2) != bytes::npos;

				    } else if (is_set_of(kv1.name, kv2.name)) {

				        for (auto i = kv1.value.Begin(); i != kv1.value.End(); ++i) {

				            if (*i == kv2.value) {

				@@ -273,11 +285,11 @@ bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2) {

				}

				// Check if two JSON-encoded values match with the NOT_CONTAINS relation

				static bool check_NOT_CONTAINS(const rjson::value* v1, const rjson::value& v2) {

				static bool check_NOT_CONTAINS(const rjson::value* v1, const rjson::value& v2, bool v1_from_query, bool v2_from_query) {

				    if (!v1) {

				        return false;

				    }

				    return !check_CONTAINS(v1, v2);

				    return !check_CONTAINS(v1, v2, v1_from_query, v2_from_query);

				}

				// Check if a JSON-encoded value equals any element of an array, which must have at least one element.

				@@ -374,7 +386,12 @@ bool check_compare(const rjson::value* v1, const rjson::value& v2, const Compara

				                   std::string_view(kv2.value.GetString(), kv2.value.GetStringLength()));

				    }

				    if (kv1.name == "B") {

				        return cmp(rjson::base64_decode(kv1.value), rjson::base64_decode(kv2.value));

				        auto d_kv1 = unwrap_bytes(kv1.value, v1_from_query);

				        auto d_kv2 = unwrap_bytes(kv2.value, v2_from_query);

				        if(!d_kv1 || !d_kv2) {

				            return false;

				        }

				        return cmp(*d_kv1, *d_kv2);

				    }

				    // cannot reach here, as check_comparable_type() verifies the type is one

				    // of the above options.

				@@ -464,7 +481,13 @@ static bool check_BETWEEN(const rjson::value* v, const rjson::value& lb, const r

				                             bounds_from_query);

				    }

				    if (kv_v.name == "B") {

				        return check_BETWEEN(rjson::base64_decode(kv_v.value), rjson::base64_decode(kv_lb.value), rjson::base64_decode(kv_ub.value), bounds_from_query);

				        auto d_kv_v = unwrap_bytes(kv_v.value, v_from_query);

				        auto d_kv_lb = unwrap_bytes(kv_lb.value, lb_from_query);

				        auto d_kv_ub = unwrap_bytes(kv_ub.value, ub_from_query);

				        if(!d_kv_v || !d_kv_lb || !d_kv_ub) {

				            return false;

				        }

				        return check_BETWEEN(*d_kv_v, *d_kv_lb, *d_kv_ub, bounds_from_query);

				    }

				    if (v_from_query) {

				        throw api_error::validation(

				@@ -557,7 +580,7 @@ static bool verify_expected_one(const rjson::value& condition, const rjson::valu

				                            format("CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "

				                                    "got {} instead", argtype));

				                }

				                return check_CONTAINS(got, arg);

				                return check_CONTAINS(got, arg, false, true);

				            }

				        case comparison_operator_type::NOT_CONTAINS:

				            {

				@@ -571,7 +594,7 @@ static bool verify_expected_one(const rjson::value& condition, const rjson::valu

				                            format("CONTAINS operator requires a single AttributeValue of type String, Number, or Binary, "

				                                    "got {} instead", argtype));

				                }

				                return check_NOT_CONTAINS(got, arg);

				                return check_NOT_CONTAINS(got, arg, false, true);

				            }

				        }

				        throw std::logic_error(format("Internal error: corrupted operator enum: {}", int(op)));
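
Note on the `unwrap_bytes()` pattern used in the hunks above: its definition is not part of this diff, so the following is only a sketch of the behavior that the call sites and the comment in `check_BEGINS_WITH` suggest. A malformed base64 value that came from the request raises a validation error, while a malformed value that came from storage simply makes the comparison evaluate to false. All names here (`decode_base64`, `validation_error`, `unwrap_bytes_sketch`, `contains_bytes`) are hypothetical stand-ins, and plain strings replace `rjson::value` and `bytes`.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Stand-in strict base64 decoder: throws std::invalid_argument on bad input.
std::string decode_base64(std::string_view in) {
    static constexpr std::string_view alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    if (in.size() % 4 != 0) {
        throw std::invalid_argument("base64 length must be a multiple of 4");
    }
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 4) {
        unsigned v = 0;
        int pad = 0;
        for (std::size_t j = 0; j < 4; ++j) {
            const char c = in[i + j];
            if (c == '=') {
                // Padding may only occupy the last two slots of the final group.
                if (i + 4 != in.size() || j < 2) {
                    throw std::invalid_argument("misplaced base64 padding");
                }
                ++pad;
                v <<= 6;
            } else {
                const std::size_t pos = alphabet.find(c);
                if (pad > 0 || pos == std::string_view::npos) {
                    throw std::invalid_argument("invalid base64 character");
                }
                v = (v << 6) | static_cast<unsigned>(pos);
            }
        }
        out.push_back(static_cast<char>((v >> 16) & 0xff));
        if (pad < 2) out.push_back(static_cast<char>((v >> 8) & 0xff));
        if (pad < 1) out.push_back(static_cast<char>(v & 0xff));
    }
    return out;
}

// Hypothetical stand-in for api_error::validation.
struct validation_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Decode a value; blame the user only if the value came from the request.
std::optional<std::string> unwrap_bytes_sketch(std::string_view encoded, bool from_query) {
    try {
        return decode_base64(encoded);
    } catch (const std::invalid_argument&) {
        if (from_query) {
            throw validation_error("Invalid base64 data");
        }
        return std::nullopt;  // malformed stored data: no match, no error
    }
}

// CONTAINS on two binary values, mirroring the shape of the new check_CONTAINS branch.
bool contains_bytes(std::string_view v1, std::string_view v2,
                    bool v1_from_query, bool v2_from_query) {
    auto d1 = unwrap_bytes_sketch(v1, v1_from_query);
    auto d2 = unwrap_bytes_sketch(v2, v2_from_query);
    if (!d1 || !d2) {
        return false;
    }
    return d1->find(*d2) != std::string::npos;
}
```

With this shape, `contains_bytes("aGVsbG8=", "bG8=", false, true)` compares the decoded bytes `hello` and `lo`, whereas a malformed second argument marked as coming from the query throws instead of returning false.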

									
										2

alternator/conditions.hh
									
												
				@@ -38,7 +38,7 @@ conditional_operator_type get_conditional_operator(const rjson::value& req);

				bool verify_expected(const rjson::value& req, const rjson::value* previous_item);

				bool verify_condition(const rjson::value& condition, bool require_all, const rjson::value* previous_item);

				bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2);

				bool check_CONTAINS(const rjson::value* v1, const rjson::value& v2, bool v1_from_query, bool v2_from_query);

				bool check_BEGINS_WITH(const rjson::value* v1, const rjson::value& v2, bool v1_from_query, bool v2_from_query);

				bool verify_condition_expression(

									
										4

alternator/error.hh
									
												
				@@ -23,7 +23,7 @@ namespace alternator {

				// api_error into a JSON object, and that is returned to the user.

				class api_error final : public std::exception {

				public:

				    using status_type = httpd::reply::status_type;

				    using status_type = http::reply::status_type;

				    status_type _http_code;

				    std::string _type;

				    std::string _msg;

				@@ -77,7 +77,7 @@ public:

				        return api_error("TableNotFoundException", std::move(msg));

				    }

				    static api_error internal(std::string msg) {

				        return api_error("InternalServerError", std::move(msg), reply::status_type::internal_server_error);

				        return api_error("InternalServerError", std::move(msg), http::reply::status_type::internal_server_error);

				    }

				    // Provide the "std::exception" interface, to make it easier to print this

									
										89

alternator/executor.cc
									
												
				@@ -13,12 +13,12 @@

				#include <seastar/core/sleep.hh>

				#include "alternator/executor.hh"

				#include "log.hh"

				#include "schema_builder.hh"

				#include "schema/schema_builder.hh"

				#include "data_dictionary/keyspace_metadata.hh"

				#include "exceptions/exceptions.hh"

				#include "timestamp.hh"

				#include "types/map.hh"

				#include "schema.hh"

				#include "schema/schema.hh"

				#include "query-request.hh"

				#include "query-result-reader.hh"

				#include "cql3/selection/selection.hh"

				@@ -34,13 +34,14 @@

				#include "expressions.hh"

				#include "conditions.hh"

				#include "cql3/constants.hh"

				#include "cql3/util.hh"

				#include <optional>

				#include "utils/overloaded_functor.hh"

				#include <seastar/json/json_elements.hh>

				#include <boost/algorithm/cxx11/any_of.hpp>

				#include "collection_mutation.hh"

				#include "db/query_context.hh"

				#include "schema.hh"

				#include "schema/schema.hh"

				#include "db/tags/extension.hh"

				#include "db/tags/utils.hh"

				#include "alternator/rmw_operation.hh"

				@@ -50,11 +51,13 @@

				#include <unordered_set>

				#include "service/storage_proxy.hh"

				#include "gms/gossiper.hh"

				#include "schema_registry.hh"

				#include "schema/schema_registry.hh"

				#include "utils/error_injection.hh"

				#include "db/schema_tables.hh"

				#include "utils/rjson.hh"

				using namespace std::chrono_literals;

				logging::logger elogger("alternator-executor");

				namespace alternator {

				@@ -114,8 +117,7 @@ std::string json_string::to_json() const {

				void executor::supplement_table_info(rjson::value& descr, const schema& schema, service::storage_proxy& sp) {

				    rjson::add(descr, "CreationDateTime", rjson::value(std::chrono::duration_cast<std::chrono::seconds>(gc_clock::now().time_since_epoch()).count()));

				    rjson::add(descr, "TableStatus", "ACTIVE");

				    auto schema_id_str = schema.id().to_sstring();

				    rjson::add(descr, "TableId", rjson::from_string(schema_id_str));

				    rjson::add(descr, "TableId", rjson::from_string(schema.id().to_sstring()));

				    executor::supplement_table_stream_info(descr, schema, sp);

				}

				@@ -127,6 +129,20 @@ void executor::supplement_table_info(rjson::value& descr, const schema& schema,

				// See https://github.com/scylladb/scylla/issues/4480

				static constexpr int max_table_name_length = 222;

				static bool valid_table_name_chars(std::string_view name) {

				    for (auto c : name) {

				        if ((c < 'a' || c > 'z') &&

				            (c < 'A' || c > 'Z') &&

				            (c < '0' || c > '9') &&

				            c != '_' &&

				            c != '-' &&

				            c != '.') {

				            return false;

				        }

				    }

				    return true;

				}

				// The DynamoDB developer guide, https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html#HowItWorks.NamingRules

				// specifies that table names "names must be between 3 and 255 characters long

				// and can contain only the following characters: a-z, A-Z, 0-9, _ (underscore), - (dash), . (dot)

				@@ -136,8 +152,7 @@ static void validate_table_name(const std::string& name) {

				        throw api_error::validation(

				                format("TableName must be at least 3 characters long and at most {} characters long", max_table_name_length));

				    }

				    static const std::regex valid_table_name_chars ("[a-zA-Z0-9_.-]*");

				    if (!std::regex_match(name.c_str(), valid_table_name_chars)) {

				    if (!valid_table_name_chars(name)) {

				        throw api_error::validation(

				                "TableName must satisfy regular expression pattern: [a-zA-Z0-9_.-]+");

				    }

				@@ -153,11 +168,10 @@ static void validate_table_name(const std::string& name) {

				// The view_name() function assumes the table_name has already been validated

				// but validates the legality of index_name and the combination of both.

				static std::string view_name(const std::string& table_name, std::string_view index_name, const std::string& delim = ":") {

				    static const std::regex valid_index_name_chars ("[a-zA-Z0-9_.-]*");

				    if (index_name.length() < 3) {

				        throw api_error::validation("IndexName must be at least 3 characters long");

				    }

				    if (!std::regex_match(index_name.data(), valid_index_name_chars)) {

				    if (!valid_table_name_chars(index_name)) {

				        throw api_error::validation(

				                format("IndexName '{}' must satisfy regular expression pattern: [a-zA-Z0-9_.-]+", index_name));

				    }
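
Note on replacing `std::regex` with a character loop: the two hunks above swap `std::regex_match` for the new `valid_table_name_chars()` helper. A loop over a `std::string_view` needs no compiled regex and no null-terminated buffer (`std::string_view::data()` is not guaranteed to be null-terminated). The sketch below restates that check as a standalone `constexpr` function and combines it with the length limits mentioned in the error message. The names `valid_name_chars` and `valid_table_name` are hypothetical, and the default maximum of 222 is taken from `max_table_name_length` above.

```cpp
#include <cstddef>
#include <string_view>

// True if every character is one of a-z, A-Z, 0-9, '_', '-', '.'.
constexpr bool valid_name_chars(std::string_view name) {
    for (char c : name) {
        const bool ok = (c >= 'a' && c <= 'z') ||
                        (c >= 'A' && c <= 'Z') ||
                        (c >= '0' && c <= '9') ||
                        c == '_' || c == '-' || c == '.';
        if (!ok) {
            return false;
        }
    }
    return true;
}

// Length check (at least 3, at most max_len) followed by the character check.
constexpr bool valid_table_name(std::string_view name, std::size_t max_len = 222) {
    return name.size() >= 3 && name.size() <= max_len && valid_name_chars(name);
}

// Because the functions are constexpr, simple cases can be checked at compile time.
static_assert(valid_table_name("my-table_1.v2"));
static_assert(!valid_table_name("ab"));
```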

				@@ -760,7 +774,6 @@ future<executor::request_return_type> executor::tag_resource(client_state& clien

				        co_return api_error::access_denied("Incorrect resource identifier");

				    }

				    schema_ptr schema = get_table_from_arn(_proxy, rjson::to_string_view(*arn));

				    std::map<sstring, sstring> tags_map = get_tags_of_table_or_throw(schema);

				    const rjson::value* tags = rjson::find(request, "Tags");

				    if (!tags || !tags->IsArray()) {

				        co_return api_error::validation("Cannot parse tags");

				@@ -768,8 +781,9 @@ future<executor::request_return_type> executor::tag_resource(client_state& clien

				    if (tags->Size() < 1) {

				        co_return api_error::validation("The number of tags must be at least 1") ;

				    }

				    update_tags_map(*tags, tags_map,  update_tags_action::add_tags);

				    co_await db::update_tags(_mm, schema, std::move(tags_map));

				    co_await db::modify_tags(_mm, schema->ks_name(), schema->cf_name(), [tags](std::map<sstring, sstring>& tags_map) {

				        update_tags_map(*tags, tags_map, update_tags_action::add_tags);

				    });

				    co_return json_string("");

				}

				@@ -787,9 +801,9 @@ future<executor::request_return_type> executor::untag_resource(client_state& cli

				    schema_ptr schema = get_table_from_arn(_proxy, rjson::to_string_view(*arn));

				    std::map<sstring, sstring> tags_map = get_tags_of_table_or_throw(schema);

				    update_tags_map(*tags, tags_map, update_tags_action::delete_tags);

				    co_await db::update_tags(_mm, schema, std::move(tags_map));

				    co_await db::modify_tags(_mm, schema->ks_name(), schema->cf_name(), [tags](std::map<sstring, sstring>& tags_map) {

				        update_tags_map(*tags, tags_map, update_tags_action::delete_tags);

				    });

				    co_return json_string("");

				}

				@@ -927,9 +941,10 @@ static future<executor::request_return_type> create_table_on_shard0(tracing::tra

				            if  (!range_key.empty() && range_key != view_hash_key && range_key != view_range_key) {

				                add_column(view_builder, range_key, attribute_definitions, column_kind::clustering_key);

				            }

				            sstring where_clause = "\"" + view_hash_key + "\" IS NOT NULL";

				            sstring where_clause = format("{} IS NOT NULL", cql3::util::maybe_quote(view_hash_key));

				            if (!view_range_key.empty()) {

				                where_clause = where_clause + " AND \"" + view_hash_key + "\" IS NOT NULL";

				                where_clause = format("{} AND {} IS NOT NULL", where_clause,

				                    cql3::util::maybe_quote(view_range_key));

				            }

				            where_clauses.push_back(std::move(where_clause));

				            view_builders.emplace_back(std::move(view_builder));

				@@ -984,9 +999,10 @@ static future<executor::request_return_type> create_table_on_shard0(tracing::tra

				            // Note above we don't need to add virtual columns, as all

				            // base columns were copied to view. TODO: reconsider the need

				            // for virtual columns when we support Projection.

				            sstring where_clause = "\"" + view_hash_key + "\" IS NOT NULL";

				            sstring where_clause = format("{} IS NOT NULL", cql3::util::maybe_quote(view_hash_key));

				            if (!view_range_key.empty()) {

				                where_clause = where_clause + " AND \"" + view_range_key + "\" IS NOT NULL";

				                where_clause = format("{} AND {} IS NOT NULL", where_clause,

				                    cql3::util::maybe_quote(view_range_key));

				            }

				            where_clauses.push_back(std::move(where_clause));

				            view_builders.emplace_back(std::move(view_builder));

				@@ -1529,7 +1545,7 @@ future<executor::request_return_type> rmw_operation::execute(service::storage_pr

				            // This is the old, unsafe, read before write which does first

				            // a read, then a write. TODO: remove this mode entirely.

				            return get_previous_item(proxy, client_state, schema(), _pk, _ck, permit, stats).then(

				                    [this, &client_state, &proxy, trace_state, permit = std::move(permit)] (std::unique_ptr<rjson::value> previous_item) mutable {

				                    [this, &proxy, trace_state, permit = std::move(permit)] (std::unique_ptr<rjson::value> previous_item) mutable {

				                std::optional<mutation> m = apply(std::move(previous_item), api::new_timestamp());

				                if (!m) {

				                    return make_ready_future<executor::request_return_type>(api_error::conditional_check_failed("Failed condition."));

				@@ -2302,7 +2318,7 @@ void executor::describe_single_item(const cql3::selection::selection& selection,

				                rjson::add_with_string_name(field, type_to_string((*column_it)->type), json_key_column_value(*cell, **column_it));

				            }

				        } else if (cell) {

				            auto deserialized = attrs_type()->deserialize(*cell, cql_serialization_format::latest());

				            auto deserialized = attrs_type()->deserialize(*cell);

				            auto keys_and_values = value_cast<map_type_impl::native_type>(deserialized);

				            for (auto entry : keys_and_values) {

				                std::string attr_name = value_cast<sstring>(entry.first);

				@@ -2337,7 +2353,7 @@ std::optional<rjson::value> executor::describe_single_item(schema_ptr schema,

				        const std::optional<attrs_to_get>& attrs_to_get) {

				    rjson::value item = rjson::empty_object();

				    cql3::selection::result_set_builder builder(selection, gc_clock::now(), cql_serialization_format::latest());

				    cql3::selection::result_set_builder builder(selection, gc_clock::now());

				    query::result_view::consume(query_result, slice, cql3::selection::result_set_builder::visitor(builder, *schema, selection));

				    auto result_set = builder.build();

				@@ -2360,7 +2376,7 @@ std::vector<rjson::value> executor::describe_multi_item(schema_ptr schema,

				        const cql3::selection::selection& selection,

				        const query::result& query_result,

				        const std::optional<attrs_to_get>& attrs_to_get) {

				    cql3::selection::result_set_builder builder(selection, gc_clock::now(), cql_serialization_format::latest());

				    cql3::selection::result_set_builder builder(selection, gc_clock::now());

				    query::result_view::consume(query_result, slice, cql3::selection::result_set_builder::visitor(builder, *schema, selection));

				    auto result_set = builder.build();

				    std::vector<rjson::value> ret;

				@@ -3104,20 +3120,10 @@ future<executor::request_return_type> executor::get_item(client_state& client_st

				    });

				}

				// is_big() checks approximately if the given JSON value is "bigger" than

				// the given big_size number of bytes. The goal is to *quickly* detect

				// oversized JSON that, for example, is too large to be serialized to a

				// contiguous string - we don't need an accurate size for that. Moreover,

				// as soon as we detect that the JSON is indeed "big", we can return true

				// and don't need to continue calculating its exact size.

				// For simplicity, we use a recursive implementation. This is fine because

				// Alternator limits the depth of JSONs it reads from inputs, and doesn't

				// add more than a couple of levels in its own output construction.

				static void check_big_object(const rjson::value& val, int& size_left);

				static void check_big_array(const rjson::value& val, int& size_left);

				static bool is_big(const rjson::value& val, int big_size = 100'000) {

				bool is_big(const rjson::value& val, int big_size) {

				    if (val.IsString()) {

				        return ssize_t(val.GetStringLength()) > big_size;

				    } else if (val.IsObject()) {

				@@ -3508,7 +3514,7 @@ public:

				                    rjson::add_with_string_name(field, type_to_string((*_column_it)->type), json_key_column_value(bv, **_column_it));

				                }

				            } else {

				                auto deserialized = attrs_type()->deserialize(bv, cql_serialization_format::latest());

				                auto deserialized = attrs_type()->deserialize(bv);

				                auto keys_and_values = value_cast<map_type_impl::native_type>(deserialized);

				                for (auto entry : keys_and_values) {

				                    std::string attr_name = value_cast<sstring>(entry.first);

				@@ -3565,7 +3571,7 @@ public:

				    }

				};

				static std::tuple<rjson::value, size_t> describe_items(schema_ptr schema, const query::partition_slice& slice, const cql3::selection::selection& selection, std::unique_ptr<cql3::result_set> result_set, std::optional<attrs_to_get>&& attrs_to_get, filter&& filter) {

				static std::tuple<rjson::value, size_t> describe_items(const cql3::selection::selection& selection, std::unique_ptr<cql3::result_set> result_set, std::optional<attrs_to_get>&& attrs_to_get, filter&& filter) {

				    describe_items_visitor visitor(selection.get_columns(), attrs_to_get, filter);

				    result_set->visit(visitor);

				    auto scanned_count = visitor.get_scanned_count();

				@@ -3615,7 +3621,7 @@ static rjson::value encode_paging_state(const schema& schema, const service::pag

				    // We conditionally include these fields when reading CQL tables through alternator.

				    if (!is_alternator_keyspace(schema.ks_name()) && (!pos.has_key() || pos.get_bound_weight() != bound_weight::equal)) {

				        rjson::add_with_string_name(last_evaluated_key, scylla_paging_region, rjson::empty_object());

				        rjson::add(last_evaluated_key[scylla_paging_region.data()], "S", rjson::from_string(to_string(pos.region())));

				        rjson::add(last_evaluated_key[scylla_paging_region.data()], "S", rjson::from_string(fmt::to_string(pos.region())));

				        rjson::add_with_string_name(last_evaluated_key, scylla_paging_weight, rjson::empty_object());

				        rjson::add(last_evaluated_key[scylla_paging_weight.data()], "N", static_cast<int>(pos.get_bound_weight()));

				    }

				@@ -3642,7 +3648,7 @@ static future<executor::request_return_type> do_query(service::storage_proxy& pr

				    if (exclusive_start_key) {

				        partition_key pk = pk_from_json(*exclusive_start_key, schema);

				        auto pos = position_in_partition(position_in_partition::partition_start_tag_t());

				        auto pos = position_in_partition::for_partition_start();

				        if (schema->clustering_key_size() > 0) {

				            pos = pos_from_json(*exclusive_start_key, schema);

				        }

				@@ -3679,7 +3685,7 @@ static future<executor::request_return_type> do_query(service::storage_proxy& pr

				        }

				        auto paging_state = rs->get_metadata().paging_state();

				        bool has_filter = filter;

				        auto [items, size] = describe_items(schema, partition_slice, *selection, std::move(rs), std::move(attrs_to_get), std::move(filter));

				        auto [items, size] = describe_items(*selection, std::move(rs), std::move(attrs_to_get), std::move(filter));

				        if (paging_state) {

				            rjson::add(items, "LastEvaluatedKey", encode_paging_state(*schema, *paging_state));

				        }

				@@ -3688,8 +3694,7 @@ static future<executor::request_return_type> do_query(service::storage_proxy& pr

				            // update our "filtered_row_matched_total" for all the rows matched, despite the filter

				            cql_stats.filtered_rows_matched_total += size;

				        }

				        // TODO: better threshold

				        if (size > 10) {

				        if (is_big(items)) {

				            return make_ready_future<executor::request_return_type>(make_streamed(std::move(items)));

				        }

				        return make_ready_future<executor::request_return_type>(make_jsonable(std::move(items)));

									
										11

alternator/executor.hh
									
												
				@@ -239,4 +239,15 @@ public:

				    static void supplement_table_stream_info(rjson::value& descr, const schema& schema, service::storage_proxy& sp);

				};

				// is_big() checks approximately if the given JSON value is "bigger" than

				// the given big_size number of bytes. The goal is to *quickly* detect

				// oversized JSON that, for example, is too large to be serialized to a

				// contiguous string - we don't need an accurate size for that. Moreover,

				// as soon as we detect that the JSON is indeed "big", we can return true

				// and don't need to continue calculating its exact size.

				// For simplicity, we use a recursive implementation. This is fine because

				// Alternator limits the depth of JSONs it reads from inputs, and doesn't

				// add more than a couple of levels in its own output construction.

				bool is_big(const rjson::value& val, int big_size = 100'000);

				}
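
Note on the `is_big()` approach described in the comment above: the bodies of `check_big_object()` and `check_big_array()` are not shown in this diff, so the following is only a sketch of the general technique the comment describes, namely charging an approximate size against a budget and stopping as soon as the budget is exhausted. It uses a toy tree type instead of `rjson::value`; `node`, `charge` and `is_big_sketch` are hypothetical names, and the one-byte overhead per node is an arbitrary assumption.

```cpp
#include <string>
#include <vector>

// Toy JSON-like tree: a node is a leaf string, or a container of children.
struct node {
    std::string name;            // member name when the node sits inside an object
    std::string text;            // string payload for leaves
    std::vector<node> children;  // array elements or object members
};

// Subtract an approximate size for `n` from `size_left`.
// Recursion stops descending as soon as the budget goes negative.
static void charge(const node& n, long& size_left) {
    size_left -= static_cast<long>(n.name.size() + n.text.size()) + 1;
    for (const node& child : n.children) {
        if (size_left < 0) {
            return;  // already known to be "big"; no need for an exact size
        }
        charge(child, size_left);
    }
}

// Approximate check: is the tree larger than big_size bytes?
bool is_big_sketch(const node& root, long big_size = 100'000) {
    long size_left = big_size;
    charge(root, size_left);
    return size_left < 0;
}
```

The early return is what keeps the check cheap on oversized responses: the cost is bounded by the budget rather than by the size of the whole tree.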

									
										3

alternator/expressions.cc
									
												
				@@ -634,7 +634,8 @@ std::unordered_map<std::string_view, function_handler_type*> function_handlers {

				            }

				            rjson::value v1 = calculate_value(f._parameters[0], caller, previous_item);

				            rjson::value v2 = calculate_value(f._parameters[1], caller, previous_item);

				            return to_bool_json(check_CONTAINS(v1.IsNull() ? nullptr : &v1,  v2));

				            return to_bool_json(check_CONTAINS(v1.IsNull() ? nullptr : &v1,  v2,

				                                    f._parameters[0].is_constant(), f._parameters[1].is_constant()));

				        }

				    },

				};

									
										2

alternator/expressions_types.hh
									
												
				@@ -19,7 +19,7 @@

				/*

				 * Parsed representation of expressions and their components.

				 *

				 * Types in alternator::parse namespace are used for holding the parse

				 * Types in alternator::parsed namespace are used for holding the parse

				 * tree - objects generated by the Antlr rules after parsing an expression.

				 * Because of the way Antlr works, all these objects are default-constructed

				 * first, and then assigned when the rule is completed, so all these types

									
										32

alternator/serialization.cc
									
												
				@@ -14,7 +14,7 @@

				#include "rapidjson/writer.h"

				#include "concrete_types.hh"

				#include "cql3/type_json.hh"

				#include "position_in_partition.hh"

				#include "mutation/position_in_partition.hh"

				static logging::logger slogger("alternator-serialization");

				@@ -59,7 +59,9 @@ struct from_json_visitor {

				        bo.write(t.from_string(rjson::to_string_view(v)));

				    }

				    void operator()(const bytes_type_impl& t) const {

				        bo.write(rjson::base64_decode(v));

				        // FIXME: it's difficult at this point to get information if value was provided

				        // in request or comes from the storage, for now we assume it's user's fault.

				        bo.write(*unwrap_bytes(v, true));

				    }

				    void operator()(const boolean_type_impl& t) const {

				        bo.write(boolean_type->decompose(v.GetBool()));

				@@ -73,7 +75,7 @@ struct from_json_visitor {

				    }

				    // default

				    void operator()(const abstract_type& t) const {

				        bo.write(from_json_object(t, v, cql_serialization_format::internal()));

				        bo.write(from_json_object(t, v));

				    }

				};

				@@ -198,7 +200,9 @@ bytes get_key_from_typed_value(const rjson::value& key_typed_value, const column

				                format("The AttributeValue for a key attribute cannot contain an empty string value. Key: {}", column.name_as_text()));

				    }

				    if (column.type == bytes_type) {

				        return rjson::base64_decode(value);

				        // FIXME: it's difficult at this point to get information if value was provided

				        // in request or comes from the storage, for now we assume it's user's fault.

				        return *unwrap_bytes(value, true);

				    } else {

				        return column.type->from_string(value_view);

				    }

				@@ -210,7 +214,7 @@ rjson::value json_key_column_value(bytes_view cell, const column_definition& col

				        std::string b64 = base64_encode(cell);

				        return rjson::from_string(b64);

				    } if (column.type == utf8_type) {

				        return rjson::from_string(std::string(reinterpret_cast<const char*>(cell.data()), cell.size()));

				        return rjson::from_string(reinterpret_cast<const char*>(cell.data()), cell.size());

				    } else if (column.type == decimal_type) {

				        // FIXME: use specialized Alternator number type, not the more

				        // general "decimal_type". A dedicated type can be more efficient

				@@ -261,7 +265,6 @@ position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema)

				    if (bool(region_item) != bool(weight_item)) {

				        throw api_error::validation("Malformed value object: region and weight has to be either both missing or both present");

				    }

				    partition_region region;

				    bound_weight weight;

				    if (region_item) {

				        auto region_view = rjson::to_string_view(get_typed_value(*region_item, "S", scylla_paging_region, "key region"));

				@@ -279,7 +282,7 @@ position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema)

				        return position_in_partition(region, weight, region == partition_region::clustered ? std::optional(std::move(ck)) : std::nullopt);

				    }

				    if (ck.is_empty()) {

				        return position_in_partition(position_in_partition::partition_start_tag_t());

				        return position_in_partition::for_partition_start();

				    }

				    return position_in_partition::for_key(std::move(ck));

				}

				@@ -319,6 +322,17 @@ std::optional<big_decimal> try_unwrap_number(const rjson::value& v) {

				    }

				}

				std::optional<bytes> unwrap_bytes(const rjson::value& value, bool from_query) {

				    try {

				        return rjson::base64_decode(value);

				    } catch (...) {

				        if (from_query) {

				            throw api_error::serialization(format("Invalid base64 data"));

				        }

				        return std::nullopt;

				    }

				}

				const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value& v) {

				    if (!v.IsObject() || v.MemberCount() != 1) {

				        return {"", nullptr};

				@@ -348,7 +362,7 @@ rjson::value number_add(const rjson::value& v1, const rjson::value& v2) {

				    auto n1 = unwrap_number(v1, "UpdateExpression");

				    auto n2 = unwrap_number(v2, "UpdateExpression");

				    rjson::value ret = rjson::empty_object();

				    std::string str_ret = std::string((n1 + n2).to_string());

				    sstring str_ret = (n1 + n2).to_string();

				    rjson::add(ret, "N", rjson::from_string(str_ret));

				    return ret;

				}

				@@ -357,7 +371,7 @@ rjson::value number_subtract(const rjson::value& v1, const rjson::value& v2) {

				    auto n1 = unwrap_number(v1, "UpdateExpression");

				    auto n2 = unwrap_number(v2, "UpdateExpression");

				    rjson::value ret = rjson::empty_object();

				    std::string str_ret = std::string((n1 - n2).to_string());

				    sstring str_ret = (n1 - n2).to_string();

				    rjson::add(ret, "N", rjson::from_string(str_ret));

				    return ret;

				}

									
										9

alternator/serialization.hh
									
												
				@@ -11,8 +11,8 @@

				#include <string>

				#include <string_view>

				#include <optional>

				#include "types.hh"

				#include "schema_fwd.hh"

				#include "types/types.hh"

				#include "schema/schema_fwd.hh"

				#include "keys.hh"

				#include "utils/rjson.hh"

				#include "utils/big_decimal.hh"

				@@ -62,6 +62,11 @@ big_decimal unwrap_number(const rjson::value& v, std::string_view diagnostic);

				// when the given v does not encode a number.

				std::optional<big_decimal> try_unwrap_number(const rjson::value& v);

				// unwrap_bytes decodes byte value, on decoding failure it either raises api_error::serialization

				// iff from_query is true or returns unset optional iff from_query is false.

				// Therefore it's safe to dereference returned optional when called with from_query equal true.

				std::optional<bytes> unwrap_bytes(const rjson::value& value, bool from_query);

				// Check if a given JSON object encodes a set (i.e., it is a {"SS": [...]}, or "NS", "BS"

				// and returns set's type and a pointer to that set. If the object does not encode a set,

				// returned value is {"", nullptr}
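Note on `unwrap_bytes`: the contract described in the comment above (throw when the value came from a request, return an unset optional when it came from storage) can be illustrated outside the Scylla tree. The following is a minimal sketch, not the Alternator code: `decode_hex`, `serialization_error` and `unwrap_bytes_sketch` are hypothetical names, and hex decoding stands in for `rjson::base64_decode` only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-in for api_error::serialization.
struct serialization_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for rjson::base64_decode: decodes a hex string
// and throws std::invalid_argument on malformed input.
std::string decode_hex(std::string_view in) {
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw std::invalid_argument("not a hex digit");
    };
    if (in.size() % 2 != 0) {
        throw std::invalid_argument("odd length");
    }
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 2) {
        out.push_back(static_cast<char>(nibble(in[i]) * 16 + nibble(in[i + 1])));
    }
    return out;
}

// Same shape as the unwrap_bytes in the diff: on a decoding failure,
// throw if the value came from the query, otherwise return nullopt.
std::optional<std::string> unwrap_bytes_sketch(std::string_view value, bool from_query) {
    try {
        return decode_hex(value);
    } catch (...) {
        if (from_query) {
            throw serialization_error("Invalid encoded data");
        }
        return std::nullopt;
    }
}
```

With `from_query` set to true the function either returns an engaged optional or throws, which is why the call sites in `serialization.cc` can dereference the result directly.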

									
										14

alternator/server.cc
									
												
				@@ -24,10 +24,13 @@

				#include "gms/gossiper.hh"

				#include "utils/overloaded_functor.hh"

				#include "utils/fb_utilities.hh"

				#include "utils/aws_sigv4.hh"

				static logging::logger slogger("alternator-server");

				using namespace httpd;

				using request = http::request;

				using reply = http::reply;

				namespace alternator {

				@@ -143,7 +146,7 @@ public:

				            std::unique_ptr<request> req, std::unique_ptr<reply> rep) override {

				        handle_CORS(*req, *rep, false);

				        return _f_handle(std::move(req), std::move(rep)).then(

				                [this](std::unique_ptr<reply> rep) {

				                [](std::unique_ptr<reply> rep) {

				                    rep->set_mime_type("application/x-amz-json-1.0");

				                    rep->done();

				                    return make_ready_future<std::unique_ptr<reply>>(std::move(rep));

				@@ -317,8 +320,13 @@ future<std::string> server::verify_signature(const request& req, const chunked_c

				                                                    region = std::move(region),

				                                                    service = std::move(service),

				                                                    user_signature = std::move(user_signature)] (key_cache::value_ptr key_ptr) {

				        std::string signature = get_signature(user, *key_ptr, std::string_view(host), req._method,

				                datestamp, signed_headers_str, signed_headers_map, content, region, service, "");

				        std::string signature;

				        try {

				            signature = utils::aws::get_signature(user, *key_ptr, std::string_view(host), "/", req._method,

				                datestamp, signed_headers_str, signed_headers_map, &content, region, service, "");

				        } catch (const std::exception& e) {

				            throw api_error::invalid_signature(e.what());

				        }

				        if (signature != std::string_view(user_signature)) {

				            _key_cache.remove(user);

									
										10

alternator/server.hh
									
												
				@@ -27,11 +27,11 @@ using chunked_content = rjson::chunked_content;

				class server {

				    static constexpr size_t content_length_limit = 16*MB;

				    using alternator_callback = std::function<future<executor::request_return_type>(executor&, executor::client_state&,

				            tracing::trace_state_ptr, service_permit, rjson::value, std::unique_ptr<request>)>;

				            tracing::trace_state_ptr, service_permit, rjson::value, std::unique_ptr<http::request>)>;

				    using alternator_callbacks_map = std::unordered_map<std::string_view, alternator_callback>;

				    http_server _http_server;

				    http_server _https_server;

				    httpd::http_server _http_server;

				    httpd::http_server _https_server;

				    executor& _executor;

				    service::storage_proxy& _proxy;

				    gms::gossiper& _gossiper;

				@@ -76,8 +76,8 @@ public:

				private:

				    void set_routes(seastar::httpd::routes& r);

				    // If verification succeeds, returns the authenticated user's username

				    future<std::string> verify_signature(const seastar::httpd::request&, const chunked_content&);

				    future<executor::request_return_type> handle_api_request(std::unique_ptr<request> req);

				    future<std::string> verify_signature(const seastar::http::request&, const chunked_content&);

				    future<executor::request_return_type> handle_api_request(std::unique_ptr<http::request> req);

				};

				}

									
										60

alternator/streams.cc
									
												
				@@ -27,13 +27,14 @@

				#include "cql3/result_set.hh"

				#include "cql3/type_json.hh"

				#include "cql3/column_identifier.hh"

				#include "schema_builder.hh"

				#include "schema/schema_builder.hh"

				#include "service/storage_proxy.hh"

				#include "gms/feature.hh"

				#include "gms/feature_service.hh"

				#include "executor.hh"

				#include "rmw_operation.hh"

				#include "data_dictionary/data_dictionary.hh"

				/**

				 * Base template type to implement  rapidjson::internal::TypeHelper<...>:s

				@@ -140,24 +141,43 @@ namespace alternator {

				future<alternator::executor::request_return_type> alternator::executor::list_streams(client_state& client_state, service_permit permit, rjson::value request) {

				    _stats.api_operations.list_streams++;

				    auto limit = rjson::get_opt<int>(request, "Limit").value_or(std::numeric_limits<int>::max());

				    auto limit = rjson::get_opt<int>(request, "Limit").value_or(100);

				    auto streams_start = rjson::get_opt<stream_arn>(request, "ExclusiveStartStreamArn");

				    auto table = find_table(_proxy, request);

				    auto db = _proxy.data_dictionary();

				    auto cfs = db.get_tables();

				    auto i = cfs.begin();

				    auto e = cfs.end();

				    if (limit < 1) {

				        throw api_error::validation("Limit must be 1 or more");

				    }

				    // TODO: the unordered_map here is not really well suited for partial

				    // querying - we're sorting on local hash order, and creating a table

				    // between queries may or may not miss info. But that should be rare,

				    // and we can probably expect this to be a single call.

				    std::vector<data_dictionary::table> cfs;

				    if (table) {

				        auto log_name = cdc::log_name(table->cf_name());

				        try {

				            cfs.emplace_back(db.find_table(table->ks_name(), log_name));

				        } catch (data_dictionary::no_such_column_family&) {

				            cfs.clear();

				        }

				    } else {

				        cfs = db.get_tables();

				    }

				    // # 12601 (maybe?) - sort the set of tables on ID. This should ensure we never

				    // generate duplicates in a paged listing here. Can obviously miss things if they 

				    // are added between paged calls and end up with a "smaller" UUID/ARN, but that 

				    // is to be expected.

				    if (std::cmp_less(limit, cfs.size()) || streams_start) {

				        std::sort(cfs.begin(), cfs.end(), [](const data_dictionary::table& t1, const data_dictionary::table& t2) {

				            return t1.schema()->id().uuid() < t2.schema()->id().uuid();

				        });

				    }

				    auto i = cfs.begin();

				    auto e = cfs.end();

				    if (streams_start) {

				        i = std::find_if(i, e, [&](data_dictionary::table t) {

				        i = std::find_if(i, e, [&](const data_dictionary::table& t) {

				            return t.schema()->id().uuid() == streams_start

				                && cdc::get_base_table(db.real_database(), *t.schema())

				                && is_alternator_keyspace(t.schema()->ks_name())

				@@ -181,14 +201,7 @@ future<alternator::executor::request_return_type> alternator::executor::list_str

				        if (!is_alternator_keyspace(ks_name)) {

				            continue;

				        }

				        if (table && ks_name != table->ks_name()) {

				            continue;

				        }

				        if (cdc::is_log_for_some_table(db.real_database(), ks_name, cf_name)) {

				            if (table && table != cdc::get_base_table(db.real_database(), *s)) {

				                continue;

				            }

				            rjson::value new_entry = rjson::empty_object();

				            last = i->schema()->id();

				@@ -416,6 +429,8 @@ static std::chrono::seconds confidence_interval(data_dictionary::database db) {

				    return std::chrono::seconds(db.get_config().alternator_streams_time_window_s());

				}

				using namespace std::chrono_literals;

				// Dynamo docs says no data shall live longer than 24h.

				static constexpr auto dynamodb_streams_max_window = 24h;

				@@ -493,7 +508,7 @@ future<executor::request_return_type> executor::describe_stream(client_state& cl

				    // filter out cdc generations older than the table or now() - cdc::ttl (typically dynamodb_streams_max_window - 24h)

				    auto low_ts = std::max(as_timepoint(schema->id()), db_clock::now() - ttl);

				    return _sdks.cdc_get_versioned_streams(low_ts, { normal_token_owners }).then([this, db, shard_start, limit, ret = std::move(ret), stream_desc = std::move(stream_desc)] (std::map<db_clock::time_point, cdc::streams_version> topologies) mutable {

				    return _sdks.cdc_get_versioned_streams(low_ts, { normal_token_owners }).then([db, shard_start, limit, ret = std::move(ret), stream_desc = std::move(stream_desc)] (std::map<db_clock::time_point, cdc::streams_version> topologies) mutable {

				        auto e = topologies.end();

				        auto prev = e;

				@@ -812,7 +827,7 @@ future<executor::request_return_type> executor::get_records(client_state& client

				    }

				    if (!schema || !base || !is_alternator_keyspace(schema->ks_name())) {

				        throw api_error::resource_not_found(boost::lexical_cast<std::string>(iter.table));

				        throw api_error::resource_not_found(fmt::to_string(iter.table));

				    }

				    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());

				@@ -883,7 +898,7 @@ future<executor::request_return_type> executor::get_records(client_state& client

				    return _proxy.query(schema, std::move(command), std::move(partition_ranges), cl, service::storage_proxy::coordinator_query_options(default_timeout(), std::move(permit), client_state)).then(

				            [this, schema, partition_slice = std::move(partition_slice), selection = std::move(selection), start_time = std::move(start_time), limit, key_names = std::move(key_names), attr_names = std::move(attr_names), type, iter, high_ts] (service::storage_proxy::coordinator_query_result qr) mutable {       

				        cql3::selection::result_set_builder builder(*selection, gc_clock::now(), cql_serialization_format::latest());

				        cql3::selection::result_set_builder builder(*selection, gc_clock::now());

				        query::result_view::consume(*qr.query_result, partition_slice, cql3::selection::result_set_builder::visitor(builder, *schema, *selection));

				        auto result_set = builder.build();

				@@ -1012,7 +1027,7 @@ future<executor::request_return_type> executor::get_records(client_state& client

				        // ugh. figure out if we are and end-of-shard

				        auto normal_token_owners = _proxy.get_token_metadata_ptr()->count_normal_token_owners();

				        return _sdks.cdc_current_generation_timestamp({ normal_token_owners }).then([this, iter, high_ts, start_time, ret = std::move(ret), nrecords](db_clock::time_point ts) mutable {

				        return _sdks.cdc_current_generation_timestamp({ normal_token_owners }).then([this, iter, high_ts, start_time, ret = std::move(ret)](db_clock::time_point ts) mutable {

				            auto& shard = iter.shard;            

				            if (shard.time < ts && ts < high_ts) {

				@@ -1029,8 +1044,7 @@ future<executor::request_return_type> executor::get_records(client_state& client

				                rjson::add(ret, "NextShardIterator", iter);

				            }

				            _stats.api_operations.get_records_latency.add(std::chrono::steady_clock::now() - start_time);

				            // TODO: determine a better threshold...

				            if (nrecords > 10) {

				            if (is_big(ret)) {

				                return make_ready_future<executor::request_return_type>(make_streamed(std::move(ret)));

				            }

				            return make_ready_future<executor::request_return_type>(make_jsonable(std::move(ret)));
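Note on the `list_streams` change in this file: sorting the tables by ID before paging means that every page sees the same order, so a page that resumes after `ExclusiveStartStreamArn` cannot repeat an entry. A minimal sketch of that idea follows, with plain integers standing in for table UUIDs. The names `page` and `list_page` are hypothetical, and the sketch resumes with `std::upper_bound` rather than the exact-match `std::find_if` used in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// One page of a listing. last_evaluated is set only when more entries remain.
struct page {
    std::vector<int> ids;
    std::optional<int> last_evaluated;
};

// Returns up to `limit` ids that sort strictly after `exclusive_start`.
// Sorting first gives every call the same order, so pages never overlap.
page list_page(std::vector<int> ids, std::size_t limit, std::optional<int> exclusive_start) {
    std::sort(ids.begin(), ids.end());
    auto it = ids.begin();
    if (exclusive_start) {
        it = std::upper_bound(ids.begin(), ids.end(), *exclusive_start);
    }
    page out;
    for (; it != ids.end() && out.ids.size() < limit; ++it) {
        out.ids.push_back(*it);
    }
    if (it != ids.end() && !out.ids.empty()) {
        out.last_evaluated = out.ids.back();
    }
    return out;
}
```

As the comment in the diff says, an entry created between two calls with a smaller ID than the resume point is still missed; ordering only rules out duplicates.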

									
										74

alternator/ttl.cc
									
												
				@@ -8,6 +8,7 @@

				#include <chrono>

				#include <cstdint>

				#include <exception>

				#include <optional>

				#include <seastar/core/sstring.hh>

				#include <seastar/core/coroutine.hh>

				@@ -17,6 +18,7 @@

				#include <seastar/coroutine/maybe_yield.hh>

				#include <boost/multiprecision/cpp_int.hpp>

				#include "exceptions/exceptions.hh"

				#include "gms/gossiper.hh"

				#include "gms/inet_address.hh"

				#include "inet_address_vectors.hh"

				@@ -31,8 +33,8 @@

				#include "service/pager/query_pagers.hh"

				#include "gms/feature_service.hh"

				#include "sstables/types.hh"

				#include "mutation.hh"

				#include "types.hh"

				#include "mutation/mutation.hh"

				#include "types/types.hh"

				#include "types/map.hh"

				#include "utils/rjson.hh"

				#include "utils/big_decimal.hh"

				@@ -92,24 +94,25 @@ future<executor::request_return_type> executor::update_time_to_live(client_state

				    }

				    sstring attribute_name(v->GetString(), v->GetStringLength());

				    std::map<sstring, sstring> tags_map = get_tags_of_table_or_throw(schema);

				    if (enabled) {

				        if (tags_map.contains(TTL_TAG_KEY)) {

				            co_return api_error::validation("TTL is already enabled");

				    co_await db::modify_tags(_mm, schema->ks_name(), schema->cf_name(), [&](std::map<sstring, sstring>& tags_map) {

				        if (enabled) {

				            if (tags_map.contains(TTL_TAG_KEY)) {

				                throw api_error::validation("TTL is already enabled");

				            }

				            tags_map[TTL_TAG_KEY] = attribute_name;

				        } else {

				            auto i = tags_map.find(TTL_TAG_KEY);

				            if (i == tags_map.end()) {

				                throw api_error::validation("TTL is already disabled");

				            } else if (i->second != attribute_name) {

				                throw api_error::validation(format(

				                    "Requested to disable TTL on attribute {}, but a different attribute {} is enabled.",

				                    attribute_name, i->second));

				            }

				            tags_map.erase(TTL_TAG_KEY);

				        }

				        tags_map[TTL_TAG_KEY] = attribute_name;

				    } else {

				        auto i = tags_map.find(TTL_TAG_KEY);

				        if (i == tags_map.end()) {

				            co_return api_error::validation("TTL is already disabled");

				        } else if (i->second != attribute_name) {

				            co_return api_error::validation(format(

				                "Requested to disable TTL on attribute {}, but a different attribute {} is enabled.",

				                attribute_name, i->second));

				        }

				        tags_map.erase(TTL_TAG_KEY);

				    }

				    co_await db::update_tags(_mm, schema, std::move(tags_map));

				    });

				    // Prepare the response, which contains a TimeToLiveSpecification

				    // basically identical to the request's

				    rjson::value response = rjson::empty_object();

				@@ -548,13 +551,34 @@ static future<> scan_table_ranges(

				            co_return;

				        }

				        auto units = co_await get_units(page_sem, 1);

				        // We don't to limit page size in number of rows because there is a

				        // builtin limit of the page's size in bytes. Setting this limit to 1

				        // is useful for debugging the paging code with moderate-size data.

				        // We don't need to limit page size in number of rows because there is

				        // a builtin limit of the page's size in bytes. Setting this limit to

				        // 1 is useful for debugging the paging code with moderate-size data.

				        uint32_t limit = std::numeric_limits<uint32_t>::max();

				        // FIXME: which timeout?

				        // FIXME: if read times out, need to retry it.

				        std::unique_ptr<cql3::result_set> rs = co_await p->fetch_page(limit, gc_clock::now(), executor::default_timeout());

				        // Read a page, and if that times out, try again after a small sleep.

				        // If we didn't catch the timeout exception, it would cause the scan

				        // be aborted and only be restarted at the next scanning period.

				        // If we retry too many times, give up and restart the scan later.

				        std::unique_ptr<cql3::result_set> rs;

				        for (int retries=0; ; retries++) {

				            try {

				                // FIXME: which timeout?

				                rs = co_await p->fetch_page(limit, gc_clock::now(), executor::default_timeout());

				                break;

				            } catch(exceptions::read_timeout_exception&) {

				                tlogger.warn("expiration scanner read timed out, will retry: {}",

				                    std::current_exception());

				            }

				            // If we didn't break out of this loop, add a minimal sleep

				            if (retries >= 10) {

				                // Don't get stuck forever asking the same page, maybe there's

				                // a bug or a real problem in several replicas. Give up on

				                // this scan an retry the scan from a random position later,

				                // in the next scan period.

				                throw runtime_exception("scanner thread failed after too many timeouts for the same page");

				            }

				            co_await sleep_abortable(std::chrono::seconds(1), abort_source);

				        }

				        auto rows = rs->rows();

				        auto meta = rs->get_metadata().get_names();

				        std::optional<unsigned> expiration_column;
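Note on the retry loop added to `scan_table_ranges`: the control flow (try, catch only the timeout, give up once the retry counter reaches a cap, otherwise pause and try again) can be shown without Seastar coroutines. This is a synchronous sketch under stated assumptions: `read_timeout` and `fetch_page_with_retries` are hypothetical names, and the `pause` callback stands in for `sleep_abortable`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for exceptions::read_timeout_exception.
struct read_timeout : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Calls fetch() until it succeeds. Only read_timeout is retried; any other
// exception propagates at once. Once `retries` reaches max_retries the
// function gives up, so fetch() runs at most max_retries + 1 times.
std::string fetch_page_with_retries(const std::function<std::string()>& fetch,
                                    int max_retries,
                                    const std::function<void()>& pause) {
    for (int retries = 0; ; retries++) {
        try {
            return fetch();
        } catch (const read_timeout&) {
            // Timed out: fall through to the retry bookkeeping below.
        }
        if (retries >= max_retries) {
            throw std::runtime_error("failed after too many timeouts for the same page");
        }
        pause();
    }
}
```

The cap is checked before the pause, matching the order in the diff, so the final failed attempt does not wait before giving up.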

									
										15

amplify.yml
									
										Normal file
									
												
				@@ -0,0 +1,15 @@

				version: 1

				applications:

				  - frontend:

				      phases:

				        build:

				          commands:

				            - make setupenv

				            - make dirhtml

				      artifacts:

				        baseDirectory: _build/dirhtml

				        files:

				          - '**/*'

				      cache:

				        paths: []

				    appRoot: docs

									
										70

api/CMakeLists.txt
									
										Normal file
									
												
				@@ -0,0 +1,70 @@

				# Generate C++ sources from Swagger definitions

				set(swagger_files

				  api-doc/authorization_cache.json

				  api-doc/cache_service.json

				  api-doc/collectd.json

				  api-doc/column_family.json

				  api-doc/commitlog.json

				  api-doc/compaction_manager.json

				  api-doc/config.json

				  api-doc/endpoint_snitch_info.json

				  api-doc/error_injection.json

				  api-doc/failure_detector.json

				  api-doc/gossiper.json

				  api-doc/hinted_handoff.json

				  api-doc/lsa.json

				  api-doc/messaging_service.json

				  api-doc/storage_proxy.json

				  api-doc/storage_service.json

				  api-doc/stream_manager.json

				  api-doc/system.json

				  api-doc/task_manager.json

				  api-doc/task_manager_test.json

				  api-doc/utils.json)

				foreach(f ${swagger_files})

				  get_filename_component(fname "${f}" NAME_WE)

				  get_filename_component(dir "${f}" DIRECTORY)

				  seastar_generate_swagger(

				    TARGET scylla_swagger_gen_${fname}

				    VAR scylla_swagger_gen_${fname}_files

				    IN_FILE "${CMAKE_CURRENT_SOURCE_DIR}/${f}"

				    OUT_DIR "${scylla_gen_build_dir}/api/${dir}")

				  list(APPEND swagger_gen_files "${scylla_swagger_gen_${fname}_files}")

				endforeach()

				add_library(api)

				target_sources(api

				  PRIVATE

				    api.cc

				    cache_service.cc

				    collectd.cc

				    column_family.cc

				    commitlog.cc

				    compaction_manager.cc

				    config.cc

				    endpoint_snitch.cc

				    error_injection.cc

				    authorization_cache.cc

				    failure_detector.cc

				    gossiper.cc

				    hinted_handoff.cc

				    lsa.cc

				    messaging_service.cc

				    storage_proxy.cc

				    storage_service.cc

				    stream_manager.cc

				    system.cc

				    task_manager.cc

				    task_manager_test.cc

				    ${swagger_gen_files})

				target_include_directories(api

				  PUBLIC

				    ${CMAKE_SOURCE_DIR}

				    ${scylla_gen_build_dir})

				target_link_libraries(api

				  idl

				  wasmtime_bindings

				  Seastar::seastar

				  xxHash::xxhash)

									
										4

api/api-doc/storage_service.json
									
												
				@@ -1228,7 +1228,7 @@

				         "operations":[

				            {

				               "method":"POST",

				               "summary":"Removes token (and all data associated with enpoint that had it) from the ring",

				               "summary":"Removes a node from the cluster. Replicated data that logically belonged to this node is redistributed among the remaining nodes.",

				               "type":"void",

				               "nickname":"remove_node",

				               "produces":[

				@@ -1245,7 +1245,7 @@

				                  },

				                  {

				                     "name":"ignore_nodes",

				                     "description":"List of dead nodes to ingore in removenode operation",

				                     "description":"Comma-separated list of dead nodes to ignore in removenode operation. Use the same method for all nodes to ignore: either Host IDs or ip addresses.",

				                     "required":false,

				                     "allowMultiple":false,

				                     "type":"string",

									
										86

api/api-doc/task_manager.json
									
												
				@@ -49,6 +49,14 @@

				                        "type":"string",

				                        "paramType":"path"

				                    },

				                    {

				                        "name":"internal",

				                        "description":"Boolean flag indicating whether internal tasks should be shown (false by default)",

				                        "required":false,

				                        "allowMultiple":false,

				                        "type":"boolean",

				                        "paramType":"query"

				                    },

				                    {

				                        "name":"keyspace",

				                        "description":"The keyspace to query about",

				@@ -140,6 +148,57 @@

				              ]

				           }

				        ]

				     },

				     {

				      "path":"/task_manager/task_status_recursive/{task_id}",

				      "operations":[

				         {

				            "method":"GET",

				            "summary":"Get statuses of the task and all its descendants",

				            "type":"array",

				            "items":{

				               "type":"task_status"

				            },

				            "nickname":"get_task_status_recursively",

				            "produces":[

				               "application/json"

				            ],

				            "parameters":[

				                {

				                    "name":"task_id",

				                    "description":"The uuid of a task to query about",

				                    "required":true,

				                    "allowMultiple":false,

				                    "type":"string",

				                    "paramType":"path"

				                }

				            ]

				         }

				      ]

				     },

				     {

				         "path":"/task_manager/ttl",

				         "operations":[

				            {

				               "method":"POST",

				               "summary":"Set ttl in seconds and get last value",

				               "type":"long",

				               "nickname":"get_and_update_ttl",

				               "produces":[

				                  "application/json"

				               ],

				               "parameters":[

				                  {

				                     "name":"ttl",

				                     "description":"The number of seconds for which the tasks will be kept in memory after it finishes",

				                     "required":true,

				                     "allowMultiple":false,

				                     "type":"long",

				                     "paramType":"query"

				                  }

				               ]

				            }

				         ]

				     }

				    ],

				    "models":{

				@@ -160,6 +219,26 @@

				                  "failed"

				                ],

				                "description":"The state of a task"

				             },

				             "type":{

				                "type":"string",

				                "description":"The description of the task"

				             },

				             "keyspace":{

				                "type":"string",

				                "description":"The keyspace the task is working on (if applicable)"

				             },

				             "table":{

				                "type":"string",

				                "description":"The table the task is working on (if applicable)"

				             },

				             "entity":{

				                "type":"string",

				                "description":"Task-specific entity description"

				             },

				             "sequence_number":{

				                "type":"long",

				                "description":"The running sequence number of the task"

				             }

				           }

				       },

				@@ -236,6 +315,13 @@

				            "progress_completed":{

				               "type":"double",

				               "description":"The number of units completed so far"

				            },

				            "children_ids":{

				               "type":"array",

				                "items":{

				                    "type":"string"

				                },

				               "description":"Task IDs of children of this task"

				            }

				          }

				       }

									
										34

api/api-doc/task_manager_test.json
									
												
				@@ -86,14 +86,6 @@

				                        "type":"string",

				                        "paramType":"query"

				                    },

				                    {

				                        "name":"type",

				                        "description":"The type of the task",

				                        "required":false,

				                        "allowMultiple":false,

				                        "type":"string",

				                        "paramType":"query"

				                    },

				                    {

				                        "name":"entity",

				                        "description":"Task-specific entity description",

				@@ -156,30 +148,6 @@

				                ]

				             }

				          ]

				       },

				       {

				         "path":"/task_manager_test/ttl",

				         "operations":[

				            {

				               "method":"POST",

				               "summary":"Set ttl in seconds and get last value",

				               "type":"long",

				               "nickname":"get_and_update_ttl",

				               "produces":[

				                  "application/json"

				               ],

				               "parameters":[

				                  {

				                     "name":"ttl",

				                     "description":"The number of seconds for which the tasks will be kept in memory after it finishes",

				                     "required":true,

				                     "allowMultiple":false,

				                     "type":"long",

				                     "paramType":"query"

				                  }

				               ]

				            }

				         ]

				      }

				       }

				    ]

				 }

									
										27

api/api.cc
									
				@@ -35,6 +35,7 @@

				logging::logger apilog("api");

				namespace api {

				using namespace seastar::httpd;

				static std::unique_ptr<reply> exception_reply(std::exception_ptr eptr) {

				    try {

				@@ -165,9 +166,15 @@ future<> set_server_gossip(http_context& ctx, sharded<gms::gossiper>& g) {

				                });

				}

				future<> set_server_load_sstable(http_context& ctx) {

				future<> set_server_load_sstable(http_context& ctx, sharded<db::system_keyspace>& sys_ks) {

				    return register_api(ctx, "column_family",

				                "The column family API", set_column_family);

				                "The column family API", [&sys_ks] (http_context& ctx, routes& r) {

				                    set_column_family(ctx, r, sys_ks);

				                });

				}

				future<> unset_server_load_sstable(http_context& ctx) {

				    return ctx.http_server.set_routes([&ctx] (routes& r) { unset_column_family(ctx, r); });

				}

				future<> set_server_messaging_service(http_context& ctx, sharded<netw::messaging_service>& ms) {

				@@ -187,6 +194,10 @@ future<> set_server_storage_proxy(http_context& ctx, sharded<service::storage_se

				                });

				}

				future<> unset_server_storage_proxy(http_context& ctx) {

				    return ctx.http_server.set_routes([&ctx] (routes& r) { unset_storage_proxy(ctx, r); });

				}

				future<> set_server_stream_manager(http_context& ctx, sharded<streaming::stream_manager>& sm) {

				    return register_api(ctx, "stream_manager",

				                "The stream manager API", [&sm] (http_context& ctx, routes& r) {

				@@ -253,25 +264,25 @@ future<> set_server_done(http_context& ctx) {

				    });

				}

				future<> set_server_task_manager(http_context& ctx) {

				future<> set_server_task_manager(http_context& ctx, lw_shared_ptr<db::config> cfg) {

				    auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);

				    return ctx.http_server.set_routes([rb, &ctx](routes& r) {

				    return ctx.http_server.set_routes([rb, &ctx, &cfg = *cfg](routes& r) {

				        rb->register_function(r, "task_manager",

				                "The task manager API");

				        set_task_manager(ctx, r);

				        set_task_manager(ctx, r, cfg);

				    });

				}

				#ifndef SCYLLA_BUILD_MODE_RELEASE

				future<> set_server_task_manager_test(http_context& ctx, lw_shared_ptr<db::config> cfg) {

				future<> set_server_task_manager_test(http_context& ctx) {

				    auto rb = std::make_shared < api_registry_builder > (ctx.api_doc);

				    return ctx.http_server.set_routes([rb, &ctx, &cfg = *cfg](routes& r) mutable {

				    return ctx.http_server.set_routes([rb, &ctx](routes& r) mutable {

				        rb->register_function(r, "task_manager_test",

				                "The task manager test API");

				        set_task_manager_test(ctx, r, cfg);

				        set_task_manager_test(ctx, r);

				    });

				}

									
										14

api/api.hh
									
				@@ -27,7 +27,7 @@ template<class T>

				std::vector<sstring> container_to_vec(const T& container) {

				    std::vector<sstring> res;

				    for (auto i : container) {

				        res.push_back(boost::lexical_cast<std::string>(i));

				        res.push_back(fmt::to_string(i));

				    }

				    return res;

				}

				@@ -47,8 +47,8 @@ template<class T, class MAP>

				std::vector<T>& map_to_key_value(const MAP& map, std::vector<T>& res) {

				    for (auto i : map) {

				        T val;

				        val.key = boost::lexical_cast<std::string>(i.first);

				        val.value = boost::lexical_cast<std::string>(i.second);

				        val.key = fmt::to_string(i.first);

				        val.value = fmt::to_string(i.second);

				        res.push_back(val);

				    }

				    return res;

				@@ -65,7 +65,7 @@ template <typename MAP>

				std::vector<sstring> map_keys(const MAP& map) {

				    std::vector<sstring> res;

				    for (const auto& i : map) {

				        res.push_back(boost::lexical_cast<std::string>(i.first));

				        res.push_back(fmt::to_string(i.first));

				    }

				    return res;

				}

				@@ -189,7 +189,7 @@ struct basic_ratio_holder : public json::jsonable {

				typedef basic_ratio_holder<double>  ratio_holder;

				typedef basic_ratio_holder<int64_t> integral_ratio_holder;

				class unimplemented_exception : public base_exception {

				class unimplemented_exception : public httpd::base_exception {

				public:

				    unimplemented_exception()

				            : base_exception("API call is not supported yet", reply::status_type::internal_server_error) {

				@@ -238,7 +238,7 @@ public:

				                value = T{boost::lexical_cast<Base>(param)};

				            }

				        } catch (boost::bad_lexical_cast&) {

				            throw bad_param_exception(format("{} ({}): type error - should be {}", name, param, boost::units::detail::demangle(typeid(Base).name())));

				            throw httpd::bad_param_exception(format("{} ({}): type error - should be {}", name, param, boost::units::detail::demangle(typeid(Base).name())));

				        }

				    }

				@@ -306,6 +306,6 @@ public:

				    }

				};

				utils_json::estimated_histogram time_to_json_histogram(const utils::time_estimated_histogram& val);

				httpd::utils_json::estimated_histogram time_to_json_histogram(const utils::time_estimated_histogram& val);

				}

									
										11

api/api_init.hh
									
				@@ -14,6 +14,9 @@

				#include "tasks/task_manager.hh"

				#include "seastarx.hh"

				using request = http::request;

				using reply = http::reply;

				namespace service {

				class load_meter;

				@@ -99,10 +102,12 @@ future<> unset_server_authorization_cache(http_context& ctx);

				future<> set_server_snapshot(http_context& ctx, sharded<db::snapshot_ctl>& snap_ctl);

				future<> unset_server_snapshot(http_context& ctx);

				future<> set_server_gossip(http_context& ctx, sharded<gms::gossiper>& g);

				future<> set_server_load_sstable(http_context& ctx);

				future<> set_server_load_sstable(http_context& ctx, sharded<db::system_keyspace>& sys_ks);

				future<> unset_server_load_sstable(http_context& ctx);

				future<> set_server_messaging_service(http_context& ctx, sharded<netw::messaging_service>& ms);

				future<> unset_server_messaging_service(http_context& ctx);

				future<> set_server_storage_proxy(http_context& ctx, sharded<service::storage_service>& ss);

				future<> unset_server_storage_proxy(http_context& ctx);

				future<> set_server_stream_manager(http_context& ctx, sharded<streaming::stream_manager>& sm);

				future<> unset_server_stream_manager(http_context& ctx);

				future<> set_hinted_handoff(http_context& ctx, sharded<gms::gossiper>& g);

				@@ -111,7 +116,7 @@ future<> set_server_gossip_settle(http_context& ctx, sharded<gms::gossiper>& g);

				future<> set_server_cache(http_context& ctx);

				future<> set_server_compaction_manager(http_context& ctx);

				future<> set_server_done(http_context& ctx);

				future<> set_server_task_manager(http_context& ctx);

				future<> set_server_task_manager_test(http_context& ctx, lw_shared_ptr<db::config> cfg);

				future<> set_server_task_manager(http_context& ctx, lw_shared_ptr<db::config> cfg);

				future<> set_server_task_manager_test(http_context& ctx);

				}

									
										3

api/authorization_cache.cc
									
				@@ -14,9 +14,10 @@

				namespace api {

				using namespace json;

				using namespace seastar::httpd;

				void set_authorization_cache(http_context& ctx, routes& r, sharded<auth::service> &auth_service) {

				    httpd::authorization_cache_json::authorization_cache_reset.set(r, [&auth_service] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    httpd::authorization_cache_json::authorization_cache_reset.set(r, [&auth_service] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        co_await auth_service.invoke_on_all([] (auth::service& auth) -> future<>  {

				            auth.reset_authorization_cache();

				            return make_ready_future<>();

									
										4

api/authorization_cache.hh
									
				@@ -12,7 +12,7 @@

				namespace api {

				void set_authorization_cache(http_context& ctx, routes& r, sharded<auth::service> &auth_service);

				void unset_authorization_cache(http_context& ctx, routes& r);

				void set_authorization_cache(http_context& ctx, httpd::routes& r, sharded<auth::service> &auth_service);

				void unset_authorization_cache(http_context& ctx, httpd::routes& r);

				}

									
										85

api/cache_service.cc
									
				@@ -12,127 +12,128 @@

				namespace api {

				using namespace json;

				using namespace seastar::httpd;

				namespace cs = httpd::cache_service_json;

				void set_cache_service(http_context& ctx, routes& r) {

				    cs::get_row_cache_save_period_in_seconds.set(r, [](std::unique_ptr<request> req) {

				    cs::get_row_cache_save_period_in_seconds.set(r, [](std::unique_ptr<http::request> req) {

				        // We never save the cache

				        // Origin uses 0 for never

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::set_row_cache_save_period_in_seconds.set(r, [](std::unique_ptr<request> req) {

				    cs::set_row_cache_save_period_in_seconds.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto period = req->get_query_param("period");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::get_key_cache_save_period_in_seconds.set(r, [](std::unique_ptr<request> req) {

				    cs::get_key_cache_save_period_in_seconds.set(r, [](std::unique_ptr<http::request> req) {

				        // We never save the cache

				        // Origin uses 0 for never

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::set_key_cache_save_period_in_seconds.set(r, [](std::unique_ptr<request> req) {

				    cs::set_key_cache_save_period_in_seconds.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto period = req->get_query_param("period");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::get_counter_cache_save_period_in_seconds.set(r, [](std::unique_ptr<request> req) {

				    cs::get_counter_cache_save_period_in_seconds.set(r, [](std::unique_ptr<http::request> req) {

				        // We never save the cache

				        // Origin uses 0 for never

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::set_counter_cache_save_period_in_seconds.set(r, [](std::unique_ptr<request> req) {

				    cs::set_counter_cache_save_period_in_seconds.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto ccspis = req->get_query_param("ccspis");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::get_row_cache_keys_to_save.set(r, [](std::unique_ptr<request> req) {

				    cs::get_row_cache_keys_to_save.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::set_row_cache_keys_to_save.set(r, [](std::unique_ptr<request> req) {

				    cs::set_row_cache_keys_to_save.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto rckts = req->get_query_param("rckts");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::get_key_cache_keys_to_save.set(r, [](std::unique_ptr<request> req) {

				    cs::get_key_cache_keys_to_save.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::set_key_cache_keys_to_save.set(r, [](std::unique_ptr<request> req) {

				    cs::set_key_cache_keys_to_save.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto kckts = req->get_query_param("kckts");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::get_counter_cache_keys_to_save.set(r, [](std::unique_ptr<request> req) {

				    cs::get_counter_cache_keys_to_save.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::set_counter_cache_keys_to_save.set(r, [](std::unique_ptr<request> req) {

				    cs::set_counter_cache_keys_to_save.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto cckts = req->get_query_param("cckts");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::invalidate_key_cache.set(r, [](std::unique_ptr<request> req) {

				    cs::invalidate_key_cache.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::invalidate_counter_cache.set(r, [](std::unique_ptr<request> req) {

				    cs::invalidate_counter_cache.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::set_row_cache_capacity_in_mb.set(r, [](std::unique_ptr<request> req) {

				    cs::set_row_cache_capacity_in_mb.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto capacity = req->get_query_param("capacity");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::set_key_cache_capacity_in_mb.set(r, [](std::unique_ptr<request> req) {

				    cs::set_key_cache_capacity_in_mb.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto period = req->get_query_param("period");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::set_counter_cache_capacity_in_mb.set(r, [](std::unique_ptr<request> req) {

				    cs::set_counter_cache_capacity_in_mb.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        auto capacity = req->get_query_param("capacity");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::save_caches.set(r, [](std::unique_ptr<request> req) {

				    cs::save_caches.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cs::get_key_capacity.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_key_capacity.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support keys cache,

				@@ -140,7 +141,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_key_hits.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_key_hits.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support keys cache,

				@@ -148,7 +149,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_key_requests.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_key_requests.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support keys cache,

				@@ -156,7 +157,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_key_hit_rate.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_key_hit_rate.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support keys cache,

				@@ -164,21 +165,21 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_key_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_key_hits_moving_avrage.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // See above

				        return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));

				    });

				    cs::get_key_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_key_requests_moving_avrage.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // See above

				        return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));

				    });

				    cs::get_key_size.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_key_size.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support keys cache,

				@@ -186,7 +187,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_key_entries.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_key_entries.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support keys cache,

				@@ -194,7 +195,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_row_capacity.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_capacity.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return ctx.db.map_reduce0([](replica::database& db) -> uint64_t {

				            return db.row_cache_tracker().region().occupancy().used_space();

				        }, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {

				@@ -202,26 +203,26 @@ void set_cache_service(http_context& ctx, routes& r) {

				        });

				    });

				    cs::get_row_hits.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_hits.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().hits.count();

				        }, std::plus<uint64_t>());

				    });

				    cs::get_row_requests.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_requests.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().hits.count() + cf.get_row_cache().stats().misses.count();

				        }, std::plus<uint64_t>());

				    });

				    cs::get_row_hit_rate.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_hit_rate.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, ratio_holder(), [](const replica::column_family& cf) {

				            return ratio_holder(cf.get_row_cache().stats().hits.count() + cf.get_row_cache().stats().misses.count(),

				                    cf.get_row_cache().stats().hits.count());

				        }, std::plus<ratio_holder>());

				    });

				    cs::get_row_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, utils::rate_moving_average(), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().hits.rate();

				        }, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {

				@@ -229,7 +230,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        });

				    });

				    cs::get_row_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, utils::rate_moving_average(), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().hits.rate() + cf.get_row_cache().stats().misses.rate();

				        }, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {

				@@ -237,7 +238,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        });

				    });

				    cs::get_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        // In origin row size is the weighted size.

				        // We currently do not support weights, so we use num entries instead

				        return ctx.db.map_reduce0([](replica::database& db) -> uint64_t {

				@@ -247,7 +248,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        });

				    });

				    cs::get_row_entries.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_row_entries.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return ctx.db.map_reduce0([](replica::database& db) -> uint64_t {

				            return db.row_cache_tracker().partitions();

				        }, uint64_t(0), std::plus<uint64_t>()).then([](const int64_t& res) {

				@@ -255,7 +256,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        });

				    });

				    cs::get_counter_capacity.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_counter_capacity.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support counter cache,

				@@ -263,7 +264,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_counter_hits.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_counter_hits.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support counter cache,

				@@ -271,7 +272,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_counter_requests.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_counter_requests.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support counter cache,

				@@ -279,7 +280,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_counter_hit_rate.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_counter_hit_rate.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support counter cache,

				@@ -287,21 +288,21 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_counter_hits_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_counter_hits_moving_avrage.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // See above

				        return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));

				    });

				    cs::get_counter_requests_moving_avrage.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cs::get_counter_requests_moving_avrage.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // See above

				        return make_ready_future<json::json_return_type>(meter_to_json(utils::rate_moving_average()));

				    });

				    cs::get_counter_size.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_counter_size.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support counter cache,

				@@ -309,7 +310,7 @@ void set_cache_service(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cs::get_counter_entries.set(r, [] (std::unique_ptr<request> req) {

				    cs::get_counter_entries.set(r, [] (std::unique_ptr<http::request> req) {

				        // TBD

				        // FIXME

				        // we don't support counter cache,

									
										2

api/cache_service.hh
									
				@@ -12,6 +12,6 @@

				namespace api {

				void set_cache_service(http_context& ctx, routes& r);

				void set_cache_service(http_context& ctx, httpd::routes& r);

				}

									
										2

api/collectd.cc
									
				@@ -52,7 +52,7 @@ static const char* str_to_regex(const sstring& v) {

				}

				void set_collectd(http_context& ctx, routes& r) {

				    cd::get_collectd.set(r, [&ctx](std::unique_ptr<request> req) {

				    cd::get_collectd.set(r, [](std::unique_ptr<request> req) {

				        auto id = ::make_shared<scollectd::type_instance_id>(req->param["pluginid"],

				                req->get_query_param("instance"), req->get_query_param("type"),

									
										2

api/collectd.hh
									
				@@ -12,6 +12,6 @@

				namespace api {

				void set_collectd(http_context& ctx, routes& r);

				void set_collectd(http_context& ctx, httpd::routes& r);

				}

									
										352

api/column_family.cc
									
				@@ -17,6 +17,7 @@

				#include "db/system_keyspace.hh"

				#include "db/data_listeners.hh"

				#include "storage_service.hh"

				#include "compaction/compaction_manager.hh"

				#include "unimplemented.hh"

				extern logging::logger apilog;

				@@ -24,7 +25,6 @@ extern logging::logger apilog;

				namespace api {

				using namespace httpd;

				using namespace std;

				using namespace json;

				namespace cf = httpd::column_family_json;

				@@ -56,7 +56,7 @@ const table_id& get_uuid(const sstring& name, const replica::database& db) {

				    return get_uuid(ks, cf, db);

				}

				future<> foreach_column_family(http_context& ctx, const sstring& name, function<void(replica::column_family&)> f) {

				future<> foreach_column_family(http_context& ctx, const sstring& name, std::function<void(replica::column_family&)> f) {

				    auto uuid = get_uuid(name, ctx.db.local());

				    return ctx.db.invoke_on_all([f, uuid](replica::database& db) {

				@@ -303,16 +303,16 @@ ratio_holder filter_recent_false_positive_as_ratio_holder(const sstables::shared

				    return ratio_holder(f + sst->filter_get_recent_true_positive(), f);

				}

				void set_column_family(http_context& ctx, routes& r) {

				void set_column_family(http_context& ctx, routes& r, sharded<db::system_keyspace>& sys_ks) {

				    cf::get_column_family_name.set(r, [&ctx] (const_req req){

				        vector<sstring> res;

				        std::vector<sstring> res;

				        for (auto i: ctx.db.local().get_column_families_mapping()) {

				            res.push_back(i.first.first + ":" + i.first.second);

				        }

				        return res;

				    });

				    cf::get_column_family.set(r, [&ctx] (std::unique_ptr<request> req){

				    cf::get_column_family.set(r, [&ctx] (std::unique_ptr<http::request> req){

				            std::list<cf::column_family_info> res;

				            for (auto i: ctx.db.local().get_column_families_mapping()) {

				                cf::column_family_info info;

				@@ -325,22 +325,22 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    cf::get_column_family_name_keyspace.set(r, [&ctx] (const_req req){

				        vector<sstring> res;

				        std::vector<sstring> res;

				        for (auto i = ctx.db.local().get_keyspaces().cbegin(); i!=  ctx.db.local().get_keyspaces().cend(); i++) {

				            res.push_back(i->first);

				        }

				        return res;

				    });

				    cf::get_memtable_columns_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_memtable_columns_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], uint64_t{0}, [](replica::column_family& cf) {

				            return cf.active_memtable().partition_count();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed(std::mem_fn(&replica::memtable::partition_count)), uint64_t(0));

				        }, std::plus<>());

				    });

				    cf::get_all_memtable_columns_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_memtable_columns_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t{0}, [](replica::column_family& cf) {

				            return cf.active_memtable().partition_count();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed(std::mem_fn(&replica::memtable::partition_count)), uint64_t(0));

				        }, std::plus<>());

				    });

				@@ -352,27 +352,35 @@ void set_column_family(http_context& ctx, routes& r) {

				        return 0;

				    });

				    cf::get_memtable_off_heap_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_memtable_off_heap_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](replica::column_family& cf) {

				            return cf.active_memtable().region().occupancy().total_space();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed([] (replica::memtable* active_memtable) {

				                return active_memtable->region().occupancy().total_space();

				            }), uint64_t(0));

				        }, std::plus<int64_t>());

				    });

				    cf::get_all_memtable_off_heap_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_memtable_off_heap_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, int64_t(0), [](replica::column_family& cf) {

				            return cf.active_memtable().region().occupancy().total_space();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed([] (replica::memtable* active_memtable) {

				                return active_memtable->region().occupancy().total_space();

				            }), uint64_t(0));

				        }, std::plus<int64_t>());

				    });

				    cf::get_memtable_live_data_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_memtable_live_data_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](replica::column_family& cf) {

				            return cf.active_memtable().region().occupancy().used_space();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed([] (replica::memtable* active_memtable) {

				                return active_memtable->region().occupancy().used_space();

				            }), uint64_t(0));

				        }, std::plus<int64_t>());

				    });

				    cf::get_all_memtable_live_data_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_memtable_live_data_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, int64_t(0), [](replica::column_family& cf) {

				            return cf.active_memtable().region().occupancy().used_space();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed([] (replica::memtable* active_memtable) {

				                return active_memtable->region().occupancy().used_space();

				            }), uint64_t(0));

				        }, std::plus<int64_t>());

				    });

				@@ -384,14 +392,14 @@ void set_column_family(http_context& ctx, routes& r) {

				        return 0;

				    });

				    cf::get_cf_all_memtables_off_heap_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_cf_all_memtables_off_heap_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        warn(unimplemented::cause::INDEXES);

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](replica::column_family& cf) {

				            return cf.occupancy().total_space();

				        }, std::plus<int64_t>());

				    });

				    cf::get_all_cf_all_memtables_off_heap_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_cf_all_memtables_off_heap_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        warn(unimplemented::cause::INDEXES);

				        return ctx.db.map_reduce0([](const replica::database& db){

				            return db.dirty_memory_region_group().real_memory_used();

				@@ -400,30 +408,32 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_cf_all_memtables_live_data_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_cf_all_memtables_live_data_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        warn(unimplemented::cause::INDEXES);

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](replica::column_family& cf) {

				            return cf.occupancy().used_space();

				        }, std::plus<int64_t>());

				    });

				    cf::get_all_cf_all_memtables_live_data_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_cf_all_memtables_live_data_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        warn(unimplemented::cause::INDEXES);

				        return map_reduce_cf(ctx, int64_t(0), [](replica::column_family& cf) {

				            return cf.active_memtable().region().occupancy().used_space();

				            return boost::accumulate(cf.active_memtables() | boost::adaptors::transformed([] (replica::memtable* active_memtable) {

				                return active_memtable->region().occupancy().used_space();

				            }), uint64_t(0));

				        }, std::plus<int64_t>());

				    });

				    cf::get_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats(ctx,req->param["name"] ,&replica::column_family_stats::memtable_switch_count);

				    });

				    cf::get_all_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_memtable_switch_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats(ctx, &replica::column_family_stats::memtable_switch_count);

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_estimated_row_size_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_estimated_row_size_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](replica::column_family& cf) {

				            utils::estimated_histogram res(0);

				            for (auto sstables = cf.get_sstables(); auto& i : *sstables) {

				@@ -435,7 +445,7 @@ void set_column_family(http_context& ctx, routes& r) {

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_estimated_row_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_estimated_row_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](replica::column_family& cf) {

				            uint64_t res = 0;

				            for (auto sstables = cf.get_sstables(); auto& i : *sstables) {

				@@ -446,7 +456,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        std::plus<uint64_t>());

				    });

				    cf::get_estimated_column_count_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_estimated_column_count_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](replica::column_family& cf) {

				            utils::estimated_histogram res(0);

				            for (auto sstables = cf.get_sstables(); auto& i : *sstables) {

				@@ -457,149 +467,149 @@ void set_column_family(http_context& ctx, routes& r) {

				        utils::estimated_histogram_merge, utils_json::estimated_histogram());

				    });

				    cf::get_all_compression_ratio.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_all_compression_ratio.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_pending_flushes.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats(ctx,req->param["name"] ,&replica::column_family_stats::pending_flushes);

				    });

				    cf::get_all_pending_flushes.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_pending_flushes.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats(ctx, &replica::column_family_stats::pending_flushes);

				    });

				    cf::get_read.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_read.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats_count(ctx,req->param["name"] ,&replica::column_family_stats::reads);

				    });

				    cf::get_all_read.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_read.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats_count(ctx, &replica::column_family_stats::reads);

				    });

				    cf::get_write.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_write.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats_count(ctx, req->param["name"] ,&replica::column_family_stats::writes);

				    });

				    cf::get_all_write.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_write.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats_count(ctx, &replica::column_family_stats::writes);

				    });

				    cf::get_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_histogram(ctx, req->param["name"], &replica::column_family_stats::reads);

				    });

				    cf::get_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_rate_and_histogram(ctx, req->param["name"], &replica::column_family_stats::reads);

				    });

				    cf::get_read_latency.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_read_latency.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats_sum(ctx,req->param["name"] ,&replica::column_family_stats::reads);

				    });

				    cf::get_write_latency.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_write_latency.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats_sum(ctx, req->param["name"] ,&replica::column_family_stats::writes);

				    });

				    cf::get_all_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_read_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_histogram(ctx, &replica::column_family_stats::writes);

				    });

				    cf::get_all_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_read_latency_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_rate_and_histogram(ctx, &replica::column_family_stats::writes);

				    });

				    cf::get_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_histogram(ctx, req->param["name"], &replica::column_family_stats::writes);

				    });

				    cf::get_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_rate_and_histogram(ctx, req->param["name"], &replica::column_family_stats::writes);

				    });

				    cf::get_all_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_write_latency_histogram_depricated.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_histogram(ctx, &replica::column_family_stats::writes);

				    });

				    cf::get_all_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_write_latency_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_rate_and_histogram(ctx, &replica::column_family_stats::writes);

				    });

				    cf::get_pending_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_pending_compactions.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), [](replica::column_family& cf) {

				            return cf.get_compaction_strategy().estimated_pending_compactions(cf.as_table_state());

				            return cf.estimate_pending_compactions();

				        }, std::plus<int64_t>());

				    });

				    cf::get_all_pending_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_pending_compactions.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, int64_t(0), [](replica::column_family& cf) {

				            return cf.get_compaction_strategy().estimated_pending_compactions(cf.as_table_state());

				            return cf.estimate_pending_compactions();

				        }, std::plus<int64_t>());

				    });

				    cf::get_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats(ctx, req->param["name"], &replica::column_family_stats::live_sstable_count);

				    });

				    cf::get_all_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_live_ss_table_count.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_stats(ctx, &replica::column_family_stats::live_sstable_count);

				    });

				    cf::get_unleveled_sstables.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_unleveled_sstables.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_unleveled_sstables(ctx, req->param["name"]);

				    });

				    cf::get_live_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_live_disk_space_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return sum_sstable(ctx, req->param["name"], false);

				    });

				    cf::get_all_live_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_live_disk_space_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return sum_sstable(ctx, false);

				    });

				    cf::get_total_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_total_disk_space_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return sum_sstable(ctx, req->param["name"], true);

				    });

				    cf::get_all_total_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_total_disk_space_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return sum_sstable(ctx, true);

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_min_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_min_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], INT64_MAX, min_partition_size, min_int64);

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_all_min_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_min_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, INT64_MAX, min_partition_size, min_int64);

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_max_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_max_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], int64_t(0), max_partition_size, max_int64);

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_all_max_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_max_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, int64_t(0), max_partition_size, max_int64);

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_mean_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_mean_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        // Cassandra 3.x mean values are truncated as integrals.

				        return map_reduce_cf(ctx, req->param["name"], integral_ratio_holder(), mean_partition_size, std::plus<integral_ratio_holder>());

				    });

				    // FIXME: this refers to partitions, not rows.

				    cf::get_all_mean_row_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_mean_row_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        // Cassandra 3.x mean values are truncated as integrals.

				        return map_reduce_cf(ctx, integral_ratio_holder(), mean_partition_size, std::plus<integral_ratio_holder>());

				    });

				    cf::get_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -608,7 +618,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_all_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -617,7 +627,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_recent_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_recent_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -626,7 +636,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_all_recent_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_recent_bloom_filter_false_positives.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -635,31 +645,31 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], ratio_holder(), [] (replica::column_family& cf) {

				            return boost::accumulate(*cf.get_sstables() | boost::adaptors::transformed(filter_false_positive_as_ratio_holder), ratio_holder());

				        }, std::plus<>());

				    });

				    cf::get_all_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, ratio_holder(), [] (replica::column_family& cf) {

				            return boost::accumulate(*cf.get_sstables() | boost::adaptors::transformed(filter_false_positive_as_ratio_holder), ratio_holder());

				        }, std::plus<>());

				    });

				    cf::get_recent_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_recent_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], ratio_holder(), [] (replica::column_family& cf) {

				            return boost::accumulate(*cf.get_sstables() | boost::adaptors::transformed(filter_recent_false_positive_as_ratio_holder), ratio_holder());

				        }, std::plus<>());

				    });

				    cf::get_all_recent_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_recent_bloom_filter_false_ratio.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, ratio_holder(), [] (replica::column_family& cf) {

				            return boost::accumulate(*cf.get_sstables() | boost::adaptors::transformed(filter_recent_false_positive_as_ratio_holder), ratio_holder());

				        }, std::plus<>());

				    });

				    cf::get_bloom_filter_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_bloom_filter_disk_space_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -668,7 +678,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_all_bloom_filter_disk_space_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_bloom_filter_disk_space_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -677,7 +687,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_bloom_filter_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_bloom_filter_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -686,7 +696,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_all_bloom_filter_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_bloom_filter_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -695,7 +705,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_index_summary_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_index_summary_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -704,7 +714,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_all_index_summary_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_index_summary_off_heap_memory_used.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, uint64_t(0), [] (replica::column_family& cf) {

				            auto sstables = cf.get_sstables();

				            return std::accumulate(sstables->begin(), sstables->end(), uint64_t(0), [](uint64_t s, auto& sst) {

				@@ -713,7 +723,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        }, std::plus<uint64_t>());

				    });

				    cf::get_compression_metadata_off_heap_memory_used.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_compression_metadata_off_heap_memory_used.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        // FIXME

				        // We are missing the off heap memory calculation

				@@ -723,33 +733,33 @@ void set_column_family(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_all_compression_metadata_off_heap_memory_used.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_all_compression_metadata_off_heap_memory_used.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_speculative_retries.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_speculative_retries.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        //auto id = get_uuid(req->param["name"], ctx.db.local());

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_all_speculative_retries.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_all_speculative_retries.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_key_cache_hit_rate.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_key_cache_hit_rate.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        //auto id = get_uuid(req->param["name"], ctx.db.local());

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_true_snapshots_size.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_true_snapshots_size.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        auto uuid = get_uuid(req->param["name"], ctx.db.local());

				        return ctx.db.local().find_column_family(uuid).get_snapshot_details().then([](

				                const std::unordered_map<sstring, replica::column_family::snapshot_details>& sd) {

				@@ -761,26 +771,26 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_all_true_snapshots_size.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_all_true_snapshots_size.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_row_cache_hit_out_of_range.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_row_cache_hit_out_of_range.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        //auto id = get_uuid(req->param["name"], ctx.db.local());

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_all_row_cache_hit_out_of_range.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_all_row_cache_hit_out_of_range.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cf::get_row_cache_hit.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_row_cache_hit.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, req->param["name"], utils::rate_moving_average(), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().hits.rate();

				        }, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {

				@@ -788,7 +798,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_all_row_cache_hit.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_row_cache_hit.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, utils::rate_moving_average(), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().hits.rate();

				        }, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {

				@@ -796,7 +806,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_row_cache_miss.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_row_cache_miss.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, req->param["name"], utils::rate_moving_average(), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().misses.rate();

				        }, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {

				@@ -804,7 +814,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_all_row_cache_miss.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_all_row_cache_miss.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, utils::rate_moving_average(), [](const replica::column_family& cf) {

				            return cf.get_row_cache().stats().misses.rate();

				        }, std::plus<utils::rate_moving_average>()).then([](const utils::rate_moving_average& m) {

				@@ -813,40 +823,40 @@ void set_column_family(http_context& ctx, routes& r) {

				    });

				    cf::get_cas_prepare.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_cas_prepare.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const replica::column_family& cf) {

				            return cf.get_stats().cas_prepare.histogram();

				        });

				    });

				    cf::get_cas_propose.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_cas_propose.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const replica::column_family& cf) {

				            return cf.get_stats().cas_accept.histogram();

				        });

				    });

				    cf::get_cas_commit.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_cas_commit.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const replica::column_family& cf) {

				            return cf.get_stats().cas_learn.histogram();

				        });

				    });

				    cf::get_sstables_per_read_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_sstables_per_read_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, req->param["name"], utils::estimated_histogram(0), [](replica::column_family& cf) {

				            return cf.get_stats().estimated_sstable_per_read;

				        },

				        utils::estimated_histogram_merge, utils_json::estimated_histogram());

				    });

				    cf::get_tombstone_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_tombstone_scanned_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_histogram(ctx, req->param["name"], &replica::column_family_stats::tombstone_scanned);

				    });

				    cf::get_live_scanned_histogram.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::get_live_scanned_histogram.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cf_histogram(ctx, req->param["name"], &replica::column_family_stats::live_scanned);

				    });

				    cf::get_col_update_time_delta_histogram.set(r, [] (std::unique_ptr<request> req) {

				    cf::get_col_update_time_delta_histogram.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        //auto id = get_uuid(req->param["name"], ctx.db.local());

				@@ -860,7 +870,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        return !cf.is_auto_compaction_disabled_by_user();

				    });

				    cf::enable_auto_compaction.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::enable_auto_compaction.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return ctx.db.invoke_on(0, [&ctx, req = std::move(req)] (replica::database& db) {

				            auto g = replica::database::autocompaction_toggle_guard(db);

				            return foreach_column_family(ctx, req->param["name"], [](replica::column_family &cf) {

				@@ -871,7 +881,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::disable_auto_compaction.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::disable_auto_compaction.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return ctx.db.invoke_on(0, [&ctx, req = std::move(req)] (replica::database& db) {

				            auto g = replica::database::autocompaction_toggle_guard(db);

				            return foreach_column_family(ctx, req->param["name"], [](replica::column_family &cf) {

				@@ -882,11 +892,11 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_built_indexes.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::get_built_indexes.set(r, [&ctx, &sys_ks](std::unique_ptr<http::request> req) {

				        auto ks_cf = parse_fully_qualified_cf_name(req->param["name"]);

				        auto&& ks = std::get<0>(ks_cf);

				        auto&& cf_name = std::get<1>(ks_cf);

				        return db::system_keyspace::load_view_build_progress().then([ks, cf_name, &ctx](const std::vector<db::system_keyspace_view_build_progress>& vb) mutable {

				        return sys_ks.local().load_view_build_progress().then([ks, cf_name, &ctx](const std::vector<db::system_keyspace_view_build_progress>& vb) mutable {

				            std::set<sstring> vp;

				            for (auto b : vb) {

				                if (b.view.first == ks) {

				@@ -920,7 +930,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        return std::vector<sstring>();

				    });

				    cf::get_compression_ratio.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::get_compression_ratio.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        auto uuid = get_uuid(req->param["name"], ctx.db.local());

				        return ctx.db.map_reduce(sum_ratio<double>(), [uuid](replica::database& db) {

				@@ -931,19 +941,19 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_read_latency_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::get_read_latency_estimated_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const replica::column_family& cf) {

				            return cf.get_stats().reads.histogram();

				        });

				    });

				    cf::get_write_latency_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::get_write_latency_estimated_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return map_reduce_cf_time_histogram(ctx, req->param["name"], [](const replica::column_family& cf) {

				            return cf.get_stats().writes.histogram();

				        });

				    });

				    cf::set_compaction_strategy_class.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::set_compaction_strategy_class.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        sstring strategy = req->get_query_param("class_name");

				        return foreach_column_family(ctx, req->param["name"], [strategy](replica::column_family& cf) {

				            cf.set_compaction_strategy(sstables::compaction_strategy::type(strategy));

				@@ -956,19 +966,19 @@ void set_column_family(http_context& ctx, routes& r) {

				        return ctx.db.local().find_column_family(get_uuid(req.param["name"], ctx.db.local())).get_compaction_strategy().name();

				    });

				    cf::set_compression_parameters.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::set_compression_parameters.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cf::set_crc_check_chance.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::set_crc_check_chance.set(r, [](std::unique_ptr<http::request> req) {

				        // TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cf::get_sstable_count_per_level.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::get_sstable_count_per_level.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return map_reduce_cf_raw(ctx, req->param["name"], std::vector<uint64_t>(), [](const replica::column_family& cf) {

				            return cf.sstable_count_per_level();

				        }, concat_sstable_count_per_level).then([](const std::vector<uint64_t>& res) {

				@@ -976,7 +986,7 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::get_sstables_for_key.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::get_sstables_for_key.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        auto key = req->get_query_param("key");

				        auto uuid = get_uuid(req->param["name"], ctx.db.local());

				@@ -992,7 +1002,7 @@ void set_column_family(http_context& ctx, routes& r) {

				    });

				    cf::toppartitions.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cf::toppartitions.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        auto name = req->param["name"];

				        auto [ks, cf] = parse_fully_qualified_cf_name(name);

				@@ -1008,15 +1018,127 @@ void set_column_family(http_context& ctx, routes& r) {

				        });

				    });

				    cf::force_major_compaction.set(r, [&ctx](std::unique_ptr<request> req) {

				    cf::force_major_compaction.set(r, [&ctx](std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        if (req->get_query_param("split_output") != "") {

				            fail(unimplemented::cause::API);

				        }

				        return foreach_column_family(ctx, req->param["name"], [](replica::column_family &cf) {

				            return cf.compact_all_sstables();

				        }).then([] {

				            return make_ready_future<json::json_return_type>(json_void());

				        });

				        auto [ks, cf] = parse_fully_qualified_cf_name(req->param["name"]);

				        auto keyspace = validate_keyspace(ctx, ks);

				        std::vector<table_id> table_infos = {ctx.db.local().find_uuid(ks, cf)};

				        auto& compaction_module = ctx.db.local().get_compaction_manager().get_task_manager_module();

				        auto task = co_await compaction_module.make_and_start_task<major_keyspace_compaction_task_impl>({}, std::move(keyspace), ctx.db, std::move(table_infos));

				        co_await task->done();

				        co_return json_void();

				    });

				}

				void unset_column_family(http_context& ctx, routes& r) {

				    cf::get_column_family_name.unset(r);

				    cf::get_column_family.unset(r);

				    cf::get_column_family_name_keyspace.unset(r);

				    cf::get_memtable_columns_count.unset(r);

				    cf::get_all_memtable_columns_count.unset(r);

				    cf::get_memtable_on_heap_size.unset(r);

				    cf::get_all_memtable_on_heap_size.unset(r);

				    cf::get_memtable_off_heap_size.unset(r);

				    cf::get_all_memtable_off_heap_size.unset(r);

				    cf::get_memtable_live_data_size.unset(r);

				    cf::get_all_memtable_live_data_size.unset(r);

				    cf::get_cf_all_memtables_on_heap_size.unset(r);

				    cf::get_all_cf_all_memtables_on_heap_size.unset(r);

				    cf::get_cf_all_memtables_off_heap_size.unset(r);

				    cf::get_all_cf_all_memtables_off_heap_size.unset(r);

				    cf::get_cf_all_memtables_live_data_size.unset(r);

				    cf::get_all_cf_all_memtables_live_data_size.unset(r);

				    cf::get_memtable_switch_count.unset(r);

				    cf::get_all_memtable_switch_count.unset(r);

				    cf::get_estimated_row_size_histogram.unset(r);

				    cf::get_estimated_row_count.unset(r);

				    cf::get_estimated_column_count_histogram.unset(r);

				    cf::get_all_compression_ratio.unset(r);

				    cf::get_pending_flushes.unset(r);

				    cf::get_all_pending_flushes.unset(r);

				    cf::get_read.unset(r);

				    cf::get_all_read.unset(r);

				    cf::get_write.unset(r);

				    cf::get_all_write.unset(r);

				    cf::get_read_latency_histogram_depricated.unset(r);

				    cf::get_read_latency_histogram.unset(r);

				    cf::get_read_latency.unset(r);

				    cf::get_write_latency.unset(r);

				    cf::get_all_read_latency_histogram_depricated.unset(r);

				    cf::get_all_read_latency_histogram.unset(r);

				    cf::get_write_latency_histogram_depricated.unset(r);

				    cf::get_write_latency_histogram.unset(r);

				    cf::get_all_write_latency_histogram_depricated.unset(r);

				    cf::get_all_write_latency_histogram.unset(r);

				    cf::get_pending_compactions.unset(r);

				    cf::get_all_pending_compactions.unset(r);

				    cf::get_live_ss_table_count.unset(r);

				    cf::get_all_live_ss_table_count.unset(r);

				    cf::get_unleveled_sstables.unset(r);

				    cf::get_live_disk_space_used.unset(r);

				    cf::get_all_live_disk_space_used.unset(r);

				    cf::get_total_disk_space_used.unset(r);

				    cf::get_all_total_disk_space_used.unset(r);

				    cf::get_min_row_size.unset(r);

				    cf::get_all_min_row_size.unset(r);

				    cf::get_max_row_size.unset(r);

				    cf::get_all_max_row_size.unset(r);

				    cf::get_mean_row_size.unset(r);

				    cf::get_all_mean_row_size.unset(r);

				    cf::get_bloom_filter_false_positives.unset(r);

				    cf::get_all_bloom_filter_false_positives.unset(r);

				    cf::get_recent_bloom_filter_false_positives.unset(r);

				    cf::get_all_recent_bloom_filter_false_positives.unset(r);

				    cf::get_bloom_filter_false_ratio.unset(r);

				    cf::get_all_bloom_filter_false_ratio.unset(r);

				    cf::get_recent_bloom_filter_false_ratio.unset(r);

				    cf::get_all_recent_bloom_filter_false_ratio.unset(r);

				    cf::get_bloom_filter_disk_space_used.unset(r);

				    cf::get_all_bloom_filter_disk_space_used.unset(r);

				    cf::get_bloom_filter_off_heap_memory_used.unset(r);

				    cf::get_all_bloom_filter_off_heap_memory_used.unset(r);

				    cf::get_index_summary_off_heap_memory_used.unset(r);

				    cf::get_all_index_summary_off_heap_memory_used.unset(r);

				    cf::get_compression_metadata_off_heap_memory_used.unset(r);

				    cf::get_all_compression_metadata_off_heap_memory_used.unset(r);

				    cf::get_speculative_retries.unset(r);

				    cf::get_all_speculative_retries.unset(r);

				    cf::get_key_cache_hit_rate.unset(r);

				    cf::get_true_snapshots_size.unset(r);

				    cf::get_all_true_snapshots_size.unset(r);

				    cf::get_row_cache_hit_out_of_range.unset(r);

				    cf::get_all_row_cache_hit_out_of_range.unset(r);

				    cf::get_row_cache_hit.unset(r);

				    cf::get_all_row_cache_hit.unset(r);

				    cf::get_row_cache_miss.unset(r);

				    cf::get_all_row_cache_miss.unset(r);

				    cf::get_cas_prepare.unset(r);

				    cf::get_cas_propose.unset(r);

				    cf::get_cas_commit.unset(r);

				    cf::get_sstables_per_read_histogram.unset(r);

				    cf::get_tombstone_scanned_histogram.unset(r);

				    cf::get_live_scanned_histogram.unset(r);

				    cf::get_col_update_time_delta_histogram.unset(r);

				    cf::get_auto_compaction.unset(r);

				    cf::enable_auto_compaction.unset(r);

				    cf::disable_auto_compaction.unset(r);

				    cf::get_built_indexes.unset(r);

				    cf::get_compression_metadata_off_heap_memory_used.unset(r);

				    cf::get_compression_parameters.unset(r);

				    cf::get_compression_ratio.unset(r);

				    cf::get_read_latency_estimated_histogram.unset(r);

				    cf::get_write_latency_estimated_histogram.unset(r);

				    cf::set_compaction_strategy_class.unset(r);

				    cf::get_compaction_strategy_class.unset(r);

				    cf::set_compression_parameters.unset(r);

				    cf::set_crc_check_chance.unset(r);

				    cf::get_sstable_count_per_level.unset(r);

				    cf::get_sstables_for_key.unset(r);

				    cf::toppartitions.unset(r);

				    cf::force_major_compaction.unset(r);

				}

				}

									
										7

api/column_family.hh
				@@ -14,9 +14,14 @@

				#include <seastar/core/future-util.hh>

				#include <any>

				namespace db {

				class system_keyspace;

				}

				namespace api {

				void set_column_family(http_context& ctx, routes& r);

				void set_column_family(http_context& ctx, httpd::routes& r, sharded<db::system_keyspace>& sys_ks);

				void unset_column_family(http_context& ctx, httpd::routes& r);

				const table_id& get_uuid(const sstring& name, const replica::database& db);

				future<> foreach_column_family(http_context& ctx, const sstring& name, std::function<void(replica::column_family&)> f);

									
										1

api/commitlog.cc
				@@ -13,6 +13,7 @@

				#include <vector>

				namespace api {

				using namespace seastar::httpd;

				template<typename T>

				static auto acquire_cl_metric(http_context& ctx, std::function<T (db::commitlog*)> func) {

									
										2

api/commitlog.hh
				@@ -12,6 +12,6 @@

				namespace api {

				void set_commitlog(http_context& ctx, routes& r);

				void set_commitlog(http_context& ctx, httpd::routes& r);

				}

									
										46

api/compaction_manager.cc
				@@ -22,6 +22,7 @@ namespace api {

				namespace cm = httpd::compaction_manager_json;

				using namespace json;

				using namespace seastar::httpd;

				static future<json::json_return_type> get_cm_stats(http_context& ctx,

				        int64_t compaction_manager::stats::*f) {

				@@ -41,9 +42,8 @@ static std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_ha

				    return std::move(a);

				}

				void set_compaction_manager(http_context& ctx, routes& r) {

				    cm::get_compactions.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cm::get_compactions.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return ctx.db.map_reduce0([](replica::database& db) {

				            std::vector<cm::summary> summaries;

				            const compaction_manager& cm = db.get_compaction_manager();

				@@ -65,12 +65,12 @@ void set_compaction_manager(http_context& ctx, routes& r) {

				        });

				    });

				    cm::get_pending_tasks_by_table.set(r, [&ctx] (std::unique_ptr<request> req) {

				        return ctx.db.map_reduce0([&ctx](replica::database& db) {

				            return do_with(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), [&ctx, &db](std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& tasks) {

				                return do_for_each(db.get_column_families(), [&tasks](const std::pair<table_id, seastar::lw_shared_ptr<replica::table>>& i) {

				    cm::get_pending_tasks_by_table.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return ctx.db.map_reduce0([](replica::database& db) {

				            return do_with(std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>(), [&db](std::unordered_map<std::pair<sstring, sstring>, uint64_t, utils::tuple_hash>& tasks) {

				                return do_for_each(db.get_column_families(), [&tasks](const std::pair<table_id, seastar::lw_shared_ptr<replica::table>>& i) -> future<> {

				                    replica::table& cf = *i.second.get();

				                    tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.get_compaction_strategy().estimated_pending_compactions(cf.as_table_state());

				                    tasks[std::make_pair(cf.schema()->ks_name(), cf.schema()->cf_name())] = cf.estimate_pending_compactions();

				                    return make_ready_future<>();

				                }).then([&tasks] {

				                    return std::move(tasks);

				@@ -91,14 +91,14 @@ void set_compaction_manager(http_context& ctx, routes& r) {

				        });

				    });

				    cm::force_user_defined_compaction.set(r, [] (std::unique_ptr<request> req) {

				    cm::force_user_defined_compaction.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        // FIXME

				        warn(unimplemented::cause::API);

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    cm::stop_compaction.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cm::stop_compaction.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        auto type = req->get_query_param("type");

				        return ctx.db.invoke_on_all([type] (replica::database& db) {

				            auto& cm = db.get_compaction_manager();

				@@ -108,7 +108,7 @@ void set_compaction_manager(http_context& ctx, routes& r) {

				        });

				    });

				    cm::stop_keyspace_compaction.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    cm::stop_keyspace_compaction.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto ks_name = validate_keyspace(ctx, req->param);

				        auto table_names = parse_tables(ks_name, ctx, req->query_parameters, "tables");

				        if (table_names.empty()) {

				@@ -119,41 +119,43 @@ void set_compaction_manager(http_context& ctx, routes& r) {

				            auto& cm = db.get_compaction_manager();

				            return parallel_for_each(table_names, [&db, &cm, &ks_name, type] (sstring& table_name) {

				                auto& t = db.find_column_family(ks_name, table_name);

				                return cm.stop_compaction(type, &t.as_table_state());

				                return t.parallel_foreach_table_state([&] (compaction::table_state& ts) {

				                    return cm.stop_compaction(type, &ts);

				                });

				            });

				        });

				        co_return json_void();

				    });

				    cm::get_pending_tasks.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cm::get_pending_tasks.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return map_reduce_cf(ctx, int64_t(0), [](replica::column_family& cf) {

				            return cf.get_compaction_strategy().estimated_pending_compactions(cf.as_table_state());

				            return cf.estimate_pending_compactions();

				        }, std::plus<int64_t>());

				    });

				    cm::get_completed_tasks.set(r, [&ctx] (std::unique_ptr<request> req) {

				    cm::get_completed_tasks.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        return get_cm_stats(ctx, &compaction_manager::stats::completed_tasks);

				    });

				    cm::get_total_compactions_completed.set(r, [] (std::unique_ptr<request> req) {

				    cm::get_total_compactions_completed.set(r, [] (std::unique_ptr<http::request> req) {

				        // FIXME

				        // We are currently dont have an API for compaction

				        // so returning a 0 as the number of total compaction is ok

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cm::get_bytes_compacted.set(r, [] (std::unique_ptr<request> req) {

				    cm::get_bytes_compacted.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        // FIXME

				        warn(unimplemented::cause::API);

				        return make_ready_future<json::json_return_type>(0);

				    });

				    cm::get_compaction_history.set(r, [] (std::unique_ptr<request> req) {

				        std::function<future<>(output_stream<char>&&)> f = [](output_stream<char>&& s) {

				            return do_with(output_stream<char>(std::move(s)), true, [] (output_stream<char>& s, bool& first){

				                return s.write("[").then([&s, &first] {

				                    return db::system_keyspace::get_compaction_history([&s, &first](const db::system_keyspace::compaction_history_entry& entry) mutable {

				    cm::get_compaction_history.set(r, [&ctx] (std::unique_ptr<http::request> req) {

				        std::function<future<>(output_stream<char>&&)> f = [&ctx](output_stream<char>&& s) {

				            return do_with(output_stream<char>(std::move(s)), true, [&ctx] (output_stream<char>& s, bool& first){

				                return s.write("[").then([&ctx, &s, &first] {

				                    return ctx.db.local().get_compaction_manager().get_compaction_history([&s, &first](const db::compaction_history_entry& entry) mutable {

				                        cm::history h;

				                        h.id = entry.id.to_sstring();

				                        h.ks = std::move(entry.ks);

				@@ -183,7 +185,7 @@ void set_compaction_manager(http_context& ctx, routes& r) {

				        return make_ready_future<json::json_return_type>(std::move(f));

				    });

				    cm::get_compaction_info.set(r, [] (std::unique_ptr<request> req) {

				    cm::get_compaction_info.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        // FIXME

				        warn(unimplemented::cause::API);

									
										2

api/compaction_manager.hh
				@@ -12,6 +12,6 @@

				namespace api {

				void set_compaction_manager(http_context& ctx, routes& r);

				void set_compaction_manager(http_context& ctx, httpd::routes& r);

				}

									
										1

api/config.cc
				@@ -13,6 +13,7 @@

				#include <boost/algorithm/string/replace.hpp>

				namespace api {

				using namespace seastar::httpd;

				template<class T>

				json::json_return_type get_json_return_type(const T& val) {

									
										2

api/config.hh
				@@ -13,5 +13,5 @@

				namespace api {

				void set_config(std::shared_ptr<api_registry_builder20> rb, http_context& ctx, routes& r, const db::config& cfg);

				void set_config(std::shared_ptr<httpd::api_registry_builder20> rb, http_context& ctx, httpd::routes& r, const db::config& cfg);

				}

									
										9

api/endpoint_snitch.cc
				@@ -15,6 +15,7 @@

				#include "utils/fb_utilities.hh"

				namespace api {

				using namespace seastar::httpd;

				void set_endpoint_snitch(http_context& ctx, routes& r, sharded<locator::snitch_ptr>& snitch) {

				    static auto host_or_broadcast = [](const_req req) {

				@@ -25,10 +26,10 @@ void set_endpoint_snitch(http_context& ctx, routes& r, sharded<locator::snitch_p

				    httpd::endpoint_snitch_info_json::get_datacenter.set(r, [&ctx](const_req req) {

				        auto& topology = ctx.shared_token_metadata.local().get()->get_topology();

				        auto ep = host_or_broadcast(req);

				        if (!topology.has_endpoint(ep, locator::topology::pending::yes)) {

				        if (!topology.has_endpoint(ep)) {

				            // Cannot return error here, nodetool status can race, request

				            // info about just-left node and not handle it nicely

				            return sstring(locator::production_snitch_base::default_dc);

				            return locator::endpoint_dc_rack::default_location.dc;

				        }

				        return topology.get_datacenter(ep);

				    });

				@@ -36,10 +37,10 @@ void set_endpoint_snitch(http_context& ctx, routes& r, sharded<locator::snitch_p

				    httpd::endpoint_snitch_info_json::get_rack.set(r, [&ctx](const_req req) {

				        auto& topology = ctx.shared_token_metadata.local().get()->get_topology();

				        auto ep = host_or_broadcast(req);

				        if (!topology.has_endpoint(ep, locator::topology::pending::yes)) {

				        if (!topology.has_endpoint(ep)) {

				            // Cannot return error here, nodetool status can race, request

				            // info about just-left node and not handle it nicely

				            return sstring(locator::production_snitch_base::default_rack);

				            return locator::endpoint_dc_rack::default_location.rack;

				        }

				        return topology.get_rack(ep);

				    });

									
										4

api/endpoint_snitch.hh
				@@ -16,7 +16,7 @@ class snitch_ptr;

				namespace api {

				void set_endpoint_snitch(http_context& ctx, routes& r, sharded<locator::snitch_ptr>&);

				void unset_endpoint_snitch(http_context& ctx, routes& r);

				void set_endpoint_snitch(http_context& ctx, httpd::routes& r, sharded<locator::snitch_ptr>&);

				void unset_endpoint_snitch(http_context& ctx, httpd::routes& r);

				}

									
										1

api/error_injection.cc
				@@ -15,6 +15,7 @@

				#include <seastar/core/future-util.hh>

				namespace api {

				using namespace seastar::httpd;

				namespace hf = httpd::error_injection_json;

									
										2

api/error_injection.hh
				@@ -12,6 +12,6 @@

				namespace api {

				void set_error_injection(http_context& ctx, routes& r);

				void set_error_injection(http_context& ctx, httpd::routes& r);

				}

									
										27

api/failure_detector.cc
				@@ -8,10 +8,11 @@

				#include "failure_detector.hh"

				#include "api/api-doc/failure_detector.json.hh"

				#include "gms/failure_detector.hh"

				#include "gms/application_state.hh"

				#include "gms/gossiper.hh"

				namespace api {

				using namespace seastar::httpd;

				namespace fd = httpd::failure_detector_json;

				@@ -20,18 +21,18 @@ void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g) {

				        std::vector<fd::endpoint_state> res;

				        for (auto i : g.get_endpoint_states()) {

				            fd::endpoint_state val;

				            val.addrs = boost::lexical_cast<std::string>(i.first);

				            val.addrs = fmt::to_string(i.first);

				            val.is_alive = i.second.is_alive();

				            val.generation = i.second.get_heart_beat_state().get_generation();

				            val.version = i.second.get_heart_beat_state().get_heart_beat_version();

				            val.generation = i.second.get_heart_beat_state().get_generation().value();

				            val.version = i.second.get_heart_beat_state().get_heart_beat_version().value();

				            val.update_time = i.second.get_update_timestamp().time_since_epoch().count();

				            for (auto a : i.second.get_application_state_map()) {

				                fd::version_value version_val;

				                // We return the enum index and not it's name to stay compatible to origin

				                // method that the state index are static but the name can be changed.

				                version_val.application_state = static_cast<std::underlying_type<gms::application_state>::type>(a.first);

				                version_val.value = a.second.value;

				                version_val.version = a.second.version;

				                version_val.value = a.second.value();

				                version_val.version = a.second.version().value();

				                val.application_state.push(version_val);

				            }

				            res.push_back(val);

				@@ -62,7 +63,9 @@ void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g) {

				    });

				    fd::set_phi_convict_threshold.set(r, [](std::unique_ptr<request> req) {

				        double phi = atof(req->get_query_param("phi").c_str());

				        // TBD

				        unimplemented();

				        std::ignore = atof(req->get_query_param("phi").c_str());

				        return make_ready_future<json::json_return_type>("");

				    });

				@@ -77,15 +80,9 @@ void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g) {

				    });

				    fd::get_endpoint_phi_values.set(r, [](std::unique_ptr<request> req) {

				        std::map<gms::inet_address, gms::arrival_window> map;

				        // We no longer have a phi failure detector,

				        // just returning the empty value is good enough.

				        std::vector<fd::endpoint_phi_value> res;

				        auto now = gms::arrival_window::clk::now();

				        for (auto& p : map) {

				            fd::endpoint_phi_value val;

				            val.endpoint = p.first.to_sstring();

				            val.phi = p.second.phi(now);

				            res.emplace_back(std::move(val));

				        }

				        return make_ready_future<json::json_return_type>(res);

				    });

				}

									
										2

api/failure_detector.hh
				@@ -18,6 +18,6 @@ class gossiper;

				namespace api {

				void set_failure_detector(http_context& ctx, routes& r, gms::gossiper& g);

				void set_failure_detector(http_context& ctx, httpd::routes& r, gms::gossiper& g);

				}

									
										25

api/gossiper.cc
				@@ -11,6 +11,7 @@

				#include "gms/gossiper.hh"

				namespace api {

				using namespace seastar::httpd;

				using namespace json;

				void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g) {

				@@ -19,9 +20,11 @@ void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g) {

				        return container_to_vec(res);

				    });

				    httpd::gossiper_json::get_live_endpoint.set(r, [&g] (const_req req) {

				        auto res = g.get_live_members();

				        return container_to_vec(res);

				    httpd::gossiper_json::get_live_endpoint.set(r, [&g] (std::unique_ptr<request> req) {

				        return g.get_live_members_synchronized().then([] (auto res) {

				            return make_ready_future<json::json_return_type>(container_to_vec(res));

				        });

				    });

				    httpd::gossiper_json::get_endpoint_downtime.set(r, [&g] (const_req req) {

				@@ -29,21 +32,21 @@ void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g) {

				        return g.get_endpoint_downtime(ep);

				    });

				    httpd::gossiper_json::get_current_generation_number.set(r, [&g] (std::unique_ptr<request> req) {

				    httpd::gossiper_json::get_current_generation_number.set(r, [&g] (std::unique_ptr<http::request> req) {

				        gms::inet_address ep(req->param["addr"]);

				        return g.get_current_generation_number(ep).then([] (int res) {

				            return make_ready_future<json::json_return_type>(res);

				        return g.get_current_generation_number(ep).then([] (gms::generation_type res) {

				            return make_ready_future<json::json_return_type>(res.value());

				        });

				    });

				    httpd::gossiper_json::get_current_heart_beat_version.set(r, [&g] (std::unique_ptr<request> req) {

				    httpd::gossiper_json::get_current_heart_beat_version.set(r, [&g] (std::unique_ptr<http::request> req) {

				        gms::inet_address ep(req->param["addr"]);

				        return g.get_current_heart_beat_version(ep).then([] (int res) {

				            return make_ready_future<json::json_return_type>(res);

				        return g.get_current_heart_beat_version(ep).then([] (gms::version_type res) {

				            return make_ready_future<json::json_return_type>(res.value());

				        });

				    });

				    httpd::gossiper_json::assassinate_endpoint.set(r, [&g](std::unique_ptr<request> req) {

				    httpd::gossiper_json::assassinate_endpoint.set(r, [&g](std::unique_ptr<http::request> req) {

				        if (req->get_query_param("unsafe") != "True") {

				            return g.assassinate_endpoint(req->param["addr"]).then([] {

				                return make_ready_future<json::json_return_type>(json_void());

				@@ -54,7 +57,7 @@ void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g) {

				        });

				    });

				    httpd::gossiper_json::force_remove_endpoint.set(r, [&g](std::unique_ptr<request> req) {

				    httpd::gossiper_json::force_remove_endpoint.set(r, [&g](std::unique_ptr<http::request> req) {

				        gms::inet_address ep(req->param["addr"]);

				        return g.force_remove_endpoint(ep).then([] {

				            return make_ready_future<json::json_return_type>(json_void());

									
										2

api/gossiper.hh
									
												
				@@ -18,6 +18,6 @@ class gossiper;

				namespace api {

				void set_gossiper(http_context& ctx, routes& r, gms::gossiper& g);

				void set_gossiper(http_context& ctx, httpd::routes& r, gms::gossiper& g);

				}

									
										17

api/hinted_handoff.cc
									
												
				@@ -19,10 +19,11 @@

				namespace api {

				using namespace json;

				using namespace seastar::httpd;

				namespace hh = httpd::hinted_handoff_json;

				void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g) {

				    hh::create_hints_sync_point.set(r, [&ctx, &g] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    hh::create_hints_sync_point.set(r, [&ctx, &g] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto parse_hosts_list = [&g] (sstring arg) {

				            std::vector<sstring> hosts_str = split(arg, ",");

				            std::vector<gms::inet_address> hosts;

				@@ -52,7 +53,7 @@ void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g) {

				        });

				    });

				    hh::get_hints_sync_point.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    hh::get_hints_sync_point.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        db::hints::sync_point sync_point;

				        const sstring encoded = req->get_query_param("id");

				        try {

				@@ -93,42 +94,42 @@ void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g) {

				        });

				    });

				    hh::list_endpoints_pending_hints.set(r, [] (std::unique_ptr<request> req) {

				    hh::list_endpoints_pending_hints.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        std::vector<sstring> res;

				        return make_ready_future<json::json_return_type>(res);

				    });

				    hh::truncate_all_hints.set(r, [] (std::unique_ptr<request> req) {

				    hh::truncate_all_hints.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        sstring host = req->get_query_param("host");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    hh::schedule_hint_delivery.set(r, [] (std::unique_ptr<request> req) {

				    hh::schedule_hint_delivery.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        sstring host = req->get_query_param("host");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    hh::pause_hints_delivery.set(r, [] (std::unique_ptr<request> req) {

				    hh::pause_hints_delivery.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        sstring pause = req->get_query_param("pause");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    hh::get_create_hint_count.set(r, [] (std::unique_ptr<request> req) {

				    hh::get_create_hint_count.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        sstring host = req->get_query_param("host");

				        return make_ready_future<json::json_return_type>(0);

				    });

				    hh::get_not_stored_hints_count.set(r, [] (std::unique_ptr<request> req) {

				    hh::get_not_stored_hints_count.set(r, [] (std::unique_ptr<http::request> req) {

				        //TBD

				        unimplemented();

				        sstring host = req->get_query_param("host");

									
										4

api/hinted_handoff.hh
									
												
				@@ -18,7 +18,7 @@ class gossiper;

				namespace api {

				void set_hinted_handoff(http_context& ctx, routes& r, gms::gossiper& g);

				void unset_hinted_handoff(http_context& ctx, routes& r);

				void set_hinted_handoff(http_context& ctx, httpd::routes& r, gms::gossiper& g);

				void unset_hinted_handoff(http_context& ctx, httpd::routes& r);

				}

									
										1

api/lsa.cc
									
												
				@@ -16,6 +16,7 @@

				#include "replica/database.hh"

				namespace api {

				using namespace seastar::httpd;

				static logging::logger alogger("lsa-api");

									
										2

api/lsa.hh
									
												
				@@ -12,6 +12,6 @@

				namespace api {

				void set_lsa(http_context& ctx, routes& r);

				void set_lsa(http_context& ctx, httpd::routes& r);

				}

									
										3

api/messaging_service.cc
									
												
				@@ -13,6 +13,7 @@

				#include <iostream>

				#include <sstream>

				using namespace seastar::httpd;

				using namespace httpd::messaging_service_json;

				using namespace netw;

				@@ -28,7 +29,7 @@ std::vector<message_counter> map_to_message_counters(

				    std::vector<message_counter> res;

				    for (auto i : map) {

				        res.push_back(message_counter());

				        res.back().key = boost::lexical_cast<sstring>(i.first);

				        res.back().key = fmt::to_string(i.first);

				        res.back().value = i.second;

				    }

				    return res;

									
										4

api/messaging_service.hh
									
												
				@@ -14,7 +14,7 @@ namespace netw { class messaging_service; }

				namespace api {

				void set_messaging_service(http_context& ctx, routes& r, sharded<netw::messaging_service>& ms);

				void unset_messaging_service(http_context& ctx, routes& r);

				void set_messaging_service(http_context& ctx, httpd::routes& r, sharded<netw::messaging_service>& ms);

				void unset_messaging_service(http_context& ctx, httpd::routes& r);

				}

									
										196

api/storage_proxy.cc
									
												
				@@ -20,6 +20,7 @@ namespace api {

				namespace sp = httpd::storage_proxy_json;

				using proxy = service::storage_proxy;

				using namespace seastar::httpd;

				using namespace json;

				utils::time_estimated_histogram timed_rate_moving_average_summary_merge(utils::time_estimated_histogram a, const utils::timed_rate_moving_average_summary_and_histogram& b) {

				@@ -184,75 +185,75 @@ sum_timer_stats_storage_proxy(distributed<proxy>& d,

				}

				void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_service>& ss) {

				    sp::get_total_hints.set(r, [](std::unique_ptr<request> req)  {

				    sp::get_total_hints.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    sp::get_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<request> req)  {

				        const auto& filter = service::get_storage_proxy().local().get_hints_host_filter();

				    sp::get_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        const auto& filter = ctx.sp.local().get_hints_host_filter();

				        return make_ready_future<json::json_return_type>(!filter.is_disabled_for_all());

				    });

				    sp::set_hinted_handoff_enabled.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_hinted_handoff_enabled.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        auto enable = req->get_query_param("enable");

				        auto filter = (enable == "true" || enable == "1")

				                ? db::hints::host_filter(db::hints::host_filter::enabled_for_all_tag {})

				                : db::hints::host_filter(db::hints::host_filter::disabled_for_all_tag {});

				        return service::get_storage_proxy().invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {

				        return ctx.sp.invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {

				            return sp.change_hints_host_filter(filter);

				        }).then([] {

				            return make_ready_future<json::json_return_type>(json_void());

				        });

				    });

				    sp::get_hinted_handoff_enabled_by_dc.set(r, [](std::unique_ptr<request> req)  {

				    sp::get_hinted_handoff_enabled_by_dc.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        std::vector<sstring> res;

				        const auto& filter = service::get_storage_proxy().local().get_hints_host_filter();

				        const auto& filter = ctx.sp.local().get_hints_host_filter();

				        const auto& dcs = filter.get_dcs();

				        res.reserve(res.size());

				        std::copy(dcs.begin(), dcs.end(), std::back_inserter(res));

				        return make_ready_future<json::json_return_type>(res);

				    });

				    sp::set_hinted_handoff_enabled_by_dc_list.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_hinted_handoff_enabled_by_dc_list.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        auto dcs = req->get_query_param("dcs");

				        auto filter = db::hints::host_filter::parse_from_dc_list(std::move(dcs));

				        return service::get_storage_proxy().invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {

				        return ctx.sp.invoke_on_all([filter = std::move(filter)] (service::storage_proxy& sp) {

				            return sp.change_hints_host_filter(filter);

				        }).then([] {

				            return make_ready_future<json::json_return_type>(json_void());

				        });

				    });

				    sp::get_max_hint_window.set(r, [](std::unique_ptr<request> req)  {

				    sp::get_max_hint_window.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				    });

				    sp::set_max_hint_window.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_max_hint_window.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("ms");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    sp::get_max_hints_in_progress.set(r, [](std::unique_ptr<request> req)  {

				    sp::get_max_hints_in_progress.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(1);

				    });

				    sp::set_max_hints_in_progress.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_max_hints_in_progress.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("qs");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    sp::get_hints_in_progress.set(r, [](std::unique_ptr<request> req)  {

				    sp::get_hints_in_progress.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(0);

				@@ -262,7 +263,7 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().request_timeout_in_ms()/1000.0;

				    });

				    sp::set_rpc_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_rpc_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				@@ -273,7 +274,7 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().read_request_timeout_in_ms()/1000.0;

				    });

				    sp::set_read_rpc_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_read_rpc_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				@@ -284,7 +285,7 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().write_request_timeout_in_ms()/1000.0;

				    });

				    sp::set_write_rpc_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_write_rpc_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				@@ -295,7 +296,7 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().counter_write_request_timeout_in_ms()/1000.0;

				    });

				    sp::set_counter_write_rpc_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_counter_write_rpc_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				@@ -306,7 +307,7 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().cas_contention_timeout_in_ms()/1000.0;

				    });

				    sp::set_cas_contention_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_cas_contention_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				@@ -317,7 +318,7 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().range_request_timeout_in_ms()/1000.0;

				    });

				    sp::set_range_rpc_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_range_rpc_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				@@ -328,32 +329,32 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return ctx.db.local().get_config().truncate_request_timeout_in_ms()/1000.0;

				    });

				    sp::set_truncate_rpc_timeout.set(r, [](std::unique_ptr<request> req)  {

				    sp::set_truncate_rpc_timeout.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        auto enable = req->get_query_param("timeout");

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    sp::reload_trigger_classes.set(r, [](std::unique_ptr<request> req)  {

				    sp::reload_trigger_classes.set(r, [](std::unique_ptr<http::request> req)  {

				        //TBD

				        unimplemented();

				        return make_ready_future<json::json_return_type>(json_void());

				    });

				    sp::get_read_repair_attempted.set(r, [&ctx](std::unique_ptr<request> req)  {

				    sp::get_read_repair_attempted.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_attempts);

				    });

				    sp::get_read_repair_repaired_blocking.set(r, [&ctx](std::unique_ptr<request> req)  {

				    sp::get_read_repair_repaired_blocking.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_repaired_blocking);

				    });

				    sp::get_read_repair_repaired_background.set(r, [&ctx](std::unique_ptr<request> req)  {

				    sp::get_read_repair_repaired_background.set(r, [&ctx](std::unique_ptr<http::request> req)  {

				        return sum_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read_repair_repaired_background);

				    });

				    sp::get_schema_versions.set(r, [&ss](std::unique_ptr<request> req)  {

				    sp::get_schema_versions.set(r, [&ss](std::unique_ptr<http::request> req)  {

				        return ss.local().describe_schema_versions().then([] (auto result) {

				            std::vector<sp::mapper_list> res;

				            for (auto e : result) {

				@@ -366,122 +367,122 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        });

				    });

				    sp::get_cas_read_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_read_timeouts.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_timeouts);

				    });

				    sp::get_cas_read_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_read_unavailables.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_read_unavailables);

				    });

				    sp::get_cas_write_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_timeouts.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_timeouts);

				    });

				    sp::get_cas_write_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_unavailables.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &proxy::stats::cas_write_unavailables);

				    });

				    sp::get_cas_write_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_stats(ctx.sp, &proxy::stats::cas_write_unfinished_commit);

				    });

				    sp::get_cas_write_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_metrics_contention.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_estimated_histogram(ctx, &proxy::stats::cas_write_contention);

				    });

				    sp::get_cas_write_metrics_condition_not_met.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_metrics_condition_not_met.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_stats(ctx.sp, &proxy::stats::cas_write_condition_not_met);

				    });

				    sp::get_cas_write_metrics_failed_read_round_optimization.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_metrics_failed_read_round_optimization.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_stats(ctx.sp, &proxy::stats::cas_failed_read_round_optimization);

				    });

				    sp::get_cas_read_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_read_metrics_unfinished_commit.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_stats(ctx.sp, &proxy::stats::cas_read_unfinished_commit);

				    });

				    sp::get_cas_read_metrics_contention.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_read_metrics_contention.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_estimated_histogram(ctx, &proxy::stats::cas_read_contention);

				    });

				    sp::get_read_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_metrics_timeouts.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::read_timeouts);

				    });

				    sp::get_read_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_metrics_unavailables.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::read_unavailables);

				    });

				    sp::get_range_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_metrics_timeouts.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::range_slice_timeouts);

				    });

				    sp::get_range_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_metrics_unavailables.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::range_slice_unavailables);

				    });

				    sp::get_write_metrics_timeouts.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_metrics_timeouts.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::write_timeouts);

				    });

				    sp::get_write_metrics_unavailables.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_metrics_unavailables.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_long(ctx.sp, &service::storage_proxy_stats::stats::write_unavailables);

				    });

				    sp::get_read_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::read_timeouts);

				    });

				    sp::get_read_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::read_unavailables);

				    });

				    sp::get_range_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::range_slice_timeouts);

				    });

				    sp::get_range_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::range_slice_unavailables);

				    });

				    sp::get_write_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_metrics_timeouts_rates.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::write_timeouts);

				    });

				    sp::get_write_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_metrics_unavailables_rates.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timed_rate_as_obj(ctx.sp, &service::storage_proxy_stats::stats::write_unavailables);

				    });

				    sp::get_range_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);

				    });

				    sp::get_write_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::write);

				    });

				    sp::get_read_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_metrics_latency_histogram_depricated.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_histogram_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read);

				    });

				    sp::get_range_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);

				    });

				    sp::get_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::write);

				    });

				    sp::get_cas_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timer_stats(ctx.sp, &proxy::stats::cas_write);

				    });

				    sp::get_cas_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_cas_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timer_stats(ctx.sp, &proxy::stats::cas_read);

				    });

				    sp::get_view_write_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_view_write_metrics_latency_histogram.set(r, [](std::unique_ptr<http::request> req) {

				        //TBD

				        // FIXME

				        // No View metrics are available, so just return empty moving average

				@@ -489,32 +490,101 @@ void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_se

				        return make_ready_future<json::json_return_type>(get_empty_moving_average());

				    });

				    sp::get_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_metrics_latency_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::read);

				    });

				    sp::get_read_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_estimated_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_estimated_histogram(ctx, &service::storage_proxy_stats::stats::read);

				    });

				    sp::get_read_latency.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_read_latency.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return total_latency(ctx, &service::storage_proxy_stats::stats::read);

				    });

				    sp::get_write_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_estimated_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_estimated_histogram(ctx, &service::storage_proxy_stats::stats::write);

				    });

				    sp::get_write_latency.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_write_latency.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return total_latency(ctx, &service::storage_proxy_stats::stats::write);

				    });

				    sp::get_range_estimated_histogram.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_estimated_histogram.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return sum_timer_stats_storage_proxy(ctx.sp, &service::storage_proxy_stats::stats::range);

				    });

				    sp::get_range_latency.set(r, [&ctx](std::unique_ptr<request> req) {

				    sp::get_range_latency.set(r, [&ctx](std::unique_ptr<http::request> req) {

				        return total_latency(ctx, &service::storage_proxy_stats::stats::range);

				    });

				}

				void unset_storage_proxy(http_context& ctx, routes& r) {

				    sp::get_total_hints.unset(r);

				    sp::get_hinted_handoff_enabled.unset(r);

				    sp::set_hinted_handoff_enabled.unset(r);

				    sp::get_hinted_handoff_enabled_by_dc.unset(r);

				    sp::set_hinted_handoff_enabled_by_dc_list.unset(r);

				    sp::get_max_hint_window.unset(r);

				    sp::set_max_hint_window.unset(r);

				    sp::get_max_hints_in_progress.unset(r);

				    sp::set_max_hints_in_progress.unset(r);

				    sp::get_hints_in_progress.unset(r);

				    sp::get_rpc_timeout.unset(r);

				    sp::set_rpc_timeout.unset(r);

				    sp::get_read_rpc_timeout.unset(r);

				    sp::set_read_rpc_timeout.unset(r);

				    sp::get_write_rpc_timeout.unset(r);

				    sp::set_write_rpc_timeout.unset(r);

				    sp::get_counter_write_rpc_timeout.unset(r);

				    sp::set_counter_write_rpc_timeout.unset(r);

				    sp::get_cas_contention_timeout.unset(r);

				    sp::set_cas_contention_timeout.unset(r);

				    sp::get_range_rpc_timeout.unset(r);

				    sp::set_range_rpc_timeout.unset(r);

				    sp::get_truncate_rpc_timeout.unset(r);

				    sp::set_truncate_rpc_timeout.unset(r);

				    sp::reload_trigger_classes.unset(r);

				    sp::get_read_repair_attempted.unset(r);

				    sp::get_read_repair_repaired_blocking.unset(r);

				    sp::get_read_repair_repaired_background.unset(r);

				    sp::get_schema_versions.unset(r);

				    sp::get_cas_read_timeouts.unset(r);

				    sp::get_cas_read_unavailables.unset(r);

				    sp::get_cas_write_timeouts.unset(r);

				    sp::get_cas_write_unavailables.unset(r);

				    sp::get_cas_write_metrics_unfinished_commit.unset(r);

				    sp::get_cas_write_metrics_contention.unset(r);

				    sp::get_cas_write_metrics_condition_not_met.unset(r);

				    sp::get_cas_write_metrics_failed_read_round_optimization.unset(r);

				    sp::get_cas_read_metrics_unfinished_commit.unset(r);

				    sp::get_cas_read_metrics_contention.unset(r);

				    sp::get_read_metrics_timeouts.unset(r);

				    sp::get_read_metrics_unavailables.unset(r);

				    sp::get_range_metrics_timeouts.unset(r);

				    sp::get_range_metrics_unavailables.unset(r);

				    sp::get_write_metrics_timeouts.unset(r);

				    sp::get_write_metrics_unavailables.unset(r);

				    sp::get_read_metrics_timeouts_rates.unset(r);

				    sp::get_read_metrics_unavailables_rates.unset(r);

				    sp::get_range_metrics_timeouts_rates.unset(r);

				    sp::get_range_metrics_unavailables_rates.unset(r);

				    sp::get_write_metrics_timeouts_rates.unset(r);

				    sp::get_write_metrics_unavailables_rates.unset(r);

				    sp::get_range_metrics_latency_histogram_depricated.unset(r);

				    sp::get_write_metrics_latency_histogram_depricated.unset(r);

				    sp::get_read_metrics_latency_histogram_depricated.unset(r);

				    sp::get_range_metrics_latency_histogram.unset(r);

				    sp::get_write_metrics_latency_histogram.unset(r);

				    sp::get_cas_write_metrics_latency_histogram.unset(r);

				    sp::get_cas_read_metrics_latency_histogram.unset(r);

				    sp::get_view_write_metrics_latency_histogram.unset(r);

				    sp::get_read_metrics_latency_histogram.unset(r);

				    sp::get_read_estimated_histogram.unset(r);

				    sp::get_read_latency.unset(r);

				    sp::get_write_estimated_histogram.unset(r);

				    sp::get_write_latency.unset(r);

				    sp::get_range_estimated_histogram.unset(r);

				    sp::get_range_latency.unset(r);

				}

				}

									
										3

api/storage_proxy.hh
									
												
				@@ -15,6 +15,7 @@ namespace service { class storage_service; }

				namespace api {

				void set_storage_proxy(http_context& ctx, routes& r, sharded<service::storage_service>& ss);

				void set_storage_proxy(http_context& ctx, httpd::routes& r, sharded<service::storage_service>& ss);

				void unset_storage_proxy(http_context& ctx, httpd::routes& r);

				}

467

api/storage_service.cc

File diff suppressed because it is too large.

									
										56

api/storage_service.hh
									
												
				@@ -8,6 +8,8 @@

				#pragma once

				#include <iostream>

				#include <seastar/core/sharded.hh>

				#include "api.hh"

				#include "db/data_listeners.hh"

				@@ -34,28 +36,52 @@ class gossiper;

				namespace api {

				// verify that the keyspace is found, otherwise a bad_param_exception exception is thrown

				// containing the description of the respective keyspace error.

				sstring validate_keyspace(http_context& ctx, sstring ks_name);

				// verify that the keyspace parameter is found, otherwise a bad_param_exception exception is thrown

				// containing the description of the respective keyspace error.

				sstring validate_keyspace(http_context& ctx, const parameters& param);

				sstring validate_keyspace(http_context& ctx, const httpd::parameters& param);

				// splits a request parameter assumed to hold a comma-separated list of table names

				// verify that the tables are found, otherwise a bad_param_exception exception is thrown

				// containing the description of the respective no_such_column_family error.

				// Returns an empty vector if no parameter was found.

				// If the parameter is found and empty, returns a list of all table names in the keyspace.

				std::vector<sstring> parse_tables(const sstring& ks_name, http_context& ctx, const std::unordered_map<sstring, sstring>& query_params, sstring param_name);

				void set_storage_service(http_context& ctx, routes& r, sharded<service::storage_service>& ss, gms::gossiper& g, sharded<cdc::generation_service>& cdc_gs, sharded<db::system_keyspace>& sys_ls);

				void set_sstables_loader(http_context& ctx, routes& r, sharded<sstables_loader>& sst_loader);

				void unset_sstables_loader(http_context& ctx, routes& r);

				void set_view_builder(http_context& ctx, routes& r, sharded<db::view::view_builder>& vb);

				void unset_view_builder(http_context& ctx, routes& r);

				void set_repair(http_context& ctx, routes& r, sharded<repair_service>& repair);

				void unset_repair(http_context& ctx, routes& r);

				void set_transport_controller(http_context& ctx, routes& r, cql_transport::controller& ctl);

				void unset_transport_controller(http_context& ctx, routes& r);

				void set_rpc_controller(http_context& ctx, routes& r, thrift_controller& ctl);

				void unset_rpc_controller(http_context& ctx, routes& r);

				void set_snapshot(http_context& ctx, routes& r, sharded<db::snapshot_ctl>& snap_ctl);

				void unset_snapshot(http_context& ctx, routes& r);

				struct table_info {

				    sstring name;

				    table_id id;

				};

				// splits a request parameter assumed to hold a comma-separated list of table names

				// verify that the tables are found, otherwise a bad_param_exception exception is thrown

				// containing the description of the respective no_such_column_family error.

				// Returns a vector of all table infos given by the parameter, or

				// if the parameter is not found or is empty, returns a list of all table infos in the keyspace.

				std::vector<table_info> parse_table_infos(const sstring& ks_name, http_context& ctx, const std::unordered_map<sstring, sstring>& query_params, sstring param_name);

				void set_storage_service(http_context& ctx, httpd::routes& r, sharded<service::storage_service>& ss, gms::gossiper& g, sharded<cdc::generation_service>& cdc_gs, sharded<db::system_keyspace>& sys_ls);

				void set_sstables_loader(http_context& ctx, httpd::routes& r, sharded<sstables_loader>& sst_loader);

				void unset_sstables_loader(http_context& ctx, httpd::routes& r);

				void set_view_builder(http_context& ctx, httpd::routes& r, sharded<db::view::view_builder>& vb);

				void unset_view_builder(http_context& ctx, httpd::routes& r);

				void set_repair(http_context& ctx, httpd::routes& r, sharded<repair_service>& repair);

				void unset_repair(http_context& ctx, httpd::routes& r);

				void set_transport_controller(http_context& ctx, httpd::routes& r, cql_transport::controller& ctl);

				void unset_transport_controller(http_context& ctx, httpd::routes& r);

				void set_rpc_controller(http_context& ctx, httpd::routes& r, thrift_controller& ctl);

				void unset_rpc_controller(http_context& ctx, httpd::routes& r);

				void set_snapshot(http_context& ctx, httpd::routes& r, sharded<db::snapshot_ctl>& snap_ctl);

				void unset_snapshot(http_context& ctx, httpd::routes& r);

				seastar::future<json::json_return_type> run_toppartitions_query(db::toppartitions_query& q, http_context &ctx, bool legacy_request = false);

				}

				} // namespace api

				namespace std {

				std::ostream& operator<<(std::ostream& os, const api::table_info& ti);

				} // namespace std
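The comments on `parse_tables` above describe three cases: the parameter is absent (empty result), present but empty (all tables in the keyspace), or a comma-separated list whose names are each validated. The following is a minimal standalone sketch of that contract, not the code from api/storage_service.cc (whose diff is suppressed on this page). It assumes a hypothetical `parse_tables_sketch` helper, a plain `std::vector<std::string>` standing in for the keyspace's table list, and `std::invalid_argument` standing in for `bad_param_exception`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for api::parse_tables; names and types are illustrative only.
std::vector<std::string> parse_tables_sketch(
        const std::vector<std::string>& known_tables,
        const std::unordered_map<std::string, std::string>& query_params,
        const std::string& param_name) {
    auto it = query_params.find(param_name);
    if (it == query_params.end()) {
        return {};               // parameter not found: empty vector
    }
    const std::string& value = it->second;
    if (value.empty()) {
        return known_tables;     // parameter found but empty: all tables
    }
    std::vector<std::string> names;
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = value.find(',', start);
        const std::size_t len = (comma == std::string::npos) ? std::string::npos : comma - start;
        std::string name = value.substr(start, len);
        // Stand-in for the no_such_column_family check described in the comments.
        if (std::find(known_tables.begin(), known_tables.end(), name) == known_tables.end()) {
            throw std::invalid_argument("no such table: " + name);
        }
        names.push_back(std::move(name));
        if (comma == std::string::npos) {
            break;
        }
        start = comma + 1;
    }
    return names;
}
```

Per the comments in the diff, `parse_table_infos` differs in one case: a missing parameter also yields all tables rather than an empty vector.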

									
										7

api/stream_manager.cc
									
												
				@@ -14,6 +14,7 @@

				#include "gms/gossiper.hh"

				namespace api {

				using namespace seastar::httpd;

				namespace hs = httpd::stream_manager_json;

				@@ -21,7 +22,7 @@ static void set_summaries(const std::vector<streaming::stream_summary>& from,

				        json::json_list<hs::stream_summary>& to) {

				    if (!from.empty()) {

				        hs::stream_summary res;

				        res.cf_id = boost::lexical_cast<std::string>(from.front().cf_id);

				        res.cf_id = fmt::to_string(from.front().cf_id);

				        // For each stream_session, we pretend we are sending/receiving one

				        // file, to make it compatible with nodetool.

				        res.files = 1;

				@@ -38,7 +39,7 @@ static hs::progress_info get_progress_info(const streaming::progress_info& info)

				    res.current_bytes = info.current_bytes;

				    res.direction = info.dir;

				    res.file_name = info.file_name;

				    res.peer = boost::lexical_cast<std::string>(info.peer);

				    res.peer = fmt::to_string(info.peer);

				    res.session_index = 0;

				    res.total_bytes = info.total_bytes;

				    return res;

				@@ -61,7 +62,7 @@ static hs::stream_state get_state(

				    state.plan_id = result_future.plan_id.to_sstring();

				    for (auto info : result_future.get_coordinator().get()->get_all_session_info()) {

				        hs::stream_info si;

				        si.peer = boost::lexical_cast<std::string>(info.peer);

				        si.peer = fmt::to_string(info.peer);

				        si.session_index = 0;

				        si.state = info.state;

				        si.connecting = si.peer;

									
										4

api/stream_manager.hh
									
												
				@@ -12,7 +12,7 @@

				namespace api {

				void set_stream_manager(http_context& ctx, routes& r, sharded<streaming::stream_manager>& sm);

				void unset_stream_manager(http_context& ctx, routes& r);

				void set_stream_manager(http_context& ctx, httpd::routes& r, sharded<streaming::stream_manager>& sm);

				void unset_stream_manager(http_context& ctx, httpd::routes& r);

				}

									
										1

api/system.cc
									
												
				@@ -17,6 +17,7 @@

				extern logging::logger apilog;

				namespace api {

				using namespace seastar::httpd;

				namespace hs = httpd::system_json;

									
										2

api/system.hh
									
												
				@@ -12,6 +12,6 @@

				namespace api {

				void set_system(http_context& ctx, routes& r);

				void set_system(http_context& ctx, httpd::routes& r);

				}

									
										103

api/task_manager.cc
									
												
				@@ -22,6 +22,7 @@ namespace api {

				namespace tm = httpd::task_manager_json;

				using namespace json;

				using namespace seastar::httpd;

				inline bool filter_tasks(tasks::task_manager::task_ptr task, std::unordered_map<sstring, sstring>& query_params) {

				    return (!query_params.contains("keyspace") || query_params["keyspace"] == task->get_status().keyspace) &&

				@@ -30,17 +31,32 @@ inline bool filter_tasks(tasks::task_manager::task_ptr task, std::unordered_map<

				struct full_task_status {

				    tasks::task_manager::task::status task_status;

				    std::string type;

				    tasks::task_manager::task::progress progress;

				    std::string module;

				    tasks::task_id parent_id;

				    tasks::is_abortable abortable;

				    std::vector<std::string> children_ids;

				};

				struct task_stats {

				    task_stats(tasks::task_manager::task_ptr task) : task_id(task->id().to_sstring()), state(task->get_status().state) {}

				    task_stats(tasks::task_manager::task_ptr task)

				        : task_id(task->id().to_sstring())

				        , state(task->get_status().state)

				        , type(task->type())

				        , keyspace(task->get_status().keyspace)

				        , table(task->get_status().table)

				        , entity(task->get_status().entity)

				        , sequence_number(task->get_status().sequence_number)

				    { }

				    sstring task_id;

				    tasks::task_manager::task_state state;

				    std::string type;

				    std::string keyspace;

				    std::string table;

				    std::string entity;

				    uint64_t sequence_number;

				};

				tm::task_status make_status(full_task_status status) {

				@@ -52,7 +68,7 @@ tm::task_status make_status(full_task_status status) {

				    tm::task_status res{};

				    res.id = status.task_status.id.to_sstring();

				    res.type = status.task_status.type;

				    res.type = status.type;

				    res.state = status.task_status.state;

				    res.is_abortable = bool(status.abortable);

				    res.start_time = st;

				@@ -67,37 +83,45 @@ tm::task_status make_status(full_task_status status) {

				    res.progress_units = status.task_status.progress_units;

				    res.progress_total = status.progress.total;

				    res.progress_completed = status.progress.completed;

				    res.children_ids = std::move(status.children_ids);

				    return res;

				}

				future<json::json_return_type> retrieve_status(tasks::task_manager::foreign_task_ptr task) {

				future<full_task_status> retrieve_status(const tasks::task_manager::foreign_task_ptr& task) {

				    if (task.get() == nullptr) {

				        co_return coroutine::return_exception(httpd::bad_param_exception("Task not found"));

				    }

				    auto progress = co_await task->get_progress();

				    full_task_status s;

				    s.task_status = task->get_status();

				    s.type = task->type();

				    s.parent_id = task->get_parent_id();

				    s.abortable = task->is_abortable();

				    s.module = task->get_module_name();

				    s.progress.completed = progress.completed;

				    s.progress.total = progress.total;

				    co_return make_status(s);

				    std::vector<std::string> ct{task->get_children().size()};

				    boost::transform(task->get_children(), ct.begin(), [] (const auto& child) {

				        return child->id().to_sstring();

				    });

				    s.children_ids = std::move(ct);

				    co_return s;

				}

				void set_task_manager(http_context& ctx, routes& r) {

				    tm::get_modules.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				void set_task_manager(http_context& ctx, routes& r, db::config& cfg) {

				    tm::get_modules.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        std::vector<std::string> v = boost::copy_range<std::vector<std::string>>(ctx.tm.local().get_modules() | boost::adaptors::map_keys);

				        co_return v;

				    });

				    tm::get_tasks.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tm::get_tasks.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        using chunked_stats = utils::chunked_vector<task_stats>;

				        std::vector<chunked_stats> res = co_await ctx.tm.map([&req] (tasks::task_manager& tm) {

				        auto internal = tasks::is_internal{req_param<bool>(*req, "internal", false)};

				        std::vector<chunked_stats> res = co_await ctx.tm.map([&req, internal] (tasks::task_manager& tm) {

				            chunked_stats local_res;

				            auto module = tm.find_module(req->param["module"]);

				            const auto& filtered_tasks = module->get_tasks() | boost::adaptors::filtered([&params = req->query_parameters] (const auto& task) {

				                return filter_tasks(task.second, params);

				            const auto& filtered_tasks = module->get_tasks() | boost::adaptors::filtered([&params = req->query_parameters, internal] (const auto& task) {

				                return (internal || !task.second->is_internal()) && filter_tasks(task.second, params);

				            });

				            for (auto& [task_id, task] : filtered_tasks) {

				                local_res.push_back(task_stats{task});

				@@ -124,7 +148,7 @@ void set_task_manager(http_context& ctx, routes& r) {

				        co_return std::move(f);

				    });

				    tm::get_task_status.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tm::get_task_status.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto id = tasks::task_id{utils::UUID{req->param["task_id"]}};

				        auto task = co_await tasks::task_manager::invoke_on_task(ctx.tm, id, std::function([] (tasks::task_manager::task_ptr task) -> future<tasks::task_manager::foreign_task_ptr> {

				            auto state = task->get_status().state;

				@@ -133,10 +157,11 @@ void set_task_manager(http_context& ctx, routes& r) {

				            }

				            co_return std::move(task);

				        }));

				        co_return co_await retrieve_status(std::move(task));

				        auto s = co_await retrieve_status(task);

				        co_return make_status(s);

				    });

				    tm::abort_task.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tm::abort_task.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto id = tasks::task_id{utils::UUID{req->param["task_id"]}};

				        co_await tasks::task_manager::invoke_on_task(ctx.tm, id, [] (tasks::task_manager::task_ptr task) -> future<> {

				            if (!task->is_abortable()) {

				@@ -147,7 +172,7 @@ void set_task_manager(http_context& ctx, routes& r) {

				        co_return json_void();

				    });

				    tm::wait_task.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tm::wait_task.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto id = tasks::task_id{utils::UUID{req->param["task_id"]}};

				        auto task = co_await tasks::task_manager::invoke_on_task(ctx.tm, id, std::function([] (tasks::task_manager::task_ptr task) {

				            return task->done().then_wrapped([task] (auto f) {

				@@ -156,7 +181,55 @@ void set_task_manager(http_context& ctx, routes& r) {

				                return make_foreign(task);

				            });

				        }));

				        co_return co_await retrieve_status(std::move(task));

				        auto s = co_await retrieve_status(task);

				        co_return make_status(s);

				    });

				    tm::get_task_status_recursively.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto& _ctx = ctx;

				        auto id = tasks::task_id{utils::UUID{req->param["task_id"]}};

				        std::queue<tasks::task_manager::foreign_task_ptr> q;

				        utils::chunked_vector<full_task_status> res;

				        // Get requested task.

				        auto task = co_await tasks::task_manager::invoke_on_task(_ctx.tm, id, std::function([] (tasks::task_manager::task_ptr task) -> future<tasks::task_manager::foreign_task_ptr> {

				            auto state = task->get_status().state;

				            if (state == tasks::task_manager::task_state::done || state == tasks::task_manager::task_state::failed) {

				                task->unregister_task();

				            }

				            co_return task;

				        }));

				        // Push children's statuses in BFS order.

				        q.push(co_await task.copy());   // Task cannot be moved since we need it to be alive during whole loop execution.

				        while (!q.empty()) {

				            auto& current = q.front();

				            res.push_back(co_await retrieve_status(current));

				            for (size_t i = 0; i < current->get_children().size(); ++i) {

				                q.push(co_await current->get_children()[i].copy());

				            }

				            q.pop();

				        }

				        std::function<future<>(output_stream<char>&&)> f = [r = std::move(res)] (output_stream<char>&& os) -> future<> {

				            auto s = std::move(os);

				            auto res = std::move(r);

				            co_await s.write("[");

				            std::string delim = "";

				            for (auto& status: res) {

				                co_await s.write(std::exchange(delim, ", "));

				                co_await formatter::write(s, make_status(status));

				            }

				            co_await s.write("]");

				            co_await s.close();

				        };

				        co_return f;

				    });

				    tm::get_and_update_ttl.set(r, [&cfg] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        uint32_t ttl = cfg.task_ttl_seconds();

				        co_await cfg.task_ttl_seconds.set_value_on_all_shards(req->query_parameters["ttl"], utils::config_file::config_source::API);

				        co_return json::json_return_type(ttl);

				    });

				}
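The `get_task_status_recursively` handler above visits a task and its descendants in breadth-first order by draining a queue. The following is a minimal standalone sketch of that traversal order only. It assumes a hypothetical `task_node` struct in place of `tasks::task_manager::foreign_task_ptr`, and it leaves out the coroutines, cross-shard copies, and status retrieval of the real handler.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a task with children; not a ScyllaDB type.
struct task_node {
    std::string id;
    std::vector<std::shared_ptr<task_node>> children;
};

// Collects the ids of `root` and all of its descendants in BFS order.
std::vector<std::string> collect_ids_bfs(const std::shared_ptr<task_node>& root) {
    std::vector<std::string> res;
    if (!root) {
        return res;
    }
    std::queue<std::shared_ptr<task_node>> q;
    q.push(root);
    while (!q.empty()) {
        // Copy the pointer so the node stays alive after pop().
        std::shared_ptr<task_node> current = q.front();
        q.pop();
        res.push_back(current->id);
        for (const auto& child : current->children) {
            q.push(child);
        }
    }
    return res;
}
```

The handler in the diff instead keeps a reference to `q.front()` and pops only after the children have been pushed, which keeps the current task alive for the whole loop iteration without an extra copy.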

									
										3

api/task_manager.hh
									
												
				@@ -9,9 +9,10 @@

				#pragma once

				#include "api.hh"

				#include "db/config.hh"

				namespace api {

				void set_task_manager(http_context& ctx, routes& r);

				void set_task_manager(http_context& ctx, httpd::routes& r, db::config& cfg);

				}

									
										25

api/task_manager_test.cc
									
												
				@@ -18,9 +18,10 @@ namespace api {

				namespace tmt = httpd::task_manager_test_json;

				using namespace json;

				using namespace seastar::httpd;

				void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				    tmt::register_test_module.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				void set_task_manager_test(http_context& ctx, routes& r) {

				    tmt::register_test_module.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        co_await ctx.tm.invoke_on_all([] (tasks::task_manager& tm) {

				            auto m = make_shared<tasks::test_module>(tm);

				            tm.register_module("test", m);

				@@ -28,7 +29,7 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        co_return json_void();

				    });

				    tmt::unregister_test_module.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tmt::unregister_test_module.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        co_await ctx.tm.invoke_on_all([] (tasks::task_manager& tm) -> future<> {

				            auto module_name = "test";

				            auto module = tm.find_module(module_name);

				@@ -37,7 +38,7 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        co_return json_void();

				    });

				    tmt::register_test_task.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tmt::register_test_task.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        sharded<tasks::task_manager>& tms = ctx.tm;

				        auto it = req->query_parameters.find("task_id");

				        auto id = it != req->query_parameters.end() ? tasks::task_id{utils::UUID{it->second}} : tasks::task_id::create_null_id();

				@@ -47,12 +48,10 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        std::string keyspace = it != req->query_parameters.end() ? it->second : "";

				        it = req->query_parameters.find("table");

				        std::string table = it != req->query_parameters.end() ? it->second : "";

				        it = req->query_parameters.find("type");

				        std::string type = it != req->query_parameters.end() ? it->second : "";

				        it = req->query_parameters.find("entity");

				        std::string entity = it != req->query_parameters.end() ? it->second : "";

				        it = req->query_parameters.find("parent_id");

				        tasks::task_manager::parent_data data;

				        tasks::task_info data;

				        if (it != req->query_parameters.end()) {

				            data.id = tasks::task_id{utils::UUID{it->second}};

				            auto parent_ptr = co_await tasks::task_manager::lookup_task_on_all_shards(ctx.tm, data.id);

				@@ -60,7 +59,7 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        }

				        auto module = tms.local().find_module("test");

				        id = co_await module->make_task<tasks::test_task_impl>(shard, id, keyspace, table, type, entity, data);

				        id = co_await module->make_task<tasks::test_task_impl>(shard, id, keyspace, table, entity, data);

				        co_await tms.invoke_on(shard, [id] (tasks::task_manager& tm) {

				            auto it = tm.get_all_tasks().find(id);

				            if (it != tm.get_all_tasks().end()) {

				@@ -70,7 +69,7 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        co_return id.to_sstring();

				    });

				    tmt::unregister_test_task.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tmt::unregister_test_task.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto id = tasks::task_id{utils::UUID{req->query_parameters["task_id"]}};

				        co_await tasks::task_manager::invoke_on_task(ctx.tm, id, [] (tasks::task_manager::task_ptr task) -> future<> {

				            tasks::test_task test_task{task};

				@@ -79,7 +78,7 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        co_return json_void();

				    });

				    tmt::finish_test_task.set(r, [&ctx] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				    tmt::finish_test_task.set(r, [&ctx] (std::unique_ptr<http::request> req) -> future<json::json_return_type> {

				        auto id = tasks::task_id{utils::UUID{req->param["task_id"]}};

				        auto it = req->query_parameters.find("error");

				        bool fail = it != req->query_parameters.end();

				@@ -96,12 +95,6 @@ void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg) {

				        });

				        co_return json_void();

				    });

				    tmt::get_and_update_ttl.set(r, [&ctx, &cfg] (std::unique_ptr<request> req) -> future<json::json_return_type> {

				        uint32_t ttl = cfg.task_ttl_seconds();

				        cfg.task_ttl_seconds.set(boost::lexical_cast<uint32_t>(req->query_parameters["ttl"]));

				        co_return json::json_return_type(ttl);

				    });

				}

				}

									
										3

api/task_manager_test.hh
									
												
				@@ -11,11 +11,10 @@

				#pragma once

				#include "api.hh"

				#include "db/config.hh"

				namespace api {

				void set_task_manager_test(http_context& ctx, routes& r, db::config& cfg);

				void set_task_manager_test(http_context& ctx, httpd::routes& r);

				}

									
										35

auth/CMakeLists.txt
									
										Normal file
									
												
				@@ -0,0 +1,35 @@

				include(add_whole_archive)

				add_library(scylla_auth STATIC)

				target_sources(scylla_auth

				  PRIVATE

				    allow_all_authenticator.cc

				    allow_all_authorizer.cc

				    authenticated_user.cc

				    authenticator.cc

				    common.cc

				    default_authorizer.cc

				    password_authenticator.cc

				    passwords.cc

				    permission.cc

				    permissions_cache.cc

				    resource.cc

				    role_or_anonymous.cc

				    roles-metadata.cc

				    sasl_challenge.cc

				    service.cc

				    standard_role_manager.cc

				    transitional.cc)

				target_include_directories(scylla_auth

				  PUBLIC

				    ${CMAKE_SOURCE_DIR})

				target_link_libraries(scylla_auth

				  PUBLIC

				    Seastar::seastar

				    xxHash::xxhash

				  PRIVATE

				    cql3

				    idl

				    wasmtime_bindings)

				add_whole_archive(auth scylla_auth)

									
										12

auth/authenticated_user.cc
									
												
				@@ -10,24 +10,12 @@

				#include "auth/authenticated_user.hh"

				#include <iostream>

				namespace auth {

				authenticated_user::authenticated_user(std::string_view name)

				        : name(sstring(name)) {

				}

				std::ostream& operator<<(std::ostream& os, const authenticated_user& u) {

				    if (!u.name) {

				        os << "anonymous";

				    } else {

				        os << *u.name;

				    }

				    return os;

				}

				static const authenticated_user the_anonymous_user{};

				const authenticated_user& anonymous_user() noexcept {

									
										21

auth/authenticated_user.hh
									
												
				@@ -12,7 +12,6 @@

				#include <string_view>

				#include <functional>

				#include <iosfwd>

				#include <optional>

				#include <seastar/core/sstring.hh>

				@@ -38,11 +37,6 @@ public:

				    explicit authenticated_user(std::string_view name);

				};

				///

				/// The user name, or "anonymous".

				///

				std::ostream& operator<<(std::ostream&, const authenticated_user&);

				inline bool operator==(const authenticated_user& u1, const authenticated_user& u2) noexcept {

				    return u1.name == u2.name;

				}

				@@ -59,6 +53,21 @@ inline bool is_anonymous(const authenticated_user& u) noexcept {

				}

				///

				/// The user name, or "anonymous".

				///

				template <>

				struct fmt::formatter<auth::authenticated_user> : fmt::formatter<std::string_view> {

				    template <typename FormatContext>

				    auto format(const auth::authenticated_user& u, FormatContext& ctx) const {

				        if (u.name) {

				            return fmt::format_to(ctx.out(), "{}", *u.name);

				        } else {

				            return fmt::format_to(ctx.out(), "{}", "anonymous");

				        }

				    }

				};

				namespace std {

				template <>

									
										24

auth/authentication_options.cc
									
											
				@@ -1,24 +0,0 @@

				/*

				 * Copyright (C) 2018-present ScyllaDB

				 */

				/*

				 * SPDX-License-Identifier: AGPL-3.0-or-later

				 */

				#include "auth/authentication_options.hh"

				#include <iostream>

				namespace auth {

				std::ostream& operator<<(std::ostream& os, authentication_option a) {

				    switch (a) {

				        case authentication_option::password: os << "PASSWORD"; break;

				        case authentication_option::options: os << "OPTIONS"; break;

				    }

				    return os;

				}

				}

									
										17

auth/authentication_options.hh
									
												
				@@ -26,8 +26,6 @@ enum class authentication_option {

				    options

				};

				std::ostream& operator<<(std::ostream&, authentication_option);

				using authentication_option_set = std::unordered_set<authentication_option>;

				using custom_options = std::unordered_map<sstring, sstring>;

				@@ -49,3 +47,18 @@ public:

				};

				}

				template <>

				struct fmt::formatter<auth::authentication_option> : fmt::formatter<std::string_view> {

				    template <typename FormatContext>

				    auto format(const auth::authentication_option a, FormatContext& ctx) const {

				        using enum auth::authentication_option;

				        switch (a) {

				        case password:

				            return formatter<std::string_view>::format("PASSWORD", ctx);

				        case options:

				            return formatter<std::string_view>::format("OPTIONS", ctx);

				        }

				        std::abort();

				    }

				};

									
										2

auth/common.cc
									
												
				@@ -14,7 +14,7 @@

				#include "cql3/query_processor.hh"

				#include "cql3/statements/create_table_statement.hh"

				#include "replica/database.hh"

				#include "schema_builder.hh"

				#include "schema/schema_builder.hh"

				#include "service/migration_manager.hh"

				#include "timeout_config.hh"

									
										2

auth/common.hh
									
												
				@@ -30,8 +30,6 @@ namespace replica {

				class database;

				}

				class timeout_config;

				namespace service {

				class migration_manager;

				}

									
										2

auth/default_authorizer.cc
									
												
				@@ -74,7 +74,7 @@ future<bool> default_authorizer::any_granted() const {

				            query,

				            db::consistency_level::LOCAL_ONE,

				            {},

				            cql3::query_processor::cache_internal::yes).then([this](::shared_ptr<cql3::untyped_result_set> results) {

				            cql3::query_processor::cache_internal::yes).then([](::shared_ptr<cql3::untyped_result_set> results) {

				        return !results->empty();

				    });

				}

									
										2

auth/passwords.cc
									
												
				@@ -18,7 +18,7 @@ extern "C" {

				namespace auth::passwords {

				static thread_local crypt_data tlcrypt = { 0, };

				static thread_local crypt_data tlcrypt = {};

				namespace detail {

									
										6

auth/permission.cc
									
												
				@@ -21,7 +21,8 @@ const auth::permission_set auth::permissions::ALL = auth::permission_set::of<

				        auth::permission::SELECT,

				        auth::permission::MODIFY,

				        auth::permission::AUTHORIZE,

				        auth::permission::DESCRIBE>();

				        auth::permission::DESCRIBE,

				        auth::permission::EXECUTE>();

				const auth::permission_set auth::permissions::NONE;

				@@ -34,7 +35,8 @@ static const std::unordered_map<sstring, auth::permission> permission_names({

				        {"SELECT", auth::permission::SELECT},

				        {"MODIFY", auth::permission::MODIFY},

				        {"AUTHORIZE", auth::permission::AUTHORIZE},

				        {"DESCRIBE", auth::permission::DESCRIBE}});

				        {"DESCRIBE", auth::permission::DESCRIBE},

				        {"EXECUTE", auth::permission::EXECUTE}});

				const sstring& auth::permissions::to_string(permission p) {

				    for (auto& v : permission_names) {

									
										5

auth/permission.hh
									
												
				@@ -38,6 +38,8 @@ enum class permission {

				    AUTHORIZE, // required for GRANT and REVOKE.

				    DESCRIBE, // required on the root-level role resource to list all roles.

				    // function/aggregate/procedure calls

				    EXECUTE,

				};

				typedef enum_set<

				@@ -51,7 +53,8 @@ typedef enum_set<

				                permission::SELECT,

				                permission::MODIFY,

				                permission::AUTHORIZE,

				                permission::DESCRIBE>> permission_set;

				                permission::DESCRIBE,

				                permission::EXECUTE>> permission_set;

				bool operator<(const permission_set&, const permission_set&);

									
										159

auth/resource.cc
									
												
				@@ -16,30 +16,26 @@

				#include <boost/algorithm/string/join.hpp>

				#include <boost/algorithm/string/split.hpp>

				#include <boost/algorithm/string/classification.hpp>

				#include "service/storage_proxy.hh"

				#include "data_dictionary/user_types_metadata.hh"

				#include "cql3/util.hh"

				#include "db/marshal/type_parser.hh"

				namespace auth {

				std::ostream& operator<<(std::ostream& os, resource_kind kind) {

				    switch (kind) {

				        case resource_kind::data: os << "data"; break;

				        case resource_kind::role: os << "role"; break;

				        case resource_kind::service_level: os << "service_level"; break;

				    }

				    return os;

				}

				static const std::unordered_map<resource_kind, std::string_view> roots{

				        {resource_kind::data, "data"},

				        {resource_kind::role, "roles"},

				        {resource_kind::service_level, "service_levels"}};

				        {resource_kind::service_level, "service_levels"},

				        {resource_kind::functions, "functions"}};

				static const std::unordered_map<resource_kind, std::size_t> max_parts{

				        {resource_kind::data, 2},

				        {resource_kind::role, 1},

				        {resource_kind::service_level, 0}};

				        {resource_kind::service_level, 0},

				        {resource_kind::functions, 2}};

				static permission_set applicable_permissions(const data_resource_view& dv) {

				    if (dv.table()) {

				@@ -82,6 +78,15 @@ static permission_set applicable_permissions(const service_level_resource_view &

				            permission::AUTHORIZE>();

				}

				static permission_set applicable_permissions(const functions_resource_view& fv) {

				    return permission_set::of<

				            permission::CREATE,

				            permission::ALTER,

				            permission::DROP,

				            permission::AUTHORIZE,

				            permission::EXECUTE>();

				}

				resource::resource(resource_kind kind) : _kind(kind) {

				    _parts.emplace_back(roots.at(kind));

				}

				@@ -106,6 +111,31 @@ resource::resource(role_resource_t, std::string_view role) : resource(resource_k

				resource::resource(service_level_resource_t): resource(resource_kind::service_level) {

				}

				resource::resource(functions_resource_t) : resource(resource_kind::functions) {

				}

				resource::resource(functions_resource_t, std::string_view keyspace) : resource(resource_kind::functions) {

				    _parts.emplace_back(keyspace);

				}

				resource::resource(functions_resource_t, std::string_view keyspace, std::string_view function_signature) : resource(resource_kind::functions) {

				    _parts.emplace_back(keyspace);

				    _parts.emplace_back(function_signature);

				}

				resource::resource(functions_resource_t, std::string_view keyspace, std::string_view function_name, std::vector<::shared_ptr<cql3::cql3_type::raw>> function_args) : resource(resource_kind::functions) {

				    _parts.emplace_back(keyspace);

				    _parts.emplace_back(function_name);

				    if (function_args.empty()) {

				        _parts.emplace_back("");

				        return;

				    }

				    for (auto& arg_type : function_args) {

				        // We can't validate the UDTs here, so we just use the raw cql type names.

				        _parts.emplace_back(arg_type->to_string());

				    }

				}

				sstring resource::name() const {

				    return boost::algorithm::join(_parts, "/");

				}

				@@ -127,6 +157,7 @@ permission_set resource::applicable_permissions() const {

				        case resource_kind::data: ps = ::auth::applicable_permissions(data_resource_view(*this)); break;

				        case resource_kind::role: ps = ::auth::applicable_permissions(role_resource_view(*this)); break;

				        case resource_kind::service_level: ps = ::auth::applicable_permissions(service_level_resource_view(*this)); break;

				        case resource_kind::functions: ps = ::auth::applicable_permissions(functions_resource_view(*this)); break;

				    }

				    return ps;

				@@ -149,6 +180,7 @@ std::ostream& operator<<(std::ostream& os, const resource& r) {

				        case resource_kind::data: return os << data_resource_view(r);

				        case resource_kind::role: return os << role_resource_view(r);

				        case resource_kind::service_level: return os << service_level_resource_view(r);

				        case resource_kind::functions: return os << functions_resource_view(r);

				    }

				    return os;

				@@ -165,6 +197,109 @@ std::ostream &operator<<(std::ostream &os, const service_level_resource_view &v)

				    return os;

				}

				sstring encode_signature(std::string_view name, std::vector<data_type> args) {

				    return format("{}[{}]", name,

				            fmt::join(args | boost::adaptors::transformed([] (const data_type t) {

				                return t->name();

				            }), "^"));

				}

				std::pair<sstring, std::vector<data_type>> decode_signature(std::string_view encoded_signature) {

				    auto name_delim = encoded_signature.find_last_of('[');

				    std::string_view function_name = encoded_signature.substr(0, name_delim);

				    encoded_signature.remove_prefix(name_delim + 1);

				    encoded_signature.remove_suffix(1);

				    if (encoded_signature.empty()) {

				        return {sstring(function_name), {}};

				    }

				    std::vector<std::string_view> raw_types;

				    boost::split(raw_types, encoded_signature, boost::is_any_of("^"));

				    std::vector<data_type> decoded_types = boost::copy_range<std::vector<data_type>>(

				        raw_types | boost::adaptors::transformed([] (std::string_view raw_type) {

				            return db::marshal::type_parser::parse(raw_type);

				        })

				    );

				    return {sstring(function_name), decoded_types};

				}

				// Purely for Cassandra compatibility, types in the function signature are

				// decoded from their verbose form (org.apache.cassandra.db.marshal.Int32Type)

				// to the short form (int)

				static sstring decoded_signature_string(std::string_view encoded_signature) {

				    auto [function_name, arg_types] = decode_signature(encoded_signature);

				    return format("{}({})", cql3::util::maybe_quote(sstring(function_name)),

				            boost::algorithm::join(arg_types | boost::adaptors::transformed([] (data_type t) {

				                return t->cql3_type_name();

				            }), ", "));

				}

				std::ostream &operator<<(std::ostream &os, const functions_resource_view &v) {

				    const auto keyspace = v.keyspace();

				    const auto function_signature = v.function_signature();

				    const auto name = v.function_name();

				    const auto args = v.function_args();

				    if (!keyspace) {

				        os << "<all functions>";

				    } else if (name) {

				        os << "<function " << *keyspace << '.' << cql3::util::maybe_quote(sstring(*name)) << '(';

				        for (auto arg : *args) {

				            os << arg << ',';

				        }

				        os << ")>";

				    } else if (!function_signature) {

				        os << "<all functions in " << *keyspace << '>';

				    } else {

				        os << "<function " << *keyspace << '.' << decoded_signature_string(*function_signature) << '>';

				    }

				    return os;

				}

				functions_resource_view::functions_resource_view(const resource& r) : _resource(r) {

				    if (r._kind != resource_kind::functions) {

				        throw resource_kind_mismatch(resource_kind::functions, r._kind);

				    }

				}

				std::optional<std::string_view> functions_resource_view::keyspace() const {

				    if (_resource._parts.size() == 1) {

				        return {};

				    }

				    return _resource._parts[1];

				}

				std::optional<std::string_view> functions_resource_view::function_signature() const {

				    if (_resource._parts.size() <= 2 || _resource._parts.size() > 3) {

				        return {};

				    }

				    return _resource._parts[2];

				}

				std::optional<std::string_view> functions_resource_view::function_name() const {

				    if (_resource._parts.size() <= 3) {

				        return {};

				    }

				    return _resource._parts[2];

				}

				std::optional<std::vector<std::string_view>> functions_resource_view::function_args() const {

				    if (_resource._parts.size() <= 3) {

				        return {};

				    }

				    std::vector<std::string_view> parts;

				    if (_resource._parts[3] == "") {

				        return {};

				    }

				    for (size_t i = 3; i < _resource._parts.size(); i++) {

				        parts.push_back(_resource._parts[i]);

				    }

				    return parts;

				}

				data_resource_view::data_resource_view(const resource& r) : _resource(r) {

				    if (r._kind != resource_kind::data) {

				        throw resource_kind_mismatch(resource_kind::data, r._kind);
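Note on the signature encoding added in this file: `encode_signature` stores a function as `name[type^type...]`, and `decode_signature` splits on the last `[` and on `^`. The round trip can be sketched as follows. This is a minimal illustration, not the code from the diff: the `_sketch` function names are hypothetical, and argument types are plain strings here, whereas the diff uses `data_type` and `db::marshal::type_parser`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for encode_signature: joins type names with '^'
// and wraps them in square brackets after the function name.
std::string encode_signature_sketch(std::string_view name, const std::vector<std::string>& args) {
    std::string out(name);
    out += '[';
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (i != 0) {
            out += '^';
        }
        out += args[i];
    }
    out += ']';
    return out;
}

// Hypothetical stand-in for decode_signature: everything before the last '['
// is the name, the rest (minus the trailing ']') is a '^'-separated type list.
std::pair<std::string, std::vector<std::string>> decode_signature_sketch(std::string_view encoded) {
    const auto name_delim = encoded.find_last_of('[');
    assert(!encoded.empty() && encoded.back() == ']' && name_delim != std::string_view::npos);
    std::string name(encoded.substr(0, name_delim));
    encoded.remove_prefix(name_delim + 1);
    encoded.remove_suffix(1);
    std::vector<std::string> args;
    if (encoded.empty()) {
        return {std::move(name), std::move(args)}; // function without arguments
    }
    std::size_t start = 0;
    while (true) {
        const auto pos = encoded.find('^', start);
        if (pos == std::string_view::npos) {
            args.emplace_back(encoded.substr(start));
            break;
        }
        args.emplace_back(encoded.substr(start, pos - start));
        start = pos + 1;
    }
    return {std::move(name), std::move(args)};
}
```

In this sketch a function without arguments encodes to `name[]`, which matches the early return for an empty argument list in `decode_signature`.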

									
										84

auth/resource.hh
									
												
				@@ -18,6 +18,7 @@

				#include <vector>

				#include <unordered_set>

				#include <boost/range/adaptor/transformed.hpp>

				#include <seastar/core/print.hh>

				#include <seastar/core/sstring.hh>

				@@ -25,6 +26,7 @@

				#include "seastarx.hh"

				#include "utils/hash.hh"

				#include "utils/small_vector.hh"

				#include "cql3/cql3_type.hh"

				namespace auth {

				@@ -36,11 +38,9 @@ public:

				};

				enum class resource_kind {

				    data, role, service_level

				    data, role, service_level, functions

				};

				std::ostream& operator<<(std::ostream&, resource_kind);

				///

				/// Type tag for constructing data resources.

				///

				@@ -56,10 +56,15 @@ struct role_resource_t final {};

				///

				struct service_level_resource_t final {};

				///

				/// Type tag for constructing function resources.

				///

				struct functions_resource_t final {};

				///

				/// Resources are entities that users can be granted permissions on.

				///

				/// There are data (keyspaces and tables) and role resources. There may be other kinds of resources in the future.

				/// There are data (keyspaces and tables), role and function resources. There may be other kinds of resources in the future.

				///

				/// When they are stored as system metadata, resources have the form `root/part_0/part_1/.../part_n`. Each kind of

				/// resource has a specific root prefix, followed by a maximum of `n` parts (where `n` is distinct for each kind of

				@@ -83,6 +88,11 @@ public:

				    resource(data_resource_t, std::string_view keyspace, std::string_view table);

				    resource(role_resource_t, std::string_view role);

				    resource(service_level_resource_t);

				    explicit resource(functions_resource_t);

				    resource(functions_resource_t, std::string_view keyspace);

				    resource(functions_resource_t, std::string_view keyspace, std::string_view function_signature);

				    resource(functions_resource_t, std::string_view keyspace, std::string_view function_name,

				            std::vector<::shared_ptr<cql3::cql3_type::raw>> function_args);

				    resource_kind kind() const noexcept {

				        return _kind;

				@@ -104,6 +114,7 @@ private:

				    friend class data_resource_view;

				    friend class role_resource_view;

				    friend class service_level_resource_view;

				    friend class functions_resource_view;

				    friend bool operator<(const resource&, const resource&);

				    friend bool operator==(const resource&, const resource&);

				@@ -182,6 +193,25 @@ public:

				std::ostream& operator<<(std::ostream&, const service_level_resource_view&);

				///

				/// A "function" view of \ref resource.

				///

				class functions_resource_view final {

				    const resource& _resource;

				public:

				    ///

				    /// \throws \ref resource_kind_mismatch if the argument is not a "function" resource.

				    ///

				    explicit functions_resource_view(const resource&);

				    std::optional<std::string_view> keyspace() const;

				    std::optional<std::string_view> function_signature() const;

				    std::optional<std::string_view> function_name() const;

				    std::optional<std::vector<std::string_view>> function_args() const;

				};

				std::ostream& operator<<(std::ostream&, const functions_resource_view&);

				///

				/// Parse a resource from its name.

				///

				@@ -210,8 +240,49 @@ inline resource make_service_level_resource() {

				    return resource(service_level_resource_t{});

				}

				const resource& root_function_resource();

				inline resource make_functions_resource() {

				    return resource(functions_resource_t{});

				}

				inline resource make_functions_resource(std::string_view keyspace) {

				    return resource(functions_resource_t{}, keyspace);

				}

				inline resource make_functions_resource(std::string_view keyspace, std::string_view function_signature) {

				    return resource(functions_resource_t{}, keyspace, function_signature);

				}

				inline resource make_functions_resource(std::string_view keyspace, std::string_view function_name, std::vector<::shared_ptr<cql3::cql3_type::raw>> function_signature) {

				    return resource(functions_resource_t{}, keyspace, function_name, function_signature);

				}

				sstring encode_signature(std::string_view name, std::vector<data_type> args);

				std::pair<sstring, std::vector<data_type>> decode_signature(std::string_view encoded_signature);

				}

				template <>

				struct fmt::formatter<auth::resource_kind> : fmt::formatter<std::string_view> {

				    template <typename FormatContext>

				    auto format(const auth::resource_kind kind, FormatContext& ctx) const {

				        using enum auth::resource_kind;

				        switch (kind) {

				        case data:

				            return formatter<std::string_view>::format("data", ctx);

				        case role:

				            return formatter<std::string_view>::format("role", ctx);

				        case service_level:

				            return formatter<std::string_view>::format("service_level", ctx);

				        case functions:

				            return formatter<std::string_view>::format("functions", ctx);

				        }

				        std::abort();

				    }

				};

				namespace std {

				template <>

				@@ -228,6 +299,10 @@ struct hash<auth::resource> {

				            return utils::tuple_hash()(std::make_tuple(auth::resource_kind::service_level));

				    }

				    static size_t hash_function(const auth::functions_resource_view& fv) {

				        return utils::tuple_hash()(std::make_tuple(auth::resource_kind::functions, fv.keyspace(), fv.function_signature()));

				    }

				    size_t operator()(const auth::resource& r) const {

				        std::size_t value;

				@@ -235,6 +310,7 @@ struct hash<auth::resource> {

				        case auth::resource_kind::data: value = hash_data(auth::data_resource_view(r)); break;

				        case auth::resource_kind::role: value = hash_role(auth::role_resource_view(r)); break;

				        case auth::resource_kind::service_level: value = hash_service_level(auth::service_level_resource_view(r)); break;

				        case auth::resource_kind::functions: value = hash_function(auth::functions_resource_view(r)); break;

				        }

				        return value;

									
										20

auth/service.cc
									
												
				@@ -20,17 +20,19 @@

				#include "auth/allow_all_authorizer.hh"

				#include "auth/common.hh"

				#include "auth/role_or_anonymous.hh"

				#include "cql3/functions/functions.hh"

				#include "cql3/query_processor.hh"

				#include "cql3/untyped_result_set.hh"

				#include "db/config.hh"

				#include "db/consistency_level_type.hh"

				#include "db/functions/function_name.hh"

				#include "exceptions/exceptions.hh"

				#include "log.hh"

				#include "service/migration_manager.hh"

				#include "utils/class_registrator.hh"

				#include "locator/abstract_replication_strategy.hh"

				#include "data_dictionary/keyspace_metadata.hh"

				#include "mutation.hh"

				#include "mutation/mutation.hh"

				namespace auth {

				@@ -346,6 +348,22 @@ future<bool> service::exists(const resource& r) const {

				        }

				        case resource_kind::service_level:

				            return make_ready_future<bool>(true);

				        case resource_kind::functions: {

				            const auto& db = _qp.db();

				            functions_resource_view v(r);

				            const auto keyspace = v.keyspace();

				            if (!keyspace) {

				                return make_ready_future<bool>(true);

				            }

				            const auto function_signature = v.function_signature();

				            if (!function_signature) {

				                return make_ready_future<bool>(db.has_keyspace(sstring(*keyspace)));

				            }

				            auto [name, function_args] = auth::decode_signature(*function_signature);

				            return make_ready_future<bool>(cql3::functions::functions::find(db::functions::function_name{sstring(*keyspace), name}, function_args));

				        }

				    }

				    return make_ready_future<bool>(false);

									
										2

auth/standard_role_manager.cc
									
												
				@@ -470,7 +470,7 @@ standard_role_manager::grant(std::string_view grantee_name, std::string_view rol

				future<>

				standard_role_manager::revoke(std::string_view revokee_name, std::string_view role_name) {

				    return this->exists(role_name).then([this, revokee_name, role_name](bool role_exists) {

				    return this->exists(role_name).then([role_name](bool role_exists) {

				        if (!role_exists) {

				            throw nonexistant_role(sstring(role_name));

				        }

									
										2

build_mode.hh
									
												
				@@ -14,7 +14,7 @@

				#endif

				#ifndef STRINGIFY

				// We need to levels of indirection

				// We need two levels of indirection

				// to make a string out of the macro name.

				// The outer level expands the macro

				// and the inner level makes a string out of the expanded macro.
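Note on the comment above: the two levels of indirection matter because the `#` operator stringifies its argument without expanding it first. A minimal sketch of the difference, using hypothetical macro names (`EXAMPLE_STRINGIFY`, `EXAMPLE_STRINGIFY_INNER`, `EXAMPLE_BUILD_MODE`) that are not taken from `build_mode.hh`:

```cpp
#include <cassert>
#include <string_view>

// Inner level: turns its argument into a string literal as written.
#define EXAMPLE_STRINGIFY_INNER(x) #x
// Outer level: its argument is macro-expanded before being passed on.
#define EXAMPLE_STRINGIFY(x) EXAMPLE_STRINGIFY_INNER(x)

// Hypothetical macro standing in for a build-mode name.
#define EXAMPLE_BUILD_MODE release

void check_stringify_sketch() {
    // Two levels: the macro is expanded first, then stringified.
    assert(std::string_view(EXAMPLE_STRINGIFY(EXAMPLE_BUILD_MODE)) == "release");
    // One level: the macro name itself is stringified.
    assert(std::string_view(EXAMPLE_STRINGIFY_INNER(EXAMPLE_BUILD_MODE)) == "EXAMPLE_BUILD_MODE");
}
```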

									
										22

bytes.cc
									
												
				@@ -50,15 +50,7 @@ bytes from_hex(sstring_view s) {

				}

				sstring to_hex(bytes_view b) {

				    static char digits[] = "0123456789abcdef";

				    sstring out = uninitialized_string(b.size() * 2);

				    unsigned end = b.size();

				    for (unsigned i = 0; i != end; ++i) {

				        uint8_t x = b[i];

				        out[2*i] = digits[x >> 4];

				        out[2*i+1] = digits[x & 0xf];

				    }

				    return out;

				    return fmt::to_string(fmt_hex(b));

				}

				sstring to_hex(const bytes& b) {

				@@ -70,12 +62,14 @@ sstring to_hex(const bytes_opt& b) {

				}

				std::ostream& operator<<(std::ostream& os, const bytes& b) {

				    return os << to_hex(b);

				    fmt::print(os, "{}", b);

				    return os;

				}

				std::ostream& operator<<(std::ostream& os, const bytes_opt& b) {

				    if (b) {

				        return os << *b;

				        fmt::print(os, "{}", *b);

				        return os;

				    }

				    return os << "null";

				}

				@@ -83,11 +77,13 @@ std::ostream& operator<<(std::ostream& os, const bytes_opt& b) {

				namespace std {

				std::ostream& operator<<(std::ostream& os, const bytes_view& b) {

				    return os << to_hex(b);

				    fmt::print(os, "{}", fmt_hex(b));

				    return os;

				}

				}

				std::ostream& operator<<(std::ostream& os, const fmt_hex& b) {

				    return os << to_hex(b.v);

				    fmt::print(os, "{}", b);

				    return os;

				}

									
										90

bytes.hh
									
												
				@@ -9,8 +9,9 @@

				#pragma once

				#include "seastarx.hh"

				#include <fmt/format.h>

				#include <seastar/core/sstring.hh>

				#include "hashing.hh"

				#include "utils/hashing.hh"

				#include <optional>

				#include <iosfwd>

				#include <functional>

				@@ -37,8 +38,8 @@ inline bytes_view to_bytes_view(sstring_view view) {

				}

				struct fmt_hex {

				    bytes_view& v;

				    fmt_hex(bytes_view& v) noexcept : v(v) {}

				    const bytes_view& v;

				    fmt_hex(const bytes_view& v) noexcept : v(v) {}

				};

				std::ostream& operator<<(std::ostream& os, const fmt_hex& hex);

				@@ -51,6 +52,89 @@ sstring to_hex(const bytes_opt& b);

				std::ostream& operator<<(std::ostream& os, const bytes& b);

				std::ostream& operator<<(std::ostream& os, const bytes_opt& b);

				template <>

				struct fmt::formatter<fmt_hex> {

				    size_t _group_size_in_bytes = 0;

				    char _delimiter = ' ';

				public:

				    // format_spec := [group_size[delimiter]]

				    // group_size := a char from '0' to '8'

				    // delimiter := a char other than '{' or '}'

				    //

				    // by default, the given bytes are printed without a delimiter, just

				    // like a string. so a string view of {0x20, 0x01, 0x0d, 0xb8} is

				    // printed like:

				    // "20010db8".

				    //

				    // but the format specifier can be used to customize how the bytes

				    // are printed. for instance, to print a bytes_view like IPv6,

				    // the format specifier would be "{:2:}", where

				    // - "2": bytes are printed in groups of 2 bytes

				    // - ":": each group is delimited by ":"

				    // and the formatted output will look like:

				    // "2001:0db8:0000"

				    //

				    // or we can mimic the default format used by hexdump using

				    // "{:2 }", where

				    // - "2": bytes are printed in groups of 2 bytes

				    // - " ": each group is delimited by " "

				    // and the formatted output will look like:

				    // "2001 0db8 0000"

				    //

				    // or we can just print each byte and separate them by a dash using

				    // "{:1-}"

				    // and the formatted output will look like:

				    // "20-01-0d-b8-00-00"

				    constexpr auto parse(fmt::format_parse_context& ctx) {

				        // get the group size and the delimiter, if any

				        auto it = ctx.begin();

				        auto end = ctx.end();

				        if (it != end) {

				            int group_size = *it++ - '0';

				            if (group_size < 0 ||

				                static_cast<size_t>(group_size) > sizeof(uint64_t)) {

				                throw format_error("invalid group_size");

				            }

				            _group_size_in_bytes = group_size;

				            if (it != end) {

				                // optional delimiter

				                _delimiter = *it++;

				            }

				        }

				        if (it != end && *it != '}') {

				            throw format_error("invalid format");

				        }

				        return it;

				    }

				    template <typename FormatContext>

				    auto format(const ::fmt_hex& s, FormatContext& ctx) const {

				        auto out = ctx.out();

				        const auto& v = s.v;

				        if (_group_size_in_bytes > 0) {

				            for (size_t i = 0, size = v.size(); i < size; i++) {

				                if (i != 0 && i % _group_size_in_bytes == 0) {

				                    fmt::format_to(out, "{}{:02x}", _delimiter, std::byte(v[i]));

				                } else {

				                    fmt::format_to(out, "{:02x}", std::byte(v[i]));

				                }

				            }

				        } else {

				            for (auto b : v) {

				                fmt::format_to(out, "{:02x}", std::byte(b));

				            }

				        }

				        return out;

				    }

				};

				template <>

				struct fmt::formatter<bytes> : fmt::formatter<fmt_hex> {

				    template <typename FormatContext>

				    auto format(const ::bytes& s, FormatContext& ctx) const {

				        return fmt::formatter<::fmt_hex>::format(::fmt_hex(bytes_view(s)), ctx);

				    }

				};

				namespace std {

				// Must be in std:: namespace, or ADL fails
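Note on the `fmt::formatter<fmt_hex>` specialization in this file: the grouping rule it applies (emit the delimiter before every byte whose index is a non-zero multiple of the group size) can be sketched without the fmt library. This is an illustration under assumptions, not the code from the diff: `hex_grouped_sketch` is a hypothetical name, and it takes the group size and delimiter as plain parameters instead of parsing a format specifier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: prints bytes as lowercase hex. A group_size of 0 means
// no grouping; otherwise the delimiter separates groups of group_size bytes.
std::string hex_grouped_sketch(const std::vector<std::uint8_t>& bytes,
                               std::size_t group_size = 0,
                               char delimiter = ' ') {
    static constexpr char digits[] = "0123456789abcdef";
    // Mirrors the upper bound that the formatter's parse() enforces.
    assert(group_size <= sizeof(std::uint64_t));
    std::string out;
    out.reserve(bytes.size() * 3);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (group_size > 0 && i != 0 && i % group_size == 0) {
            out += delimiter;
        }
        out += digits[bytes[i] >> 4];
        out += digits[bytes[i] & 0xf];
    }
    return out;
}
```

For the six bytes `{0x20, 0x01, 0x0d, 0xb8, 0x00, 0x00}`, a group size of 2 with `':'` gives `2001:0db8:0000`, and a group size of 1 with `'-'` gives `20-01-0d-b8-00-00`.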

									
										6

bytes_ostream.hh
									
												
				@@ -12,7 +12,7 @@

				#include "bytes.hh"

				#include "utils/managed_bytes.hh"

				#include "hashing.hh"

				#include "utils/hashing.hh"

				#include <seastar/core/simple-stream.hh>

				#include <seastar/core/loop.hh>

				#include <bit>

				@@ -457,7 +457,9 @@ public:

				            _begin.ptr->size = _size;

				            _current = nullptr;

				            _size = 0;

				            return managed_bytes(std::exchange(_begin.ptr, {}));

				            auto begin_ptr = _begin.ptr;

				            _begin.ptr = nullptr;

				            return managed_bytes(begin_ptr);

				        } else {

				            return managed_bytes();

				        }

									
										485

cache_flat_mutation_reader.hh
									
												
				@@ -10,10 +10,10 @@

				#include <vector>

				#include "row_cache.hh"

				#include "mutation_fragment.hh"

				#include "mutation/mutation_fragment.hh"

				#include "query-request.hh"

				#include "partition_snapshot_row_cursor.hh"

				#include "range_tombstone_assembler.hh"

				#include "mutation/range_tombstone_assembler.hh"

				#include "read_context.hh"

				#include "readers/delegating_v2.hh"

				#include "clustering_key_filter.hh"

				@@ -41,7 +41,7 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				        move_to_underlying,

				        // Invariants:

				        // - Upper bound of the read is min(_next_row.position(), _upper_bound)

				        // - Upper bound of the read is *_underlying_upper_bound

				        // - _next_row_in_range = _next.position() < _upper_bound

				        // - _last_row points at a direct predecessor of the next row which is going to be read.

				        //   Used for populating continuity.

				@@ -51,46 +51,6 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				        end_of_stream

				    };

				    enum class source {

				        cache = 0,

				        underlying = 1,

				    };

				    // Merges range tombstone change streams coming from underlying and the cache.

				    // Ensures no range tombstone change fragment is emitted when there is no

				    // actual change in the effective tombstone.

				    class range_tombstone_change_merger {

				        const schema& _schema;

				        position_in_partition _pos;

				        tombstone _current_tombstone;

				        std::array<tombstone, 2> _tombstones;

				    private:

				        std::optional<range_tombstone_change> do_flush(position_in_partition pos, bool end_of_range) {

				            std::optional<range_tombstone_change> ret;

				            position_in_partition::tri_compare cmp(_schema);

				            const auto res = cmp(_pos, pos);

				            const auto should_flush = end_of_range ? res <= 0 : res < 0;

				            if (should_flush) {

				                auto merged_tomb = std::max(_tombstones.front(), _tombstones.back());

				                if (merged_tomb != _current_tombstone) {

				                    _current_tombstone = merged_tomb;

				                    ret.emplace(_pos, _current_tombstone);

				                }

				                _pos = std::move(pos);

				            }

				            return ret;

				        }

				    public:

				        range_tombstone_change_merger(const schema& s) : _schema(s), _pos(position_in_partition::before_all_clustered_rows()), _tombstones{}

				        { }

				        std::optional<range_tombstone_change> apply(source src, range_tombstone_change&& rtc) {

				            auto ret = do_flush(rtc.position(), false);

				            _tombstones[static_cast<size_t>(src)] = rtc.tombstone();

				            return ret;

				        }

				        std::optional<range_tombstone_change> flush(position_in_partition_view pos, bool end_of_range) {

				            return do_flush(position_in_partition(pos), end_of_range);

				        }

				    };

				    partition_snapshot_ptr _snp;

				    query::clustering_key_filter_ranges _ck_ranges; // Query schema domain, reversed reads use native order

				@@ -103,8 +63,11 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				    // Holds the lower bound of a position range which hasn't been processed yet.

				    // Only rows with positions < _lower_bound have been emitted, and only

				    // range_tombstones with positions <= _lower_bound.

				    // range_tombstone_changes with positions <= _lower_bound.

				    //

				    // Invariant: !_lower_bound.is_clustering_row()

				    position_in_partition _lower_bound; // Query schema domain

				    // Invariant: !_upper_bound.is_clustering_row()

				    position_in_partition_view _upper_bound; // Query schema domain

				    std::optional<position_in_partition> _underlying_upper_bound; // Query schema domain

				@@ -121,22 +84,19 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				    read_context& _read_context;

				    partition_snapshot_row_cursor _next_row;

				    range_tombstone_change_generator _rt_gen; // cache -> reader

				    range_tombstone_assembler _rt_assembler; // underlying -> cache

				    range_tombstone_change_merger _rt_merger; // {cache, underlying} -> reader

				    // When the read moves to the underlying, the read range will be

				    // (_lower_bound, x], where x is either _next_row.position() or _upper_bound.

				    // In the former case (x is _next_row.position()), underlying can emit

				    // a range tombstone change for after_key(x), which is outside the range.

				    // We can't push this fragment into the buffer straight away, the cache may

				    // have fragments with smaller position. So we save it here and flush it when

				    // a fragment with a larger position is seen.

				    std::optional<mutation_fragment_v2> _queued_underlying_fragment;

				    // Holds the currently active range tombstone of the output mutation fragment stream.

				    // While producing the stream, at any given time, _current_tombstone applies to the

				    // key range which extends at least to _lower_bound. When consuming subsequent interval,

				    // which will advance _lower_bound further, be it from underlying or from cache,

				    // a decision is made whether the range tombstone in the next interval is the same as

				    // the current one or not. If it is different, then range_tombstone_change is emitted

				    // with the old _lower_bound value (start of the next interval).

				    tombstone _current_tombstone;

				    state _state = state::before_static_row;

				    bool _next_row_in_range = false;

				    bool _has_rt = false;

				    // True iff current population interval, since the previous clustering row, starts before all clustered rows.

				    // We cannot just look at _lower_bound, because emission of range tombstones changes _lower_bound and

				@@ -145,11 +105,6 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				    // Valid when _state == reading_from_underlying.

				    bool _population_range_starts_before_all_rows;

				    // Whether _lower_bound was changed within current fill_buffer().

				    // If it did not then we cannot break out of it (e.g. on preemption) because

				    // forward progress is not guaranteed in case iterators are getting constantly invalidated.

				    bool _lower_bound_changed = false;

				    // Points to the underlying reader conforming to _schema,

				    // either to *_underlying_holder or _read_context.underlying().underlying().

				    flat_mutation_reader_v2* _underlying = nullptr;

				@@ -163,14 +118,11 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				    void move_to_next_range();

				    void move_to_range(query::clustering_row_ranges::const_iterator);

				    void move_to_next_entry();

				    void maybe_drop_last_entry() noexcept;

				    void flush_tombstones(position_in_partition_view, bool end_of_range = false);

				    void maybe_drop_last_entry(tombstone) noexcept;

				    void add_to_buffer(const partition_snapshot_row_cursor&);

				    void add_clustering_row_to_buffer(mutation_fragment_v2&&);

				    void add_to_buffer(range_tombstone_change&&, source);

				    void do_add_to_buffer(range_tombstone_change&&);

				    void add_range_tombstone_to_buffer(range_tombstone&&);

				    void add_to_buffer(mutation_fragment_v2&&);

				    void add_to_buffer(range_tombstone_change&&);

				    void offer_from_underlying(mutation_fragment_v2&&);

				    future<> read_from_underlying();

				    void start_reading_from_underlying();

				    bool after_current_range(position_in_partition_view position);

				@@ -189,7 +141,7 @@ class cache_flat_mutation_reader final : public flat_mutation_reader_v2::impl {

				    bool ensure_population_lower_bound();

				    void maybe_add_to_cache(const mutation_fragment_v2& mf);

				    void maybe_add_to_cache(const clustering_row& cr);

				    void maybe_add_to_cache(const range_tombstone_change& rtc);

				    bool maybe_add_to_cache(const range_tombstone_change& rtc);

				    void maybe_add_to_cache(const static_row& sr);

				    void maybe_set_static_row_continuous();

				    void finish_reader() {

				@@ -244,8 +196,6 @@ public:

				        , _read_context_holder()

				        , _read_context(ctx)    // ctx is owned by the caller, who's responsible for closing it.

				        , _next_row(*_schema, *_snp, false, _read_context.is_reversed())

				        , _rt_gen(*_schema)

				        , _rt_merger(*_schema)

				    {

				        clogger.trace("csm {}: table={}.{}, reversed={}, snap={}", fmt::ptr(this), _schema->ks_name(), _schema->cf_name(), _read_context.is_reversed(),

				                      fmt::ptr(&*_snp));

				@@ -373,13 +323,31 @@ future<> cache_flat_mutation_reader::do_fill_buffer() {

				        }

				        _state = state::reading_from_underlying;

				        _population_range_starts_before_all_rows = _lower_bound.is_before_all_clustered_rows(*_schema) && !_read_context.is_reversed();

				        _underlying_upper_bound = _next_row_in_range ? position_in_partition::before_key(_next_row.position())

				                                                     : position_in_partition(_upper_bound);

				        if (!_read_context.partition_exists()) {

				            clogger.trace("csm {}: partition does not exist", fmt::ptr(this));

				            if (_current_tombstone) {

				                clogger.trace("csm {}: move_to_underlying: emit rtc({}, null)", fmt::ptr(this), _lower_bound);

				                push_mutation_fragment(mutation_fragment_v2(*_schema, _permit, range_tombstone_change(_lower_bound, {})));

				                _current_tombstone = {};

				            }

				            return read_from_underlying();

				        }

				        _underlying_upper_bound = _next_row_in_range ? position_in_partition(_next_row.position())

				                                      : position_in_partition(_upper_bound);

				        return _underlying->fast_forward_to(position_range{_lower_bound, *_underlying_upper_bound}).then([this] {

				            return read_from_underlying();

				            if (!_current_tombstone) {

				                return read_from_underlying();

				            }

				            return _underlying->peek().then([this] (mutation_fragment_v2* mf) {

				                position_in_partition::equal_compare eq(*_schema);

				                if (!mf || !mf->is_range_tombstone_change()

				                        || !eq(mf->as_range_tombstone_change().position(), _lower_bound)) {

				                    clogger.trace("csm {}: move_to_underlying: emit rtc({}, null)", fmt::ptr(this), _lower_bound);

				                    push_mutation_fragment(mutation_fragment_v2(*_schema, _permit, range_tombstone_change(_lower_bound, {})));

				                    _current_tombstone = {};

				                }

				                return read_from_underlying();

				            });

				        });

				    }

				    if (_state == state::reading_from_underlying) {

				@@ -388,8 +356,8 @@ future<> cache_flat_mutation_reader::do_fill_buffer() {

				    // assert(_state == state::reading_from_cache)

				    return _lsa_manager.run_in_read_section([this] {

				        auto next_valid = _next_row.iterators_valid();

				        clogger.trace("csm {}: reading_from_cache, range=[{}, {}), next={}, valid={}", fmt::ptr(this), _lower_bound,

				            _upper_bound, _next_row.position(), next_valid);

				        clogger.trace("csm {}: reading_from_cache, range=[{}, {}), next={}, valid={}, rt={}", fmt::ptr(this), _lower_bound,

				            _upper_bound, _next_row.position(), next_valid, _current_tombstone);

				        // We assume that if there was eviction, and thus the range may

				        // no longer be continuous, the cursor was invalidated.

				        if (!next_valid) {

				@@ -403,13 +371,9 @@ future<> cache_flat_mutation_reader::do_fill_buffer() {

				        }

				        _next_row.maybe_refresh();

				        clogger.trace("csm {}: next={}", fmt::ptr(this), _next_row);

				        _lower_bound_changed = false;

				        while (_state == state::reading_from_cache) {

				            copy_from_cache_to_buffer();

				            // We need to check _lower_bound_changed even if is_buffer_full() because

				            // we may have emitted only a range tombstone which overlapped with _lower_bound

				            // and thus didn't cause _lower_bound to change.

				            if ((need_preempt() || is_buffer_full()) && _lower_bound_changed) {

				            if (need_preempt() || is_buffer_full()) {

				                break;

				            }

				        }

				@@ -423,37 +387,38 @@ future<> cache_flat_mutation_reader::read_from_underlying() {

				        [this] { return _state != state::reading_from_underlying || is_buffer_full(); },

				        [this] (mutation_fragment_v2 mf) {

				            _read_context.cache().on_row_miss();

				            maybe_add_to_cache(mf);

				            add_to_buffer(std::move(mf));

				            offer_from_underlying(std::move(mf));

				        },

				        [this] {

				            _lower_bound = std::move(*_underlying_upper_bound);

				            _underlying_upper_bound.reset();

				            _state = state::reading_from_cache;

				            _lsa_manager.run_in_update_section([this] {

				                auto same_pos = _next_row.maybe_refresh();

				                clogger.trace("csm {}: underlying done, in_range={}, same={}, next={}", fmt::ptr(this), _next_row_in_range, same_pos, _next_row);

				                if (!same_pos) {

				                    _read_context.cache().on_mispopulate(); // FIXME: Insert dummy entry at _upper_bound.

				                    _read_context.cache().on_mispopulate(); // FIXME: Insert dummy entry at _lower_bound.

				                    _next_row_in_range = !after_current_range(_next_row.position());

				                    if (!_next_row.continuous()) {

				                        _last_row = nullptr; // We did not populate the full range up to _lower_bound, break continuity

				                        start_reading_from_underlying();

				                    }

				                    return;

				                }

				                if (_next_row_in_range) {

				                    maybe_update_continuity();

				                    if (!_next_row.dummy()) {

				                        _lower_bound = position_in_partition::before_key(_next_row.key());

				                    } else {

				                        _lower_bound = _next_row.position();

				                    }

				                } else {

				                    if (no_clustering_row_between(*_schema, _upper_bound, _next_row.position())) {

				                        this->maybe_update_continuity();

				                    } else if (can_populate()) {

				                    if (can_populate()) {

				                        const schema& table_s = table_schema();

				                        rows_entry::tri_compare cmp(table_s);

				                        auto& rows = _snp->version()->partition().mutable_clustered_rows();

				                        if (query::is_single_row(*_schema, *_ck_ranges_curr)) {

				                            // If there are range tombstones which apply to the row then

				                            // we cannot insert an empty entry here because if those range

				                            // tombstones got evicted by now, we will insert an entry

				                            // with missing range tombstone information.

				                            // FIXME: try to set the range tombstone when possible.

				                            if (!_has_rt) {

				                            with_allocator(_snp->region().allocator(), [&] {

				                                auto e = alloc_strategy_unique_ptr<rows_entry>(

				                                    current_allocator().construct<rows_entry>(_ck_ranges_curr->start()->value()));

				@@ -466,9 +431,10 @@ future<> cache_flat_mutation_reader::read_from_underlying() {

				                                    // Also works in reverse read mode.

				                                    // It preserves the continuity of the range the entry falls into.

				                                    it->set_continuous(next->continuous());

				                                    clogger.trace("csm {}: inserted empty row at {}, cont={}", fmt::ptr(this), it->position(), it->continuous());

				                                    clogger.trace("csm {}: inserted empty row at {}, cont={}, rt={}", fmt::ptr(this), it->position(), it->continuous(), it->range_tombstone());

				                                }

				                            });

				                            }

				                        } else if (ensure_population_lower_bound()) {

				                            with_allocator(_snp->region().allocator(), [&] {

				                                auto e = alloc_strategy_unique_ptr<rows_entry>(

				@@ -476,17 +442,19 @@ future<> cache_flat_mutation_reader::read_from_underlying() {

				                                // Use _next_row iterator only as a hint, because there could be insertions after _upper_bound.

				                                auto insert_result = rows.insert_before_hint(_next_row.get_iterator_in_latest_version(), std::move(e), cmp);

				                                if (insert_result.second) {

				                                    clogger.trace("csm {}: inserted dummy at {}", fmt::ptr(this), _upper_bound);

				                                    clogger.trace("csm {}: L{}: inserted dummy at {}", fmt::ptr(this), __LINE__, _upper_bound);

				                                    _snp->tracker()->insert(*insert_result.first);

				                                }

				                                if (_read_context.is_reversed()) [[unlikely]] {

				                                    clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), _last_row.position());

				                                    clogger.trace("csm {}: set_continuous({}), prev={}, rt={}", fmt::ptr(this), _last_row.position(), insert_result.first->position(), _current_tombstone);

				                                    _last_row->set_continuous(true);

				                                    _last_row->set_range_tombstone(_current_tombstone);

				                                } else {

				                                    clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), insert_result.first->position());

				                                    clogger.trace("csm {}: set_continuous({}), prev={}, rt={}", fmt::ptr(this), insert_result.first->position(), _last_row.position(), _current_tombstone);

				                                    insert_result.first->set_continuous(true);

				                                    insert_result.first->set_range_tombstone(_current_tombstone);

				                                }

				                                maybe_drop_last_entry();

				                                maybe_drop_last_entry(_current_tombstone);

				                            });

				                        }

				                    } else {

				@@ -515,55 +483,103 @@ bool cache_flat_mutation_reader::ensure_population_lower_bound() {

				    // Continuity flag we will later set for the upper bound extends to the previous row in the same version,

				    // so we need to ensure we have an entry in the latest version.

				    if (!_last_row.is_in_latest_version()) {

				        with_allocator(_snp->region().allocator(), [&] {

				            auto& rows = _snp->version()->partition().mutable_clustered_rows();

				            rows_entry::tri_compare cmp(table_schema());

				            // FIXME: Avoid the copy by inserting an incomplete clustering row

				            auto e = alloc_strategy_unique_ptr<rows_entry>(

				                current_allocator().construct<rows_entry>(table_schema(), *_last_row));

				            e->set_continuous(false);

				            auto insert_result = rows.insert_before_hint(rows.end(), std::move(e), cmp);

				            if (insert_result.second) {

				                auto it = insert_result.first;

				                clogger.trace("csm {}: inserted lower bound dummy at {}", fmt::ptr(this), it->position());

				                _snp->tracker()->insert(*it);

				            }

				            _last_row.set_latest(insert_result.first);

				        rows_entry::tri_compare cmp(*_schema);

				        partition_snapshot_row_cursor cur(*_schema, *_snp, false, _read_context.is_reversed());

				        if (!cur.advance_to(_last_row.position())) {

				            return false;

				        }

				        if (cmp(cur.position(), _last_row.position()) != 0) {

				            return false;

				        }

				        auto res = with_allocator(_snp->region().allocator(), [&] {

				            return cur.ensure_entry_in_latest();

				        });

				        _last_row.set_latest(res.it);

				        if (res.inserted) {

				            clogger.trace("csm {}: inserted lower bound dummy at {}", fmt::ptr(this), _last_row.position());

				        }

				    }

				    return true;

				}

				inline

				void cache_flat_mutation_reader::maybe_update_continuity() {

				    if (can_populate() && ensure_population_lower_bound()) {

				    position_in_partition::equal_compare eq(*_schema);

				    if (can_populate()

				            && ensure_population_lower_bound()

				            && !eq(_last_row.position(), _next_row.position())) {

				        with_allocator(_snp->region().allocator(), [&] {

				            rows_entry& e = _next_row.ensure_entry_in_latest().row;

				            auto& rows = _snp->version()->partition().mutable_clustered_rows();

				            const schema& table_s = table_schema();

				            rows_entry::tri_compare table_cmp(table_s);

				            if (_read_context.is_reversed()) [[unlikely]] {

				                clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), _last_row.position());

				                _last_row->set_continuous(true);

				                if (_current_tombstone != _last_row->range_tombstone() && !_last_row->dummy()) {

				                    with_allocator(_snp->region().allocator(), [&] {

				                        auto e2 = alloc_strategy_unique_ptr<rows_entry>(

				                                current_allocator().construct<rows_entry>(table_s,

				                                                                          position_in_partition_view::before_key(_last_row->position()),

				                                                                          is_dummy::yes,

				                                                                          is_continuous::yes));

				                        auto insert_result = rows.insert(std::move(e2), table_cmp);

				                        if (insert_result.second) {

				                            clogger.trace("csm {}: L{}: inserted dummy at {}", fmt::ptr(this), __LINE__, insert_result.first->position());

				                            _snp->tracker()->insert(*insert_result.first);

				                        }

				                        clogger.trace("csm {}: set_continuous({}), prev={}, rt={}", fmt::ptr(this), insert_result.first->position(),

				                                      _last_row.position(), _current_tombstone);

				                        insert_result.first->set_continuous(true);

				                        insert_result.first->set_range_tombstone(_current_tombstone);

				                        clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), _last_row.position());

				                        _last_row->set_continuous(true);

				                    });

				                } else {

				                    clogger.trace("csm {}: set_continuous({}), rt={}", fmt::ptr(this), _last_row.position(), _current_tombstone);

				                    _last_row->set_continuous(true);

				                    _last_row->set_range_tombstone(_current_tombstone);

				                }

				            } else {

				                clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), e.position());

				                e.set_continuous(true);

				                if (_current_tombstone != e.range_tombstone() && !e.dummy()) {

				                    with_allocator(_snp->region().allocator(), [&] {

				                        auto e2 = alloc_strategy_unique_ptr<rows_entry>(

				                                current_allocator().construct<rows_entry>(table_s,

				                                                                          position_in_partition_view::before_key(e.position()),

				                                                                          is_dummy::yes,

				                                                                          is_continuous::yes));

				                        // Use _next_row iterator only as a hint because there could be insertions before

				                        // _next_row.get_iterator_in_latest_version(), either from concurrent reads,

				                        // or from _next_row.ensure_entry_in_latest().

				                        auto insert_result = rows.insert_before_hint(_next_row.get_iterator_in_latest_version(), std::move(e2), table_cmp);

				                        if (insert_result.second) {

				                            clogger.trace("csm {}: L{}: inserted dummy at {}", fmt::ptr(this), __LINE__, insert_result.first->position());

				                            _snp->tracker()->insert(*insert_result.first);

				                        }

				                        clogger.trace("csm {}: set_continuous({}), prev={}, rt={}", fmt::ptr(this), insert_result.first->position(),

				                                      _last_row.position(), _current_tombstone);

				                        insert_result.first->set_continuous(true);

				                        insert_result.first->set_range_tombstone(_current_tombstone);

				                        clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), e.position());

				                        e.set_continuous(true);

				                    });

				                } else {

				                    clogger.trace("csm {}: set_continuous({}), rt={}", fmt::ptr(this), e.position(), _current_tombstone);

				                    e.set_range_tombstone(_current_tombstone);

				                    e.set_continuous(true);

				                }

				            }

				            maybe_drop_last_entry();

				            maybe_drop_last_entry(_current_tombstone);

				        });

				    } else {

				        _read_context.cache().on_mispopulate();

				    }

				}

				inline

				void cache_flat_mutation_reader::maybe_add_to_cache(const mutation_fragment_v2& mf) {

				    if (mf.is_range_tombstone_change()) {

				        maybe_add_to_cache(mf.as_range_tombstone_change());

				    } else {

				        assert(mf.is_clustering_row());

				        const clustering_row& cr = mf.as_clustering_row();

				        maybe_add_to_cache(cr);

				    }

				}

				inline

				void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {

				    if (!can_populate()) {

				@@ -572,16 +588,9 @@ void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {

				        _read_context.cache().on_mispopulate();

				        return;

				    }

				    auto rt_opt = _rt_assembler.flush(*_schema, position_in_partition::after_key(cr.key()));

				    clogger.trace("csm {}: populate({})", fmt::ptr(this), clustering_row::printer(*_schema, cr));

				    _lsa_manager.run_in_update_section_with_allocator([this, &cr, &rt_opt] {

				        mutation_partition& mp = _snp->version()->partition();

				        if (rt_opt) {

				            clogger.trace("csm {}: populate flushed rt({})", fmt::ptr(this), *rt_opt);

				            mp.mutable_row_tombstones().apply_monotonically(table_schema(), to_table_domain(range_tombstone(*rt_opt)));

				        }

				    clogger.trace("csm {}: populate({}), rt={}", fmt::ptr(this), clustering_row::printer(*_schema, cr), _current_tombstone);

				    _lsa_manager.run_in_update_section_with_allocator([this, &cr] {

				        mutation_partition_v2& mp = _snp->version()->partition();

				        rows_entry::tri_compare cmp(table_schema());

				        if (_read_context.digest_requested()) {

				@@ -590,6 +599,7 @@ void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {

				        auto new_entry = alloc_strategy_unique_ptr<rows_entry>(

				            current_allocator().construct<rows_entry>(table_schema(), cr.key(), cr.as_deletable_row()));

				        new_entry->set_continuous(false);

				        new_entry->set_range_tombstone(_current_tombstone);

				        auto it = _next_row.iterators_valid() ? _next_row.get_iterator_in_latest_version()

				                                              : mp.clustered_rows().lower_bound(cr.key(), cmp);

				        auto insert_result = mp.mutable_clustered_rows().insert_before_hint(it, std::move(new_entry), cmp);

				@@ -603,9 +613,14 @@ void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {

				            if (_read_context.is_reversed()) [[unlikely]] {

				                clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), _last_row.position());

				                _last_row->set_continuous(true);

				                // _current_tombstone must also apply to _last_row itself (if it's non-dummy)

				                // because otherwise there would be a rtc after it, either creating a different entry,

				                // or clearing _last_row if population did not happen.

				                _last_row->set_range_tombstone(_current_tombstone);

				            } else {

				                clogger.trace("csm {}: set_continuous({})", fmt::ptr(this), e.position());

				                e.set_continuous(true);

				                e.set_range_tombstone(_current_tombstone);

				            }

				        } else {

				            _read_context.cache().on_mispopulate();

				@@ -617,6 +632,72 @@ void cache_flat_mutation_reader::maybe_add_to_cache(const clustering_row& cr) {

				    });

				}

				inline

				bool cache_flat_mutation_reader::maybe_add_to_cache(const range_tombstone_change& rtc) {

				    rows_entry::tri_compare q_cmp(*_schema);

				    clogger.trace("csm {}: maybe_add_to_cache({})", fmt::ptr(this), rtc);

				    // Don't emit the closing range tombstone change, we may continue from cache with the same tombstone.

				    // The following relies on !_underlying_upper_bound->is_clustering_row()

				    if (q_cmp(rtc.position(), *_underlying_upper_bound) == 0) {

				        _lower_bound = rtc.position();

				        return false;

				    }

				    auto prev = std::exchange(_current_tombstone, rtc.tombstone());

				    if (_current_tombstone == prev) {

				        return false;

				    }

				    if (!can_populate()) {

				        // _current_tombstone is now invalid and remains so for this reader. No need to change it.

				        _last_row = nullptr;

				        _population_range_starts_before_all_rows = false;

				        _read_context.cache().on_mispopulate();

				        return true;

				    }

				    _lsa_manager.run_in_update_section_with_allocator([&] {

				        mutation_partition_v2& mp = _snp->version()->partition();

				        rows_entry::tri_compare cmp(table_schema());

				        auto new_entry = alloc_strategy_unique_ptr<rows_entry>(

				                current_allocator().construct<rows_entry>(table_schema(), to_table_domain(rtc.position()), is_dummy::yes, is_continuous::no));

				        auto it = _next_row.iterators_valid() ? _next_row.get_iterator_in_latest_version()

				                                              : mp.clustered_rows().lower_bound(to_table_domain(rtc.position()), cmp);

				        auto insert_result = mp.mutable_clustered_rows().insert_before_hint(it, std::move(new_entry), cmp);

				        it = insert_result.first;

				        if (insert_result.second) {

				            _snp->tracker()->insert(*it);

				        }

				        rows_entry& e = *it;

				        if (ensure_population_lower_bound()) {

				            // underlying may emit range_tombstone_change fragments with the same position.

				            // In such a case, the range to which the tombstone from the first fragment applies is empty and should be ignored.

				            if (q_cmp(_last_row.position(), it->position()) < 0) {

				                if (_read_context.is_reversed()) [[unlikely]] {

				                    clogger.trace("csm {}: set_continuous({}), rt={}", fmt::ptr(this), _last_row.position(), prev);

				                    _last_row->set_continuous(true);

				                    _last_row->set_range_tombstone(prev);

				                } else {

				                    clogger.trace("csm {}: set_continuous({}), rt={}", fmt::ptr(this), e.position(), prev);

				                    e.set_continuous(true);

				                    e.set_range_tombstone(prev);

				                }

				            }

				        } else {

				            _read_context.cache().on_mispopulate();

				        }

				        with_allocator(standard_allocator(), [&] {

				            _last_row = partition_snapshot_row_weakref(*_snp, it, true);

				        });

				        _population_range_starts_before_all_rows = false;

				    });

				    return true;

				}

				inline

				bool cache_flat_mutation_reader::after_current_range(position_in_partition_view p) {

				    position_in_partition::tri_compare cmp(*_schema);

				@@ -632,19 +713,35 @@ void cache_flat_mutation_reader::start_reading_from_underlying() {

				inline

				void cache_flat_mutation_reader::copy_from_cache_to_buffer() {

				    clogger.trace("csm {}: copy_from_cache, next={}, next_row_in_range={}", fmt::ptr(this), _next_row.position(), _next_row_in_range);

				    clogger.trace("csm {}: copy_from_cache, next_row_in_range={}, next={}", fmt::ptr(this), _next_row_in_range, _next_row);

				    _next_row.touch();

				    position_in_partition_view next_lower_bound = _next_row.dummy() ? _next_row.position() : position_in_partition_view::after_key(_next_row.key());

				    auto upper_bound = _next_row_in_range ? next_lower_bound : _upper_bound;

				    if (_snp->range_tombstones(_lower_bound, upper_bound, [&] (range_tombstone rts) {

				        add_range_tombstone_to_buffer(std::move(rts));

				        return stop_iteration(_lower_bound_changed && is_buffer_full());

				    }, _read_context.is_reversed()) == stop_iteration::no) {

				        return;

				    if (_next_row.range_tombstone() != _current_tombstone) {

				        position_in_partition::equal_compare eq(*_schema);

				        auto upper_bound = _next_row_in_range ? position_in_partition_view::before_key(_next_row.position()) : _upper_bound;

				        if (!eq(_lower_bound, upper_bound)) {

				            position_in_partition new_lower_bound(upper_bound);

				            auto tomb = _next_row.range_tombstone();

				            clogger.trace("csm {}: rtc({}, {}) ...{}", fmt::ptr(this), _lower_bound, tomb, new_lower_bound);

				            push_mutation_fragment(mutation_fragment_v2(*_schema, _permit, range_tombstone_change(_lower_bound, tomb)));

				            _current_tombstone = tomb;

				            _lower_bound = std::move(new_lower_bound);

				            _read_context.cache()._tracker.on_range_tombstone_read();

				        }

				    }

				    // We add the row to the buffer even when it's full.

				    // This simplifies the code. For more info see #3139.

				    if (_next_row_in_range) {

				        if (_next_row.range_tombstone_for_row() != _current_tombstone) [[unlikely]] {

				            auto tomb = _next_row.range_tombstone_for_row();

				            auto new_lower_bound = position_in_partition::before_key(_next_row.position());

				            clogger.trace("csm {}: rtc({}, {})", fmt::ptr(this), new_lower_bound, tomb);

				            push_mutation_fragment(mutation_fragment_v2(*_schema, _permit, range_tombstone_change(new_lower_bound, tomb)));

				            _lower_bound = std::move(new_lower_bound);

				            _current_tombstone = tomb;

				            _read_context.cache()._tracker.on_range_tombstone_read();

				        }

				        add_to_buffer(_next_row);

				        move_to_next_entry();

				    } else {

				@@ -660,10 +757,11 @@ void cache_flat_mutation_reader::move_to_end() {

				inline

				void cache_flat_mutation_reader::move_to_next_range() {

				    if (_queued_underlying_fragment) {

				        add_to_buffer(*std::exchange(_queued_underlying_fragment, {}));

				    if (_current_tombstone) {

				        clogger.trace("csm {}: move_to_next_range: emit rtc({}, null)", fmt::ptr(this), _upper_bound);

				        push_mutation_fragment(mutation_fragment_v2(*_schema, _permit, range_tombstone_change(_upper_bound, {})));

				        _current_tombstone = {};

				    }

				    flush_tombstones(position_in_partition::for_range_end(*_ck_ranges_curr), true);

				    auto next_it = std::next(_ck_ranges_curr);

				    if (next_it == _ck_ranges_end) {

				        move_to_end();

				@@ -680,8 +778,6 @@ void cache_flat_mutation_reader::move_to_range(query::clustering_row_ranges::con

				    _last_row = nullptr;

				    _lower_bound = std::move(lb);

				    _upper_bound = std::move(ub);

				    _rt_gen.trim(_lower_bound);

				    _lower_bound_changed = true;

				    _ck_ranges_curr = next_it;

				    auto adjacent = _next_row.advance_to(_lower_bound);

				    _next_row_in_range = !after_current_range(_next_row.position());

				@@ -722,7 +818,7 @@ void cache_flat_mutation_reader::move_to_range(query::clustering_row_ranges::con

				// _next_row must have a greater position than _last_row.

				// Invalidates references but keeps the _next_row valid.

				inline

				void cache_flat_mutation_reader::maybe_drop_last_entry() noexcept {

				void cache_flat_mutation_reader::maybe_drop_last_entry(tombstone rt) noexcept {

				    // Drop dummy entry if it falls inside a continuous range.

				    // This prevents unnecessary dummy entries from accumulating in cache and slowing down scans.

				    //

				@@ -733,9 +829,12 @@ void cache_flat_mutation_reader::maybe_drop_last_entry() noexcept {

				            && !_read_context.is_reversed() // FIXME

				            && _last_row->dummy()

				            && _last_row->continuous()

				            && _last_row->range_tombstone() == rt

				            && _snp->at_latest_version()

				            && _snp->at_oldest_version()) {

				        clogger.trace("csm {}: dropping unnecessary dummy at {}", fmt::ptr(this), _last_row->position());

				        with_allocator(_snp->region().allocator(), [&] {

				            cache_tracker& tracker = _read_context.cache()._tracker;

				            tracker.get_lru().remove(*_last_row);

				@@ -769,57 +868,38 @@ void cache_flat_mutation_reader::move_to_next_entry() {

				        if (!_next_row.continuous()) {

				            start_reading_from_underlying();

				        } else {

				            maybe_drop_last_entry();

				            maybe_drop_last_entry(_next_row.range_tombstone());

				        }

				    }

				}

				void cache_flat_mutation_reader::flush_tombstones(position_in_partition_view pos, bool end_of_range) {

				    // Ensure position is appropriate for range tombstone bound

				    pos = position_in_partition_view::after_key(pos);

				    clogger.trace("csm {}: flush_tombstones({}) end_of_range: {}", fmt::ptr(this), pos, end_of_range);

				    _rt_gen.flush(pos, [this] (range_tombstone_change&& rtc) {

				        add_to_buffer(std::move(rtc), source::cache);

				    }, end_of_range);

				    if (auto rtc_opt = _rt_merger.flush(pos, end_of_range)) {

				        do_add_to_buffer(std::move(*rtc_opt));

				    }

				}

				inline

				void cache_flat_mutation_reader::add_to_buffer(mutation_fragment_v2&& mf) {

				    clogger.trace("csm {}: add_to_buffer({})", fmt::ptr(this), mutation_fragment_v2::printer(*_schema, mf));

				    position_in_partition::less_compare less(*_schema);

				    if (_underlying_upper_bound && less(*_underlying_upper_bound, mf.position())) {

				        _queued_underlying_fragment = std::move(mf);

				        return;

				    }

				    flush_tombstones(mf.position());

				void cache_flat_mutation_reader::offer_from_underlying(mutation_fragment_v2&& mf) {

				    clogger.trace("csm {}: offer_from_underlying({})", fmt::ptr(this), mutation_fragment_v2::printer(*_schema, mf));

				    if (mf.is_clustering_row()) {

				        maybe_add_to_cache(mf.as_clustering_row());

				        add_clustering_row_to_buffer(std::move(mf));

				    } else {

				        assert(mf.is_range_tombstone_change());

				        add_to_buffer(std::move(mf).as_range_tombstone_change(), source::underlying);

				        auto& chg = mf.as_range_tombstone_change();

				        if (maybe_add_to_cache(chg)) {

				            add_to_buffer(std::move(mf).as_range_tombstone_change());

				        }

				    }

				}

				inline

				void cache_flat_mutation_reader::add_to_buffer(const partition_snapshot_row_cursor& row) {

				    position_in_partition::less_compare less(*_schema);

				    if (_queued_underlying_fragment && less(_queued_underlying_fragment->position(), row.position())) {

				        add_to_buffer(*std::exchange(_queued_underlying_fragment, {}));

				    }

				    if (!row.dummy()) {

				        _read_context.cache().on_row_hit();

				        if (_read_context.digest_requested()) {

				            row.latest_row().cells().prepare_hash(table_schema(), column_kind::regular_column);

				        }

				        flush_tombstones(position_in_partition_view::for_key(row.key()));

				        add_clustering_row_to_buffer(mutation_fragment_v2(*_schema, _permit, row.row()));

				    } else {

				        if (less(_lower_bound, row.position())) {

				            _lower_bound = row.position();

				            _lower_bound_changed = true;

				        }

				        _read_context.cache()._tracker.on_dummy_row_hit();

				    }

				@@ -832,67 +912,24 @@ inline

				void cache_flat_mutation_reader::add_clustering_row_to_buffer(mutation_fragment_v2&& mf) {

				    clogger.trace("csm {}: add_clustering_row_to_buffer({})", fmt::ptr(this), mutation_fragment_v2::printer(*_schema, mf));

				    auto& row = mf.as_clustering_row();

				    auto new_lower_bound = position_in_partition::after_key(row.key());

				    auto new_lower_bound = position_in_partition::after_key(*_schema, row.key());

				    push_mutation_fragment(std::move(mf));

				    _lower_bound = std::move(new_lower_bound);

				    _lower_bound_changed = true;

				    if (row.tomb()) {

				        _read_context.cache()._tracker.on_row_tombstone_read();

				    }

				}

				inline

				void cache_flat_mutation_reader::add_to_buffer(range_tombstone_change&& rtc, source src) {

				void cache_flat_mutation_reader::add_to_buffer(range_tombstone_change&& rtc) {

				    clogger.trace("csm {}: add_to_buffer({})", fmt::ptr(this), rtc);

				    if (auto rtc_opt = _rt_merger.apply(src, std::move(rtc))) {

				        do_add_to_buffer(std::move(*rtc_opt));

				    }

				}

				inline

				void cache_flat_mutation_reader::do_add_to_buffer(range_tombstone_change&& rtc) {

				    clogger.trace("csm {}: push({})", fmt::ptr(this), rtc);

				    _has_rt = true;

				    position_in_partition::less_compare less(*_schema);

				    auto lower_bound_changed = less(_lower_bound, rtc.position());

				    _lower_bound = position_in_partition(rtc.position());

				    _lower_bound_changed = lower_bound_changed;

				    push_mutation_fragment(*_schema, _permit, std::move(rtc));

				    _read_context.cache()._tracker.on_range_tombstone_read();

				}

				inline

				void cache_flat_mutation_reader::add_range_tombstone_to_buffer(range_tombstone&& rt) {

				    position_in_partition::less_compare less(*_schema);

				    if (_queued_underlying_fragment && less(_queued_underlying_fragment->position(), rt.position())) {

				        add_to_buffer(*std::exchange(_queued_underlying_fragment, {}));

				    }

				    clogger.trace("csm {}: add_to_buffer({})", fmt::ptr(this), rt);

				    if (!less(_lower_bound, rt.position())) {

				        rt.set_start(_lower_bound);

				    }

				    flush_tombstones(rt.position());

				    _rt_gen.consume(std::move(rt));

				}

				inline

				void cache_flat_mutation_reader::maybe_add_to_cache(const range_tombstone_change& rtc) {

				    clogger.trace("csm {}: maybe_add_to_cache({})", fmt::ptr(this), rtc);

				    auto rt_opt = _rt_assembler.consume(*_schema, range_tombstone_change(rtc));

				    if (!rt_opt) {

				        return;

				    }

				    const auto& rt = *rt_opt;

				    if (can_populate()) {

				        clogger.trace("csm {}: maybe_add_to_cache({})", fmt::ptr(this), rt);

				        _lsa_manager.run_in_update_section_with_allocator([&] {

				            _snp->version()->partition().mutable_row_tombstones().apply_monotonically(

				                    table_schema(), to_table_domain(rt));

				        });

				    } else {

				        _read_context.cache().on_mispopulate();

				    }

				}

				inline

				void cache_flat_mutation_reader::maybe_add_to_cache(const static_row& sr) {

				    if (can_populate()) {

									
17 cdc/CMakeLists.txt Normal file

				@@ -0,0 +1,17 @@

				add_library(cdc STATIC)

				target_sources(cdc

				  PRIVATE

				    cdc_partitioner.cc

				    generation.cc

				    log.cc

				    metadata.cc

				    split.cc)

				target_include_directories(cdc

				  PUBLIC

				    ${CMAKE_SOURCE_DIR})

				target_link_libraries(cdc

				  PUBLIC

				    Seastar::seastar

				    xxHash::xxhash

				  PRIVATE

				    replica)

									
2 cdc/cdc_extension.hh

				@@ -15,7 +15,7 @@

				#include "serializer.hh"

				#include "db/extensions.hh"

				#include "cdc/cdc_options.hh"

				-#include "schema.hh"

				+#include "schema/schema.hh"

				#include "serializer_impl.hh"

				namespace cdc {

									
2 cdc/cdc_partitioner.cc

				@@ -8,7 +8,7 @@

				#include "cdc_partitioner.hh"

				#include "dht/token.hh"

				-#include "schema.hh"

				+#include "schema/schema.hh"

				#include "sstables/key.hh"

				#include "utils/class_registrator.hh"

				#include "cdc/generation.hh"

									
2 cdc/change_visitor.hh

				@@ -8,7 +8,7 @@

				#pragma once

				-#include "mutation.hh"

				+#include "mutation/mutation.hh"

				/*

				 * This file contains a general abstraction for walking over mutations,

Compare commits

2800 Commits mykaul-doc ... br-next

24 .github/CODEOWNERS vendored

17 .github/workflows/docs-amplify-enhanced.yaml vendored Normal file

13 .github/workflows/docs-pages.yaml vendored

8 .github/workflows/docs-pr.yaml vendored

1 .gitignore vendored

9 .gitmodules vendored

883 CMakeLists.txt

2 CONTRIBUTING.md

2 HACKING.md

12 README.md

4 SCYLLA-VERSION-GEN

1 abseil

30 alternator/CMakeLists.txt Normal file

97 alternator/auth.cc

6 alternator/auth.hh

41 alternator/conditions.cc

2 alternator/conditions.hh

4 alternator/error.hh

89 alternator/executor.cc

11 alternator/executor.hh

3 alternator/expressions.cc

2 alternator/expressions_types.hh

32 alternator/serialization.cc

9 alternator/serialization.hh

14 alternator/server.cc

10 alternator/server.hh

60 alternator/streams.cc

74 alternator/ttl.cc

15 amplify.yml Normal file

70 api/CMakeLists.txt Normal file

4 api/api-doc/storage_service.json

86 api/api-doc/task_manager.json

34 api/api-doc/task_manager_test.json

27 api/api.cc

14 api/api.hh

11 api/api_init.hh

3 api/authorization_cache.cc

4 api/authorization_cache.hh

85 api/cache_service.cc

2 api/cache_service.hh

2 api/collectd.cc

2 api/collectd.hh

352 api/column_family.cc

7 api/column_family.hh

1 api/commitlog.cc

2 api/commitlog.hh

46 api/compaction_manager.cc

2 api/compaction_manager.hh

1 api/config.cc

2 api/config.hh

9 api/endpoint_snitch.cc

4 api/endpoint_snitch.hh

1 api/error_injection.cc

2 api/error_injection.hh

27 api/failure_detector.cc

2 api/failure_detector.hh

25 api/gossiper.cc

2 api/gossiper.hh

17 api/hinted_handoff.cc

4 api/hinted_handoff.hh

1 api/lsa.cc

2 api/lsa.hh

3 api/messaging_service.cc

4 api/messaging_service.hh

196 api/storage_proxy.cc

3 api/storage_proxy.hh

467 api/storage_service.cc

56 api/storage_service.hh

7 api/stream_manager.cc

4 api/stream_manager.hh

1 api/system.cc

2 api/system.hh

103 api/task_manager.cc

3 api/task_manager.hh

25 api/task_manager_test.cc

3 api/task_manager_test.hh

35 auth/CMakeLists.txt Normal file

12 auth/authenticated_user.cc

21 auth/authenticated_user.hh