scylladb

Author	SHA1	Message	Date
Raphael S. Carvalho	d79fb9a12f	docs: Update compaction controller doc The doc is being updated to reflect the changes in the commit `d8833de3bb` ("Redefine Compaction Backlog to tame compaction aggressiveness"). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-04-26 10:50:45 +03:00
Piotr Sarna	20de52d96c	docs: add a paragraph on keyspace storage options A new CQL extension that allows specifying keyspace storage options is now described in our design notes.	2022-04-08 09:17:01 +02:00
Wojciech Mitros	8a9d55d3a1	wasm: add wasm ABI version 2 Because the only available version of wasm ABI did not allow freeing any allocated memory, a new version of the ABI is introduced. In this version, the host is required to export _scylla_malloc and _scylla_free methods, which are later used for the memory management. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 20:49:35 +02:00
Wojciech Mitros	1f81e05d52	wasm: add documentation The ABI of wasm UDFs changed since the last time the documentation was written, so it's being updated in this patch. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>	2022-03-30 19:44:30 +02:00
Nadav Har'El	f76f6dbccb	secondary index: avoid special characters in default index names In CQL, table names are limited to so-called word characters (letters, numbers and underscores), but column names don't have such a limitation. When we create a secondary index, its default name is constructed from the column name - so can contain problematic characters. It can include even the "/" character. The problem is that the index name is then used, like a table name, to create a directory with that name. The test included in this patch demonstrates that before this patch, this can be misused to create subdirectories anywhere in the filesystem, or to crash Scylla when it fails to create a directory (which it considers an unrecoverable I/O error). In this patch we do what Cassandra does - remove all non-word characters from the indexed column name before constructing the default index name. In the included test - which can run on both Scylla and Cassandra - we verify that the constructed index name is the same as in Cassandra, which is useful to know (e.g., because knowing the index name is needed to DROP the index). Also, this patch adds a second line of defense against the security problem described above: It is now an error to create a schema with a slash or null (the two characters not allowed in Unix filenames) in the keyspace or table names. So if the first line of defense (CQL checking the validity of its commands) fails, we'll have that second line of defense. I verified that if I revert the default-index-name fix, the second line of defense kicks in, and the index creation is aborted and cannot create files in the wrong place to crash Scylla. Fixes #3403 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220320162543.3091121-1-nyh@scylladb.com>	2022-03-20 18:33:48 +02:00
Pavel Emelyanov	d586805054	docs: Add system.clients description There's a document that sums up the tables from the system keyspace, and it's missing the clients table. This set is going to reimplement the table keeping the schema intact, so it's a good time to document it right at the beginning. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-02-18 14:25:07 +03:00
Michał Sala	4903f7a314	docs: add parallel aggregations design doc Added document describes the design of a mechanism that parallelizes execution of aggregation queries.	2022-02-02 17:52:22 +01:00
Botond Dénes	8ac7c4f523	docs/design-notes/IDL.md: fix typo: s/on only/only/ Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20220118094416.242409-1-bdenes@scylladb.com>	2022-01-18 12:30:39 +02:00
Gleb Natapov	dc886d96d1	idl-compiler: update the documentation with new features added recently The series to move storage_proxy verbs to the IDL added new features to the IDL compiler, but was lacking documentation. This patch documents the features.	2022-01-16 15:12:07 +02:00
Piotr Sarna	a36c8990ab	docs: move service_levels.md to design-notes Along the way, our flat structure for docs was changed to categorize the documents, but service_levels.md was forward-ported later and missed the created directory structure, so it was created as a sole document in the top directory. Move it to where the other similar docs live. Message-Id: <68079d9dd511574ee32fce15fec541ca75fca1e2.1640248754.git.sarna@scylladb.com>	2021-12-26 14:10:52 +02:00
Piotr Sarna	483a98aa14	docs: add AssemblyScript example to wasm.md The paragraph about WebAssembly missed a very useful language, AssemblyScript. An example for it is provided in this patch. Message-Id: <8d6ea1038f2944917316de29c7ca5cce88b2a148.1640248754.git.sarna@scylladb.com>	2021-12-26 14:10:52 +02:00
Tzach Livyatan	d6fbabbf8c	fix typo in repair_based_node_ops.md Fix https://github.com/scylladb/scylla/issues/9786 Closes #9788	2021-12-15 09:56:21 +02:00
Nadav Har'El	f9673309aa	docs: protocols.md - add information on Redis listening address The description in protocols.md of the Redis protocol server in Scylla explains how its port can be configured, but not how the listening IP address can be configured. It turns out that the same "rpc_address" that controls CQL's and Thrift's IP address also applies to Redis. So let's document that. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20211208160206.1290916-1-nyh@scylladb.com>	2021-12-08 20:14:52 +01:00
Gavin Howell	c6e0a807b4	Update wasm.md Grammar correction, sentence re-write. Closes #9760	2021-12-08 10:24:53 +01:00
David Garcia	954d5d5d63	Fix cql docs error Closes #9613	2021-12-02 09:58:58 +02:00
GavinJE	22fa7ecf99	Update compaction_controller.md Line 15. "ee" changed to "they" Closes #9651	2021-11-19 14:19:20 +03:00
Michael Livshin	a7511cf600	system keyspace: record partitions with too many rows Add "rows" field to system.large_partitions. Add partitions to the table when they are too large or have too many rows. Fixes #9506 Signed-off-by: Michael Livshin <michael.livshin@scylladb.com> Closes #9577	2021-11-14 14:25:18 +02:00
Pavel Emelyanov	4a70e0aa57	system_keyspace: Table with config options A config option value is reported as 'text' type and contains a string as it would look in the json config. The table is UPDATE-able. Only the 'value' column can be set and the value accepted must be a string. It will be converted into the option type automatically; however, in the current implementation it's not 100% precise -- the conversion is a lexicographical cast which only works for simple types. However, liveupdate-able values are only of those types, so it works in supported cases. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-11-11 16:39:34 +03:00
Botond Dénes	d51aa66a8a	db/system_keyspace: add versions table Contains all version related information (`nodetool version` and more). Example printout: (cqlsh) select * from system.versions; key \| build_id \| build_mode \| version -------+------------------------------------------+------------+------------------------------- local \| aaecce2f5068b0160efd04a09b0e28e100b9cd9e \| dev \| 4.6.dev-0.20211021.0d744fd3fa	2021-11-05 15:42:42 +02:00
Botond Dénes	89cc016f07	db/system_keyspace: add runtime_info table Loosely contains the equivalent of the `nodetool info` command, with some notable differences: * Protocol server related information is in `system.protocol_servers`; * Information about memory, memtable and cache is reformatted to be tailored to scylla: C* specific terminology and metrics are dropped; * Information that doesn't change and is already in `system.local` is not contained; * Added trace-probability too (`nodetool gettraceprobability`); TODO(follow-up): exceptions.	2021-11-05 15:42:42 +02:00
Botond Dénes	78adda197f	db/system_keyspace: add protocol_servers table Lists all the client protocol servers and their status. Example output: (cqlsh) select * from system.protocol_servers; name \| is_running \| listen_addresses \| protocol \| protocol_version ------------------+------------+---------------------------------------+----------+------------------ native transport \| True \| ['127.0.0.1:9042', '127.0.0.1:19042'] \| cql \| 3.3.1 alternator \| False \| [] \| dynamodb \| rpc \| False \| [] \| thrift \| 20.1.0 redis \| False \| [] \| redis \| This prints the equivalent of `nodetool statusbinary` and the "Thrift active" and "Native Transport active" fields from the `nodetool info` output with some additional information: * It contains alternator and redis status; * It contains the protocol version; * It contains the listen addresses (if respective server is running);	2021-11-05 15:42:42 +02:00
Botond Dénes	64f658aea4	db/system_keyspace: add snapshots virtual table Lists the equivalent of the `nodetool listsnapshots` command.	2021-11-05 15:42:41 +02:00
Botond Dénes	185c5f1f5b	docs/design-notes/system_keyspace.md: add listing of existing virtual tables As well as a link to the newly added docs/guides/virtual-tables.md	2021-11-05 15:42:39 +02:00
garanews	7a6a59eb7c	fix some typo in docs Closes #9510	2021-11-02 19:59:16 +03:00
Avi Kivity	0ea79559a6	Merge 'IDL: support generating boilerplate code for RPC verbs' from Pavel Solodovnikov Introduce new syntax in IDL compiler to allow generating registration/sending code for RPC verbs: ``` verb [[attr1, attr2...]] my_verb (args...) -> return_type; ``` `my_verb` RPC verb declaration corresponds to the `netw::messaging_verb::MY_VERB` enumeration value to identify the new RPC verb. For a given `idl_module.idl.hh` file, a registrator class named `idl_module_rpc_verbs` will be created if there are any RPC verbs registered within the IDL module file. These are the methods being created for each RPC verb: ``` static void register_my_verb(netw::messaging_service* ms, std::function<return_type(args...)>&&); static future<> unregister_my_verb(netw::messaging_service* ms); static future<> send_my_verb(netw::messaging_service* ms, netw::msg_addr id, args...); ``` Each method accepts a pointer to an instance of `messaging_service` object, which contains the underlying seastar RPC protocol implementation, that is used to register verbs and pass messages. There is also a method to unregister all verbs at once: ``` static future<> unregister(netw::messaging_service* ms); ``` The following attributes are supported when declaring an RPC verb in the IDL: * `[[with_client_info]]` - the handler will contain a const reference to an `rpc::client_info` as the first argument. * `[[with_timeout]]` - an additional `time_point` parameter is supplied to the handler function and `send` method uses `send_message__timeout` variant of internal function to actually send the message. * `[[one_way]]` - the handler function is annotated by `future<rpc::no_wait_type>` return type to designate that a client doesn't need to wait for an answer. The `-> return_type` clause is optional for two-way messages. If omitted, the return type is set to be `future<>`. For one-way verbs, the use of return clause is prohibited and the signature of `send` function always returns `future<>`. No existing code is affected. Ref: #1456 Closes #9359 github.com:scylladb/scylla: idl: support generating boilerplate code for RPC verbs idl: allow specifying multiple attributes in the grammar message: messaging_service: extract RPC protocol details and helpers into a separate header	2021-10-05 18:05:24 +03:00
Pavel Solodovnikov	88f9f2e9d0	idl: support generating boilerplate code for RPC verbs Introduce new syntax in IDL compiler to allow generating registration/sending code for RPC verbs: verb [[attr1, attr2...]] my_verb (args...) -> return_type; `my_verb` RPC verb declaration corresponds to the `netw::messaging_verb::MY_VERB` enumeration value to identify the new RPC verb. For a given `idl_module.idl.hh` file, a registrator class named `idl_module_rpc_verbs` will be created if there are any RPC verbs registered within the IDL module file. These are the methods being created for each RPC verb: static void register_my_verb(netw::messaging_service* ms, std::function<return_type(args...)>&&); static future<> unregister_my_verb(netw::messaging_service* ms); static future<> send_my_verb(netw::messaging_service* ms, netw::msg_addr id, args...); Each method accepts a pointer to an instance of `messaging_service` object, which contains the underlying seastar RPC protocol implementation, that is used to register verbs and pass messages. There is also a method to unregister all verbs at once: static future<> unregister(netw::messaging_service* ms); The following attributes are supported when declaring an RPC verb in the IDL: * [[with_client_info]] - the handler will contain a const reference to an `rpc::client_info` as the first argument. * [[with_timeout]] - an additional `time_point` parameter is supplied to the handler function and `send` method uses `send_message__timeout` variant of internal function to actually send the message. * [[one_way]] - the handler function is annotated by `future<rpc::no_wait_type>` return type to designate that a client doesn't need to wait for an answer. The `-> return_type` clause is optional for two-way messages. If omitted, the return type is set to be `future<>`. For one-way verbs, the use of return clause is prohibited and the signature of `send*` function always returns `future<>`. No existing code is affected. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-09-30 02:21:57 +03:00
Piotr Sarna	6c4a71cdea	docs: add a WebAssembly entry The doc briefly describes the state of WASM support for user-defined functions.	2021-09-13 19:03:58 +02:00
Botond Dénes	0cc00b5d17	docs: design-notes: add reverse-reads.md Explaining how reverse reads work, in particular the difference between the legacy and native formats.	2021-09-09 11:49:02 +03:00
Benny Halevy	df442d4d24	messaging_service: never listen on port 0 We never want to listen on port 0, even if configured so. When the listen port is set to 0, the OS will choose the port randomly, which makes it useless for communicating with other nodes in the cluster, since we don't support that. Also, it causes the listen_ports_conf_test internode_ssl_test to fail since it expects to disable listening on storage_port or ssl_storage_port when set to 0, as seen in https://github.com/scylladb/scylla-dtest/issues/2174. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-06-30 16:24:54 +03:00
Kamil Braun	739c24b020	docs: update cdc.md with info about the new internal table	2021-05-25 16:07:23 +02:00
Asias He	b6104e5f44	doc: Update bootstrap with everywhere_topology Document how we choose node to sync with if everywhere_topology is used. Refs #8503 Closes #8518	2021-04-22 11:24:49 +03:00
Kamil Braun	e486e0f759	tree-wide: rename "cdc streams timestamp" to "cdc generation id" Each CDC generation always has a timestamp, but the fact that the timestamp identifies the generation is an implementation detail. We abstract away from this detail by using a more generic naming scheme: a generation "identifier" (whatever that is - a timestamp or something else). It's possible that a CDC generation will be identified by more than a timestamp in the (near) future. The actual string gossiped by nodes in their application state is left as "CDC_STREAMS_TIMESTAMP" for backward compatibility. Some stale comments have been updated.	2021-04-06 13:15:31 +02:00
Nadav Har'El	ccc75bfe2a	Merge 'Disable thrift by default' from Piotr Sarna The Thrift layer is functional, but it's not usually the first-choice protocol for Scylla users, so it's hereby disabled by default. Fixes #8336 Closes #8338 * github.com:scylladb/scylla: docs: mention disabling Thrift by default db,config: disable Thrift by default	2021-03-29 12:48:20 +03:00
Michał Chojnowski	8c45225f21	docs: remove the obsolete IMR design note IMR, as described in this design note, was removed in `001652815c`. This doc should have been removed back then, but was overlooked. Closes #8340	2021-03-29 10:58:05 +02:00
Piotr Sarna	b774d69ad2	docs: mention disabling Thrift by default Thrift is no longer enabled by default, so the documentation should mention that, as well as the suggested way of enabling it if necessary.	2021-03-22 14:32:51 +01:00
Juliusz Stasiewicz	382545a614	docs: explain SSL/non-SSL and shard-aware CQL ports I added a short description of shard-aware ports and updated the rules for disabling ports and enabling SSL introduced by #7992. Fixes #8146 Closes #8152	2021-03-09 22:48:30 +02:00
Calle Wilund	58489dc003	cql3::restrictions: Add SCYLLA_CLUSTERING_BOUND keyword for sstableloader Refs #8093 Refs /scylladb/scylla-tools-java#218 Adds keyword that can preface value tuples in (a, b, c) > (1, 2, 3) expressions, forcing the restriction to bypass column sort order treatment, and instead just create the raw ck bounds accordingly. This is a very limited, and simple version, but since we only need to cover this above exact syntax, this should be sufficient. v2: * Add small cql test v3: * Added comment in multi_column_restriction::slice, on what "mode" means and is for * Added small document of our internal CQL extension keywords, including this. v4: * Added a few more cases to tests to verify multi-column restrictions * Reworded docs a bit v5: * Fixed copy-paste error in comment v6: * Added negative (error) test cases v7: * Added check + reject of trying to combine SCYLLA_CLUST... slice and normal one Closes #8094	2021-03-03 07:06:45 +01:00
Kamil Braun	9bdd000e97	cdc: rewrite streams to the new description table Nodes automatically ensure that the latest CDC generation's list of streams is present in the streams description table. When a new generation appears, we only need to update the table for this generation; old generations are already inserted. However, we've changed the description table (from `cdc_streams_descriptions` to `cdc_streams_descriptions_v2`). The existing mechanism only ensures that the latest generation appears in the new description table. This commit adds an additional procedure that rewrites the older generations as well, if we find that it is necessary to do so (i.e. when some CDC log tables may contain data in these generations).	2021-02-18 11:44:59 +01:00
Kamil Braun	99cc9b8051	docs: cdc: mention system.cdc_local table	2021-02-18 11:44:59 +01:00
Kamil Braun	67d4e5576d	sys_dist_ks: split CDC streams table partitions into clustered rows Until now, the lists of streams in the `cdc_streams_descriptions` table for a given generation were stored in a single collection. This solution has multiple problems when dealing with large clusters (which produce large lists of streams): 1. large allocations 2. reactor stalls 3. mutations too large to even fit in commitlog segments This commit changes the schema of the table as described in issue #7993. The streams are grouped according to token ranges, each token range being represented by a separate clustering row. Rows are inserted in reasonably large batches for efficiency. The table is renamed to enable easy upgrade. On upgrade, the latest CDC generation's list of streams will be (re-)inserted into the new table. Yet another table is added: one that contains only the generation timestamps clustered in a single partition. This makes it easy for CDC clients to learn about new generations. It also enables an elegant two-phase insertion procedure of the generation description: first we insert the streams; only after ensuring that a quorum of replicas contains them, we insert the timestamp. Thus, if any client observes a timestamp in the timestamps table (even using a ONE query), it means that a quorum of replicas must contain the list of streams.	2021-02-18 11:44:59 +01:00
Avi Kivity	913d970c64	Merge "Unify inactive readers" from Botond " Currently inactive readers are stored in two different places: * reader concurrency semaphore * querier cache With the latter registering its inactive readers with the former. This is an unnecessarily complex (and possibly surprising) setup that we want to move away from. This series solves this by moving the responsibility of storing inactive reads solely to the reader concurrency semaphore, including all supported eviction policies. The querier cache is now only responsible for indexing queriers and maintaining relevant stats. This makes the ownership of the inactive readers much more clear, hopefully making Benny's work on introducing close() and abort() a little bit easier. Tests: unit(release, debug:v1) " * 'unify-inactive-readers/v2' of https://github.com/denesb/scylla: reader_concurrency_semaphore: store inactive readers directly querier_cache: store readers in the reader concurrency semaphore directly querier_cache: retire memory based cache eviction querier_cache: delegate expiry to the reader_concurrency_semaphore reader_concurrency_semaphore: introduce ttl for inactive reads querier_cache: use new eviction notify mechanism to maintain stats reader_concurrency_semaphore: add eviction notification facility reader_concurrency_semaphore: extract evict code into method evict()	2021-02-03 10:59:04 +02:00
dgarcia360	fd5f0c3034	docs: add organization Closes #7818	2020-12-22 15:33:31 +02:00

42 Commits