42 Commits

Author SHA1 Message Date
Raphael S. Carvalho
d79fb9a12f docs: Update compaction controller doc
The doc is being updated to reflect the changes in the commit
d8833de3bb ("Redefine Compaction Backlog to tame
compaction aggressiveness").

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-04-26 10:50:45 +03:00
Piotr Sarna
20de52d96c docs: add a paragraph on keyspace storage options
A new CQL extension, which allows specifying keyspace storage options,
is now described in our design notes.
2022-04-08 09:17:01 +02:00
Wojciech Mitros
8a9d55d3a1 wasm: add wasm ABI version 2
Because the only available version of wasm ABI did not allow
freeing any allocated memory, a new version of the ABI is
introduced. In this version, the host is required to export
_scylla_malloc and _scylla_free methods, which are later used
for the memory management.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 20:49:35 +02:00
Wojciech Mitros
1f81e05d52 wasm: add documentation
The ABI of wasm UDFs changed since the last time the documentation
was written, so it's being updated in this patch.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 19:44:30 +02:00
Nadav Har'El
f76f6dbccb secondary index: avoid special characters in default index names
In CQL, table names are limited to so-called word characters (letters,
numbers and underscores), but column names don't have such a limitation.
When we create a secondary index, its default name is constructed from
the column name, so it can contain problematic characters. It can even
include the "/" character. The problem is that the index name is then used,
like a table name, to create a directory with that name.

The test included in this patch demonstrates that before this patch, this
can be misused to create subdirectories anywhere in the filesystem, or to
crash Scylla when it fails to create a directory (which it considers an
unrecoverable I/O error).

In this patch we do what Cassandra does - remove all non-word
characters from the indexed column name before constructing the default
index name. In the included test - which can run on both Scylla and
Cassandra - we verify that the constructed index name is the same as
in Cassandra, which is useful to know (e.g., because knowing the index
name is needed to DROP the index).
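
The sanitization described above can be sketched as follows. This is a minimal Python illustration, not the code from the patch: the function name `default_index_name` and the `<table>_<column>_idx` naming pattern are assumptions made for this example.

```python
import re


def default_index_name(table: str, column: str) -> str:
    """Build a default index name from a table name and a column name.

    The "<table>_<column>_idx" pattern is an assumption for illustration.
    """
    # Drop every character that is not an ASCII letter, digit or underscore,
    # so the result can never contain "/" or other path-like characters.
    safe_column = re.sub(r"\W", "", column, flags=re.ASCII)
    return f"{table}_{safe_column}_idx"
```

With this sketch, a column named `a/b` on table `tbl` would yield `tbl_ab_idx`, which is safe to use as a directory name.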

Also, this patch adds a second line of defense against the security problem
described above: It is now an error to create a schema with a slash or
null (the two characters not allowed in Unix filenames) in the keyspace
or table names. So if the first line of defense (CQL checking the validity
of its commands) fails, we'll have that second line of defense. I verified
that if I revert the default-index-name fix, the second line of defense
kicks in, and the index creation is aborted and cannot create files in
the wrong place to crash Scylla.
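
The second line of defense can be sketched in the same spirit. The helper name `validate_schema_name` and the use of `ValueError` are hypothetical; the only rule taken from the message is that a slash or a null character in a keyspace or table name is an error.

```python
# The two characters not allowed in Unix filenames.
FORBIDDEN_CHARS = ("/", "\0")


def validate_schema_name(name: str) -> None:
    """Raise ValueError if the name could not be used as a directory name."""
    for ch in FORBIDDEN_CHARS:
        if ch in name:
            raise ValueError(f"invalid character {ch!r} in schema name {name!r}")
```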

Fixes #3403

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220320162543.3091121-1-nyh@scylladb.com>
2022-03-20 18:33:48 +02:00
Pavel Emelyanov
d586805054 docs: Add system.clients description
There's a document that sums up the tables from the system keyspace, and
it's missing the clients table. This set is going to reimplement the
table while keeping the schema intact, so it's a good time to document it
right at the beginning.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-18 14:25:07 +03:00
Michał Sala
4903f7a314 docs: add parallel aggregations design doc
The added document describes the design of a mechanism that parallelizes
the execution of aggregation queries.
2022-02-02 17:52:22 +01:00
Botond Dénes
8ac7c4f523 docs/design-notes/IDL.md: fix typo: s/on only/only/
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20220118094416.242409-1-bdenes@scylladb.com>
2022-01-18 12:30:39 +02:00
Gleb Natapov
dc886d96d1 idl-compiler: update the documentation with new features added recently
The series to move storage_proxy verbs to the IDL added new features to
the IDL compiler, but lacked documentation. This patch documents
those features.
2022-01-16 15:12:07 +02:00
Piotr Sarna
a36c8990ab docs: move service_levels.md to design-notes
Along the way, our flat structure for docs was changed
to categorize the documents, but service_levels.md was forward-ported
later and missed the created directory structure, so it was created
as a sole document in the top directory. Move it to where the other
similar docs live.

Message-Id: <68079d9dd511574ee32fce15fec541ca75fca1e2.1640248754.git.sarna@scylladb.com>
2021-12-26 14:10:52 +02:00
Piotr Sarna
483a98aa14 docs: add AssemblyScript example to wasm.md
The paragraph about WebAssembly missed a very useful language,
AssemblyScript. An example for it is provided in this patch.

Message-Id: <8d6ea1038f2944917316de29c7ca5cce88b2a148.1640248754.git.sarna@scylladb.com>
2021-12-26 14:10:52 +02:00
Tzach Livyatan
d6fbabbf8c fix typo in repair_based_node_ops.md
Fix https://github.com/scylladb/scylla/issues/9786

Closes #9788
2021-12-15 09:56:21 +02:00
Nadav Har'El
f9673309aa docs: protocols.md - add information on Redis listening address
The description in protocols.md of the Redis protocol server in Scylla
explains how its port can be configured, but not how the listening IP
address can be configured. It turns out that the same "rpc_address" that
controls CQL's and Thrift's IP address also applies to Redis. So let's
document that.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211208160206.1290916-1-nyh@scylladb.com>
2021-12-08 20:14:52 +01:00
Gavin Howell
c6e0a807b4 Update wasm.md
Grammar correction, sentence re-write.

Closes #9760
2021-12-08 10:24:53 +01:00
David Garcia
954d5d5d63 Fix cql docs error
Closes #9613
2021-12-02 09:58:58 +02:00
GavinJE
22fa7ecf99 Update compaction_controller.md
Line 15.

"ee" changed to "they"

Closes #9651
2021-11-19 14:19:20 +03:00
Michael Livshin
a7511cf600 system keyspace: record partitions with too many rows
Add "rows" field to system.large_partitions.  Add partitions to the
table when they are too large or have too many rows.

Fixes #9506

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>

Closes #9577
2021-11-14 14:25:18 +02:00
Pavel Emelyanov
4a70e0aa57 system_keyspace: Table with config options
A config option value is reported as 'text' type and contains
a string as it would look in the JSON config.

The table is UPDATE-able. Only the 'value' column can be set,
and the accepted value must be a string. It is converted into
the option type automatically; however, in the current implementation
the conversion is not 100% precise: it is a lexical cast, which
only works for simple types. However, liveupdate-able values are
only of those types, so it works in the supported cases.
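
The string-to-type conversion described above can be illustrated with a small Python sketch. It is not the Scylla implementation: the `set_option` helper and the option names in the usage note are hypothetical, and the sketch only mirrors the idea that a textual value is cast to the option's existing simple type.

```python
def set_option(options: dict, name: str, text: str) -> None:
    """Convert `text` to the type of the option's current value and store it."""
    current = options[name]
    if isinstance(current, bool):
        # bool("false") would be True, so booleans need explicit parsing.
        # This check must come before int, because bool is a subclass of int.
        lowered = text.strip().lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"not a boolean: {text!r}")
        options[name] = lowered == "true"
    elif isinstance(current, (int, float, str)):
        # Simple types: the constructor acts as the cast from text.
        options[name] = type(current)(text)
    else:
        raise TypeError("only simple option types can be updated from text")
```

For example, with `options = {"rate": 10}`, calling `set_option(options, "rate", "42")` stores the integer `42`, while an option holding a list is rejected.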

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Botond Dénes
d51aa66a8a db/system_keyspace: add versions table
Contains all version related information (`nodetool version` and more).
Example printout:

    (cqlsh) select * from system.versions;

     key   | build_id                                 | build_mode | version
    -------+------------------------------------------+------------+-------------------------------
     local | aaecce2f5068b0160efd04a09b0e28e100b9cd9e |        dev | 4.6.dev-0.20211021.0d744fd3fa
2021-11-05 15:42:42 +02:00
Botond Dénes
89cc016f07 db/system_keyspace: add runtime_info table
Loosely contains the equivalent of the `nodetool info` command, with some
notable differences:
* Protocol server related information is in `system.protocol_servers`;
* Information about memory, memtable and cache is reformatted to be
  tailored to scylla: C* specific terminology and metrics are dropped;
* Information that doesn't change and is already in `system.local` is
  not contained;
* Added trace-probability too (`nodetool gettraceprobability`);

TODO(follow-up): exceptions.
2021-11-05 15:42:42 +02:00
Botond Dénes
78adda197f db/system_keyspace: add protocol_servers table
Lists all the client protocol servers and their status. Example output:

    (cqlsh) select * from system.protocol_servers;

     name             | is_running | listen_addresses                      | protocol | protocol_version
    ------------------+------------+---------------------------------------+----------+------------------
     native transport |       True | ['127.0.0.1:9042', '127.0.0.1:19042'] |      cql |            3.3.1
           alternator |      False |                                    [] | dynamodb |
                  rpc |      False |                                    [] |   thrift |           20.1.0
                redis |      False |                                    [] |    redis |

This prints the equivalent of `nodetool statusbinary` and the "Thrift
active" and "Native Transport active" fields from the `nodetool info`
output with some additional information:
* It contains alternator and redis status;
* It contains the protocol version;
* It contains the listen addresses (if respective server is running);
2021-11-05 15:42:42 +02:00
Botond Dénes
64f658aea4 db/system_keyspace: add snapshots virtual table
Lists the equivalent of the `nodetool listsnapshots` command.
2021-11-05 15:42:41 +02:00
Botond Dénes
185c5f1f5b docs/design-notes/system_keyspace.md: add listing of existing virtual tables
As well as a link to the newly added docs/guides/virtual-tables.md
2021-11-05 15:42:39 +02:00
garanews
7a6a59eb7c fix some typo in docs
Closes #9510
2021-11-02 19:59:16 +03:00
Avi Kivity
0ea79559a6 Merge 'IDL: support generating boilerplate code for RPC verbs' from Pavel Solodovnikov
Introduce new syntax in IDL compiler to allow generating
registration/sending code for RPC verbs:

```
        verb [[attr1, attr2...]] my_verb (args...) -> return_type;
```

`my_verb` RPC verb declaration corresponds to the
`netw::messaging_verb::MY_VERB` enumeration value to identify the
new RPC verb.

For a given `idl_module.idl.hh` file, a registrator class named
`idl_module_rpc_verbs` will be created if there are any RPC verbs
registered within the IDL module file.

These are the methods being created for each RPC verb:

```
        static void register_my_verb(netw::messaging_service* ms, std::function<return_type(args...)>&&);
        static future<> unregister_my_verb(netw::messaging_service* ms);
        static future<> send_my_verb(netw::messaging_service* ms, netw::msg_addr id, args...);
```

Each method accepts a pointer to an instance of `messaging_service`
object, which contains the underlying seastar RPC protocol
implementation, that is used to register verbs and pass messages.

There is also a method to unregister all verbs at once:

```
        static future<> unregister(netw::messaging_service* ms);
```
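
The naming conventions described in this message can be modeled with a short Python sketch. It is only an illustration of the mapping from a verb name and an IDL file name to the generated identifiers; the helper names `generated_names` and `registrator_class` are hypothetical and this is not the IDL compiler's code.

```python
def generated_names(verb: str) -> dict:
    """Map an IDL verb name to the identifiers described in the message."""
    return {
        "enum_value": f"netw::messaging_verb::{verb.upper()}",
        "register": f"register_{verb}",
        "unregister": f"unregister_{verb}",
        "send": f"send_{verb}",
    }


def registrator_class(idl_file: str) -> str:
    """Derive the registrator class name, e.g. idl_module.idl.hh -> idl_module_rpc_verbs."""
    suffix = ".idl.hh"
    if not idl_file.endswith(suffix):
        raise ValueError(f"not an IDL file name: {idl_file!r}")
    return f"{idl_file[:-len(suffix)]}_rpc_verbs"
```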

The following attributes are supported when declaring an RPC verb
in the IDL:
* `[[with_client_info]]` - the handler will contain a const reference to
  an `rpc::client_info` as the first argument.
* `[[with_timeout]]` - an additional `time_point` parameter is supplied
  to the handler function and `send*` method uses `send_message_*_timeout`
  variant of internal function to actually send the message.
* `[[one_way]]` - the handler function is annotated by
  `future<rpc::no_wait_type>` return type to designate that a client
  doesn't need to wait for an answer.

The `-> return_type` clause is optional for two-way messages. If omitted,
the return type is set to be `future<>`.
For one-way verbs, the use of return clause is prohibited and the
signature of `send*` function always returns `future<>`.

No existing code is affected.

Ref: #1456

Closes #9359

* github.com:scylladb/scylla:
  idl: support generating boilerplate code for RPC verbs
  idl: allow specifying multiple attributes in the grammar
  message: messaging_service: extract RPC protocol details and helpers into a separate header
2021-10-05 18:05:24 +03:00
Pavel Solodovnikov
88f9f2e9d0 idl: support generating boilerplate code for RPC verbs
Introduce new syntax in IDL compiler to allow generating
registration/sending code for RPC verbs:

        verb [[attr1, attr2...]] my_verb (args...) -> return_type;

`my_verb` RPC verb declaration corresponds to the
`netw::messaging_verb::MY_VERB` enumeration value to identify the
new RPC verb.

For a given `idl_module.idl.hh` file, a registrator class named
`idl_module_rpc_verbs` will be created if there are any RPC verbs
registered within the IDL module file.

These are the methods being created for each RPC verb:

        static void register_my_verb(netw::messaging_service* ms, std::function<return_type(args...)>&&);
        static future<> unregister_my_verb(netw::messaging_service* ms);
        static future<> send_my_verb(netw::messaging_service* ms, netw::msg_addr id, args...);

Each method accepts a pointer to an instance of `messaging_service`
object, which contains the underlying seastar RPC protocol
implementation, that is used to register verbs and pass messages.

There is also a method to unregister all verbs at once:

        static future<> unregister(netw::messaging_service* ms);

The following attributes are supported when declaring an RPC verb
in the IDL:
* [[with_client_info]] - the handler will contain a const reference to
  an `rpc::client_info` as the first argument.
* [[with_timeout]] - an additional `time_point` parameter is supplied
  to the handler function and `send*` method uses `send_message_*_timeout`
  variant of internal function to actually send the message.
* [[one_way]] - the handler function is annotated by
  `future<rpc::no_wait_type>` return type to designate that a client
  doesn't need to wait for an answer.

The `-> return_type` clause is optional for two-way messages. If omitted,
the return type is set to be `future<>`.
For one-way verbs, the use of return clause is prohibited and the
signature of `send*` function always returns `future<>`.
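
The return-type rules above can be summarized in a small Python sketch. The function name `resolve_return_type` and the use of `ValueError` are assumptions for illustration; the sketch models only the rules stated in this message, not the compiler itself.

```python
from typing import Optional


def resolve_return_type(one_way: bool, declared: Optional[str]) -> str:
    """Return the effective return type for a verb declaration.

    `declared` is the text after "->", or None when the clause is omitted.
    """
    if one_way:
        # One-way verbs must not carry a return clause.
        if declared is not None:
            raise ValueError("one-way verbs must not declare a return type")
        return "future<>"
    # Two-way verbs default to future<> when the clause is omitted.
    return declared if declared is not None else "future<>"
```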

No existing code is affected.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-09-30 02:21:57 +03:00
Piotr Sarna
6c4a71cdea docs: add a WebAssembly entry
The doc briefly describes the state of WASM support
for user-defined functions.
2021-09-13 19:03:58 +02:00
Botond Dénes
0cc00b5d17 docs: design-notes: add reverse-reads.md
Explaining how reverse reads work, in particular the difference between
the legacy and native formats.
2021-09-09 11:49:02 +03:00
Benny Halevy
df442d4d24 messaging_service: never listen on port 0
We never want to listen on port 0, even if configured so.
When the listen port is set to 0, the OS will choose the
port randomly, which makes it useless for communicating
with other nodes in the cluster, since we don't support that.
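
The behavior of port 0 is easy to observe with a short Python sketch using the standard `socket` module. The helper names `bind_ephemeral` and `should_listen` are hypothetical; the second function only restates the rule from this commit and is not the messaging service code.

```python
import socket


def bind_ephemeral(host: str = "127.0.0.1") -> int:
    """Bind a TCP socket to port 0 and return the port the OS actually chose."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # The second element of the address tuple is the assigned port.
        return sock.getsockname()[1]


def should_listen(configured_port: int) -> bool:
    """Treat a configured port of 0 as "do not listen"."""
    return configured_port != 0
```

Calling `bind_ephemeral()` twice will typically return two different non-zero ports, which is why a peer has no way of knowing where to connect.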

Also, it causes the listen_ports_conf_test internode_ssl_test
to fail since it expects to disable listening on storage_port
or ssl_storage_port when set to 0, as seen in
https://github.com/scylladb/scylla-dtest/issues/2174.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-06-30 16:24:54 +03:00
Kamil Braun
739c24b020 docs: update cdc.md with info about the new internal table
2021-05-25 16:07:23 +02:00
Asias He
b6104e5f44 doc: Update bootstrap with everywhere_topology
Document how we choose node to sync with if everywhere_topology is used.

Refs #8503

Closes #8518
2021-04-22 11:24:49 +03:00
Kamil Braun
e486e0f759 tree-wide: rename "cdc streams timestamp" to "cdc generation id"
Each CDC generation always has a timestamp, but the fact that the
timestamp identifies the generation is an implementation detail.
We abstract away from this detail by using a more generic naming scheme:
a generation "identifier" (whatever that is - a timestamp or something
else).

It's possible that a CDC generation will be identified by more than a
timestamp in the (near) future.

The actual string gossiped by nodes in their application state is left
as "CDC_STREAMS_TIMESTAMP" for backward compatibility.

Some stale comments have been updated.
2021-04-06 13:15:31 +02:00
Nadav Har'El
ccc75bfe2a Merge 'Disable thrift by default' from Piotr Sarna
The Thrift layer is functional, but it's not usually the first-choice protocol for Scylla users, so it's hereby disabled by default.

Fixes #8336

Closes #8338

* github.com:scylladb/scylla:
  docs: mention disabling Thrift by default
  db,config: disable Thrift by default
2021-03-29 12:48:20 +03:00
Michał Chojnowski
8c45225f21 docs: remove the obsolete IMR design note
IMR, as described in this design note, was removed in 001652815c.
This doc should have been removed back then, but was overlooked.

Closes #8340
2021-03-29 10:58:05 +02:00
Piotr Sarna
b774d69ad2 docs: mention disabling Thrift by default
Thrift is no longer enabled by default, so the documentation
should mention that, as well as the suggested way of enabling it
if necessary.
2021-03-22 14:32:51 +01:00
Juliusz Stasiewicz
382545a614 docs: explain SSL/non-SSL and shard-aware CQL ports
I added a short description of shard-aware ports and updated the rules
for disabling ports and enabling SSL introduced by #7992.

Fixes #8146

Closes #8152
2021-03-09 22:48:30 +02:00
Calle Wilund
58489dc003 cql3::restrictions: Add SCYLLA_CLUSTERING_BOUND keyword for sstableloader
Refs #8093
Refs /scylladb/scylla-tools-java#218

Adds keyword that can preface value tuples in (a, b, c) > (1, 2, 3)
expressions, forcing the restriction to bypass column sort order
treatment, and instead just create the raw ck bounds accordingly.

This is a very limited, and simple version, but since we only need
to cover this above exact syntax, this should be sufficient.

v2:
* Add small cql test
v3:
* Added comment in multi_column_restriction::slice, on what "mode" means and is for
* Added small document of our internal CQL extension keywords, including this.
v4:
* Added a few more cases to tests to verify multi-column restrictions
* Reworded docs a bit
v5:
* Fixed copy-paste error in comment
v6:
* Added negative (error) test cases
v7:
* Added check + reject of trying to combine SCYLLA_CLUST... slice and
  normal one

Closes #8094
2021-03-03 07:06:45 +01:00
Kamil Braun
9bdd000e97 cdc: rewrite streams to the new description table
Nodes automatically ensure that the latest CDC generation's list of
streams is present in the streams description table. When a new
generation appears, we only need to update the table for this
generation; old generations are already inserted.

However, we've changed the description table (from
`cdc_streams_descriptions` to `cdc_streams_descriptions_v2`). The
existing mechanism only ensures that the latest generation appears in
the new description table. This commit adds an additional procedure that
rewrites the older generations as well, if we find that it is necessary
to do so (i.e. when some CDC log tables may contain data in these
generations).
2021-02-18 11:44:59 +01:00
Kamil Braun
99cc9b8051 docs: cdc: mention system.cdc_local table
2021-02-18 11:44:59 +01:00
Kamil Braun
67d4e5576d sys_dist_ks: split CDC streams table partitions into clustered rows
Until now, the lists of streams in the `cdc_streams_descriptions` table
for a given generation were stored in a single collection. This solution
has multiple problems when dealing with large clusters (which produce
large lists of streams):
1. large allocations
2. reactor stalls
3. mutations too large to even fit in commitlog segments

This commit changes the schema of the table as described in issue #7993.
The streams are grouped according to token ranges, each token range
being represented by a separate clustering row. Rows are inserted in
reasonably large batches for efficiency.
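
The row layout and batching described above can be sketched in Python. This is a toy model under stated assumptions: the row shape `(timestamp, range_end, streams)`, the input mapping, and the helper names are all hypothetical and do not reflect the actual table schema.

```python
from itertools import islice


def rows_for_generation(timestamp, streams_by_range_end):
    """Produce one row per token range: (timestamp, range end, stream IDs)."""
    return [
        (timestamp, range_end, tuple(streams))
        for range_end, streams in sorted(streams_by_range_end.items())
    ]


def batches(rows, batch_size):
    """Yield consecutive lists of at most `batch_size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

Each batch would then be written as one mutation, so no single write has to carry the whole generation.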

The table is renamed to enable easy upgrade. On upgrade, the latest CDC
generation's list of streams will be (re-)inserted into the new table.

Yet another table is added: one that contains only the generation
timestamps clustered in a single partition. This makes it easy for CDC
clients to learn about new generations. It also enables an elegant
two-phase insertion procedure of the generation description: first we
insert the streams; only after ensuring that a quorum of replicas
contains them, we insert the timestamp. Thus, if any client observes a
timestamp in the timestamps table (even using a ONE query),
it means that a quorum of replicas must contain the list of streams.
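
The two-phase insertion can be modeled with a toy Python simulation. Everything here is an assumption for illustration: `Replica`, `publish_generation`, and the `up` list (a stand-in for which replica writes succeed) are hypothetical and much simpler than real replicated writes.

```python
class Replica:
    def __init__(self):
        self.streams = {}        # generation timestamp -> list of stream IDs
        self.timestamps = set()  # generation timestamps visible to clients


def quorum(n: int) -> int:
    return n // 2 + 1


def publish_generation(replicas, up, timestamp, streams) -> bool:
    """Insert streams first; insert the timestamp only after a quorum has them."""
    # Phase 1: write the stream list to every reachable replica.
    acked = 0
    for i in up:
        replicas[i].streams[timestamp] = list(streams)
        acked += 1
    if acked < quorum(len(replicas)):
        return False  # not enough replicas hold the streams: do not publish
    # Phase 2: make the generation visible by writing its timestamp.
    for i in up:
        replicas[i].timestamps.add(timestamp)
    return True
```

In this model, a client that sees a timestamp on any single replica can rely on a quorum of replicas holding the matching stream list.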
2021-02-18 11:44:59 +01:00
Avi Kivity
913d970c64 Merge "Unify inactive readers" from Botond
"
Currently inactive readers are stored in two different places:
* reader concurrency semaphore
* querier cache
With the latter registering its inactive readers with the former. This
is an unnecessarily complex (and possibly surprising) setup that we want
to move away from. This series solves this by moving the responsibility
for storing inactive reads solely to the reader concurrency semaphore,
including all supported eviction policies. The querier cache is now only
responsible for indexing queriers and maintaining relevant stats.
This makes the ownership of the inactive readers much more clear,
hopefully making Benny's work on introducing close() and abort() a
little bit easier.
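
The ownership model described above, with a single owner of inactive readers, a TTL, and an eviction notification, can be sketched in Python. This is a toy model and not the reader concurrency semaphore: the class `InactiveReadRegistry` and its methods are hypothetical.

```python
import time


class InactiveReadRegistry:
    """Toy single owner of inactive readers with TTL-based eviction."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # handle -> (reader, deadline, on_evict)
        self._next_handle = 0

    def register(self, reader, ttl, on_evict=None) -> int:
        """Store an inactive reader and return a handle for later lookup."""
        handle = self._next_handle
        self._next_handle += 1
        self._entries[handle] = (reader, self._clock() + ttl, on_evict)
        return handle

    def unregister(self, handle):
        """Take the reader back; returns None if it was already evicted."""
        entry = self._entries.pop(handle, None)
        return entry[0] if entry is not None else None

    def evict_expired(self) -> int:
        """Evict readers past their deadline and notify their owners."""
        now = self._clock()
        expired = [h for h, entry in self._entries.items() if entry[1] <= now]
        for handle in expired:
            _, _, on_evict = self._entries.pop(handle)
            if on_evict is not None:
                on_evict()  # lets an index such as a cache update its stats
        return len(expired)
```

An index layered on top keeps only handles and learns about evictions through the callback, so it never has to own the readers itself.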

Tests: unit(release, debug:v1)
"

* 'unify-inactive-readers/v2' of https://github.com/denesb/scylla:
  reader_concurrency_semaphore: store inactive readers directly
  querier_cache: store readers in the reader concurrency semaphore directly
  querier_cache: retire memory based cache eviction
  querier_cache: delegate expiry to the reader_concurrency_semaphore
  reader_concurrency_semaphore: introduce ttl for inactive reads
  querier_cache: use new eviction notify mechanism to maintain stats
  reader_concurrency_semaphore: add eviction notification facility
  reader_concurrency_semaphore: extract evict code into method evict()
2021-02-03 10:59:04 +02:00
dgarcia360
fd5f0c3034 docs: add organization
Closes #7818
2020-12-22 15:33:31 +02:00