## Service Level Distributed Data

Two system tables support the service level feature.

### Service Level Attachment Table

```
CREATE TABLE system_auth.role_attributes (
    role text,
    attribute_name text,
    attribute_value text,
    PRIMARY KEY (role, attribute_name))
```

The table is deliberately generic, but its purpose is to record
information about roles. The columns have the following meanings:

- *role* - the name of the role that the attribute belongs to.
- *attribute_name* - the name of the attribute for the role.
- *attribute_value* - the value of the specified attribute.

For the service level, the relevant attribute name is `service_level`.
For example, to find out which service level is attached to role `r`,
run the following query:

```
SELECT * FROM system_auth.role_attributes WHERE role='r' AND attribute_name='service_level';
```

### Service Level Configuration Table

```
CREATE TABLE system_distributed.service_levels (
    service_level text PRIMARY KEY,
    timeout duration,
    workload_type text)
```

The table stores and distributes the service level configuration.
The columns have the following meanings:

- *service_level* - the name of the service level.
- *timeout* - the timeout for operations performed by users under this service level.
- *workload_type* - the type of workload declared for this service level (`unspecified`, `interactive` or `batch`).

```
select * from system_distributed.service_levels ;

 service_level | timeout | workload_type
---------------+---------+---------------
            sl |   500ms |   interactive
```

### Service Level Timeout

A service level timeout assigns a default timeout value to all operations performed under a particular service level.

A service level timeout takes precedence over the default timeout values from the `scylla.yaml` configuration
file, but it can still be superseded by a per-query timeout (a query issued with the `USING TIMEOUT` directive).

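
The precedence described above can be modeled as a small function. This is an illustrative Python sketch, not Scylla's implementation: the function name, the parameter names and the use of milliseconds are all assumptions made for the example, and `None` stands for "not set".

```python
from typing import Optional


def pick_timeout_ms(
    per_query_ms: Optional[int],
    service_level_ms: Optional[int],
    config_default_ms: int,
) -> int:
    """Pick the timeout to apply (hypothetical helper, for illustration only).

    Precedence, highest first:
      1. per-query timeout (USING TIMEOUT),
      2. service level timeout,
      3. default from the scylla.yaml configuration.
    """
    if per_query_ms is not None:
        return per_query_ms
    if service_level_ms is not None:
        return service_level_ms
    return config_default_ms


# Service level timeout overrides the configuration default.
print(pick_timeout_ms(None, 50, 5000))
# A per-query timeout overrides the service level timeout.
print(pick_timeout_ms(20, 50, 5000))
# With neither set, the configuration default applies.
print(pick_timeout_ms(None, None, 5000))
```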
To set a timeout for a service level, create or alter it with the appropriate parameters, e.g.:
```
create service level sl with timeout = 50ms;
list all service levels;

 service_level | timeout
---------------+---------
            sl |    50ms
```

To restore the default timeout value (from the `scylla.yaml` file), set the service level timeout to null:
```
alter service level sl with timeout = null;
list all service levels;

 service_level | timeout
---------------+---------
            sl |    null
```

#### Combining service level timeouts from multiple roles

A single role may be granted multiple other roles, which means that more than one service level may be in effect
for a particular user. Multiple timeout values are combined by taking the minimum of all effective
timeouts. Example:

- role1: `timeout = 1s`
- role2: `timeout = 50ms`
- role3: `timeout = 2s`
- role4: `timeout = 10ms`

The granting hierarchy is as follows, with role1 inheriting from role2, which in turn
inherits from role3 and role4:

    role4   role3
        \   /
        role2
        /
    role1

With the roles granted as above, the effective timeouts are:

- role1: `timeout = 10ms`
- role2: `timeout = 10ms`
- role3: `timeout = 2s`
- role4: `timeout = 10ms`

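
The minimum rule can be checked against the example with a short Python sketch. It is only a model of the rule as described here, not Scylla's code: the dictionaries, the function name and the millisecond unit are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Timeouts configured per role in the example, in milliseconds.
OWN_TIMEOUT_MS: Dict[str, Optional[int]] = {
    "role1": 1000,
    "role2": 50,
    "role3": 2000,
    "role4": 10,
}

# GRANTED[r] lists the roles granted directly to r (r inherits from them).
GRANTED: Dict[str, List[str]] = {
    "role1": ["role2"],
    "role2": ["role3", "role4"],
    "role3": [],
    "role4": [],
}


def effective_timeout_ms(
    role: str,
    own: Dict[str, Optional[int]],
    granted: Dict[str, List[str]],
) -> Optional[int]:
    """Minimum timeout over the role and every role it inherits from.

    Returns None if no timeout is set anywhere in the hierarchy.
    """
    seen = set()
    stack = [role]
    candidates = []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against visiting a role twice
        seen.add(current)
        timeout = own.get(current)
        if timeout is not None:
            candidates.append(timeout)
        stack.extend(granted.get(current, []))
    return min(candidates) if candidates else None


for name in ("role1", "role2", "role3", "role4"):
    print(name, effective_timeout_ms(name, OWN_TIMEOUT_MS, GRANTED))
```

Under these assumptions the sketch yields 10 ms for role1, role2 and role4, and 2000 ms for role3, matching the effective timeouts listed above.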
### Workload types

A workload type can be declared for a service level. Three values are currently available:

1. `unspecified` - generic workload without any specific characteristics; default
2. `interactive` - workload sensitive to latency, expected to have high/unbounded concurrency,
   with dynamic characteristics, OLTP;
   example: users clicking on a website and generating events with their clicks
3. `batch` - workload for processing large amounts of data, not sensitive to latency, expected to have
   fixed concurrency, OLAP, ETL;
   example: processing billions of historical sales records to generate useful statistics

Declaring a workload type gives Scylla more context for deciding how to handle the sessions.
For instance, if a coordinator node receives requests at a higher rate than it can handle,
it makes different decisions depending on the declared workload type:
- for batch workloads it makes sense to apply backpressure - the concurrency is assumed to be fixed,
  so delaying a reply will likely also reduce the rate at which new requests are sent;
- for interactive workloads, backpressure would only waste resources - delaying a reply does not
  decrease the rate of incoming requests, so it's reasonable for the coordinator to start shedding
  surplus requests.

If multiple workload types apply to a role, the combination is consistent if:
- all the applicable workload types are identical, or
- some of the service levels do not have any workload type specified.

Otherwise, for example if a role has several different workload types declared,
the conflicts are resolved as follows:
- `X` vs `unspecified` -> `X`
- `batch` vs `interactive` -> `batch` - under the assumption that `batch` is safer, because it would not trigger load shedding as eagerly as `interactive`
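
The resolution rules above can be expressed as a pairwise merge folded over all applicable workload types. The following Python sketch is a model of those rules only, not Scylla's implementation; the function names are hypothetical, and treating an empty list as `unspecified` is an assumption based on `unspecified` being the default.

```python
from functools import reduce
from typing import Iterable

VALID_TYPES = {"unspecified", "interactive", "batch"}


def merge_workload_types(a: str, b: str) -> str:
    """Resolve two workload types according to the rules listed above."""
    for value in (a, b):
        if value not in VALID_TYPES:
            raise ValueError(f"unknown workload type: {value!r}")
    if a == "unspecified":
        return b  # X vs unspecified -> X
    if b == "unspecified":
        return a  # X vs unspecified -> X
    if a == b:
        return a  # identical types need no resolution
    return "batch"  # batch vs interactive -> batch


def resolve_workload_type(types: Iterable[str]) -> str:
    """Fold the pairwise rule over every workload type that applies to a role."""
    return reduce(merge_workload_types, types, "unspecified")


print(resolve_workload_type(["interactive", "unspecified"]))
print(resolve_workload_type(["interactive", "batch", "unspecified"]))
```

Because the pairwise rule is commutative and associative, the order in which a role's service levels are visited does not affect the result.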