28 Commits

Author SHA1 Message Date
Andrzej Jackowski
f8156702de tree: add missing -present to copyright headers
~2076 files used "Copyright (C) YYYY-present ScyllaDB" while
~88 files used "Copyright (C) YYYY ScyllaDB". This
inconsistency leads to unnecessary code review discussions
and gradual spread of the less common format.

Standardize all ScyllaDB copyright headers to use -present.

Fixes SCYLLADB-1984

Closes scylladb/scylladb#29876
2026-05-21 10:57:42 +02:00
Andrzej Jackowski
78bd361919 audit: refresh rule caches on schema, role, and config changes
Schema, role, and config changes must refresh the
preprocessed rule cache, otherwise the fast path serves
stale matches after reconfiguration or metadata changes.

Register a migration listener for table/view create/drop.
Observe audit_rules config changes through a serialized
action so concurrent rebuilds collapse. Add hooks for role
create/drop and a set_known_entities() bulk-load method.
Implement real cleanup in shutdown() (previously a no-op)
and roll back cleanly on start failure.

Refs SCYLLADB-1430
2026-05-20 06:55:15 +02:00
Andrzej Jackowski
465f8f4d8d audit: route matching rules to configured sinks
Rule-based routing must coexist with legacy
category/keyspace/table filtering so operators who have
not opted into rules keep their existing behavior.

Merge rule-matched sinks into the event's sink set
alongside legacy matches. Add a username parameter to
should_log_login/sinks_for_login so rules can match the
authenticated role. Use a conservative over-approximation
for the fast will-log check since the role is not yet
known at that call site. Log an error at startup when
rules reference sinks not enabled globally. Log a warning
when rules are configured but audit is disabled.

Refs SCYLLADB-1430
2026-05-20 06:55:15 +02:00
Andrzej Jackowski
6354daa8d7 audit: pass sink targets to storage helpers
Per-rule routing needs each audit event to carry its
target sinks so storage helpers can self-filter without
duplicating writes.

Replace should_log() with sinks_for() returning an
audit_sink_set and add sinks_for_login() for the login
path. Move the early-return filtering check from the
static inspect() caller into audit::log() so it uses the
new sinks_for() directly. Pass the sink set to
storage_helper::write() so each helper only fires when its
sink is included. Rename parse_audit_modes to
parse_audit_sinks.

Refs SCYLLADB-1430
2026-05-20 06:55:15 +02:00
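To illustrate the self-filtering idea from 6354daa8d7, here is a minimal sketch. Only the names audit_sink_set and the sink values "table" and "syslog" come from the log; the enum, the bitmask layout, and sink_helper are hypothetical, and the real storage helpers are asynchronous.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sink identifiers, one bit each.
enum class audit_sink : std::uint8_t { table = 1 << 0, syslog = 1 << 1 };

// Hypothetical stand-in for audit_sink_set: a small bitmask of sinks.
class audit_sink_set {
    std::uint8_t _bits = 0;
public:
    audit_sink_set& add(audit_sink s) {
        _bits |= static_cast<std::uint8_t>(s);
        return *this;
    }
    bool contains(audit_sink s) const {
        return (_bits & static_cast<std::uint8_t>(s)) != 0;
    }
    bool empty() const { return _bits == 0; }
};

// Hypothetical helper that records entries instead of doing real I/O.
struct sink_helper {
    audit_sink sink;
    std::vector<std::string> written;

    // Self-filter: write only when this helper's sink is a target.
    void write(const std::string& entry, audit_sink_set targets) {
        if (targets.contains(sink)) {
            written.push_back(entry);
        }
    }
};
```

Because every helper receives the same sink set, the caller can hand one event to all helpers and each entry is still written at most once per sink.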
Andrzej Jackowski
32cfa778f7 audit: define audit_rule type with parsing and validation
Audit rules provide more granular control over which
statements are audited, filtering by tables, roles, and
categories. Typos in sink or category names should be
caught at parse time rather than silently disabling rules
at runtime.

Define the audit_rule struct with JSON parsing, validation
of sink and category names, serialization, and fmt support.
Move statement_category, category_set, and
category_to_string out of audit.hh/audit.cc so the rule
type is self-contained.

Refs SCYLLADB-1430
2026-05-20 06:55:14 +02:00
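The parse-time validation described in 32cfa778f7 could look roughly like the sketch below. It is an assumption-laden illustration: JSON parsing is omitted, audit_rule_sketch and require_known are hypothetical names, and the rule carries only sinks and categories. The category names are taken from the original audit commit and the sink names from the "audit" config values.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Names accepted by the sketch (assumed to match the real lists).
constexpr std::array<std::string_view, 6> known_categories{
    "QUERY", "DML", "DDL", "DCL", "AUTH", "ADMIN"};
constexpr std::array<std::string_view, 2> known_sinks{"table", "syslog"};

// Hypothetical, reduced rule: the real audit_rule also has tables and roles.
struct audit_rule_sketch {
    std::vector<std::string> sinks;
    std::vector<std::string> categories;
};

// Throws on the first name that is not in the known list.
template <std::size_t N>
void require_known(std::string_view kind,
                   const std::vector<std::string>& names,
                   const std::array<std::string_view, N>& known) {
    for (const auto& name : names) {
        if (std::find(known.begin(), known.end(), name) == known.end()) {
            throw std::invalid_argument(
                "unknown audit " + std::string(kind) + ": " + name);
        }
    }
}

// Rejects a rule with a misspelled sink or category before it is used.
void validate(const audit_rule_sketch& rule) {
    require_known("sink", rule.sinks, known_sinks);
    require_known("category", rule.categories, known_categories);
}
```

Failing loudly here means a typo such as "sylog" surfaces when the rule is loaded instead of producing a rule that never matches.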
Andrzej Jackowski
3755c370ac audit: assert storage ordering invariants at runtime
Abort if audit storage fails to start rather than silently
running with an unaudited maintenance socket. Also assert
that storage is already stopped when the audit service is
destroyed, documenting the defer-stack ordering requirement.

Refs SCYLLADB-1615
Refs SCYLLADB-1695
2026-04-28 18:58:49 +02:00
Andrzej Jackowski
bc67dd0b82 audit: split startup into construction and storage phases
The table-based audit backend needs Raft to create its keyspace,
but the audit service must exist earlier so that CQL paths don't
silently skip auditing.

Split startup into two phases: construction and storage
initialization.  Queries arriving between the two phases are
logged as errors.

This is a refactoring commit and the split sections will be
moved later in this patch series.

Refs SCYLLADB-1615
2026-04-28 18:58:42 +02:00
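The two-phase startup from bc67dd0b82 can be pictured with the following synchronous sketch. All names (audit_service_sketch, start_storage, the in-memory vectors) are hypothetical; the real service is sharded and asynchronous, and reports errors through its logger.

```cpp
#include <string>
#include <vector>

// Hypothetical model: construction is phase one, start_storage() is phase two.
class audit_service_sketch {
    bool _storage_started = false;
    std::vector<std::string> _written;  // stands in for the storage backend
    std::vector<std::string> _errors;   // stands in for error-level log lines
public:
    // Phase two: called once the backend's prerequisites are available.
    void start_storage() { _storage_started = true; }

    // Entries that arrive between the phases are reported as errors
    // rather than skipped silently.
    void log(const std::string& entry) {
        if (!_storage_started) {
            _errors.push_back("audit storage not started, entry lost: " + entry);
            return;
        }
        _written.push_back(entry);
    }

    const std::vector<std::string>& written() const { return _written; }
    const std::vector<std::string>& errors() const { return _errors; }
};
```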
Botond Dénes
3aced88586 Merge 'audit: decrease allocations / instructions on will_log() fast path' from Marcin Maliszkiewicz
audit::will_log() runs on every CQL/Alternator request. Since
9646ee05bd it constructs three temporary sstrings per call to look up
the audited keyspaces set / tables map with std::string_view keys,
costing ~180 insns/op and 2 allocations if sstring misses SSO.

This series switches the containers to std::less<> comparators to
enable heterogeneous lookup, then drops the sstring temporaries from
will_log().

perf-simple-query --smp 1 --duration 15 --audit "table"
                  --audit-keyspaces "ks-non-existing"
                  --audit-categories "DCL,DDL,AUTH,DML,QUERY"

                commit       insns/op
  baseline      3d0582d51e   36777
  regression    9646ee05bd   36952   (+175)
  this series                36768   (-184, fixed)

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1616
Backport: no, offending commit is not backported

Closes scylladb/scylladb#29565

* github.com:scylladb/scylladb:
  audit: drop sstring temporaries on the will_log() fast path
  audit: enable heterogeneous lookup on audited keyspaces/tables
2026-04-22 15:46:16 +03:00
Marcin Maliszkiewicz
c136b2e640 audit: drop sstring temporaries on the will_log() fast path
audit::will_log() is called for every CQL/Alternator request. With
non-empty keyspace it does:

    _audited_keyspaces.find(sstring(keyspace))
    should_log_table(sstring(keyspace), sstring(table))

constructing three temporary sstrings from the std::string_view
arguments on every call. Now that the underlying associative containers
use std::less<> as comparator (previous commit), find() accepts the
string_view directly. Switch should_log_table() to take string_view as
well so the temporaries disappear entirely.

For short keyspace names the temporaries stay in SSO so allocs/op is
unchanged at 58.1, but each construction still costs ~60 instructions.

perf-simple-query --smp 1 --duration 15 --audit "table"
                  --audit-keyspaces "ks-non-existing"
                  --audit-categories "DCL,DDL,AUTH,DML,QUERY"

build: --mode=release --use-profile="" (no PGO)

Before (regression introduced in 9646ee05bd):
    instructions_per_op: 36952

After:
    instructions_per_op: 36768

Brings insns/op back to the pre-regression baseline 3d0582d51e
(insns/op ~36777) within the per-run noise of ~15 insns standard
deviation, eliminating the ~180 insns/op regression.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1616
2026-04-20 15:18:22 +02:00
Marcin Maliszkiewicz
724b9e66ea audit: enable heterogeneous lookup on audited keyspaces/tables
Replace the bare std::set<sstring>/std::map<sstring, std::set<sstring>>
member types with named aliases that use std::less<> as the comparator.
The transparent comparator enables heterogeneous lookup with
string_view keys.

This commit is a pure refactor with no behavioral change: the parser
return types, constructor parameters, observer template instantiations,
and start_audit() locals are all updated to use the aliases.
2026-04-20 15:14:58 +02:00
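The heterogeneous lookup that 724b9e66ea and c136b2e640 rely on can be sketched with standard containers. This is an illustration, not the actual patch: std::string stands in for Seastar's sstring, and the alias and function names are hypothetical.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <string_view>

// Hypothetical aliases: std::less<> is a transparent comparator, which
// enables the find() overloads that accept keys of other comparable types.
using audited_keyspaces_t = std::set<std::string, std::less<>>;
using audited_tables_t =
    std::map<std::string, std::set<std::string, std::less<>>, std::less<>>;

// Looks up string_view keys directly; no temporary string is built.
bool is_audited(const audited_keyspaces_t& keyspaces,
                const audited_tables_t& tables,
                std::string_view keyspace,
                std::string_view table) {
    if (keyspaces.find(keyspace) != keyspaces.end()) {
        return true;
    }
    const auto it = tables.find(keyspace);
    return it != tables.end() && it->second.find(table) != it->second.end();
}
```

With the default std::less<std::string> comparator, the same find() calls would first have to convert each string_view into a std::string key, which is the per-call cost the series removes.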
Piotr Szymaniak
caaef45b7a audit: restore static_cast for batch inspect
Closes scylladb/scylladb#29545
2026-04-17 23:11:18 +03:00
Piotr Szymaniak
4c93c2af62 audit/alternator: support audit_tables=alternator.<table> shorthand
The real keyspace name of an Alternator table T is "alternator_T".
Expand the "alternator.T" format used in the audit_tables config flag
to the real keyspace name at parse time, so users don't need to spell
out the internal "alternator_T.T" form.
2026-04-15 12:29:15 +02:00
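A minimal sketch of the shorthand expansion from 4c93c2af62 follows. The function name parse_audit_table_entry is hypothetical, and splitting at the first dot is an assumption about how entries are tokenized.

```cpp
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper: splits one "<keyspace>.<table>" entry and expands
// the "alternator.T" shorthand to the real keyspace name "alternator_T".
std::pair<std::string, std::string>
parse_audit_table_entry(std::string_view entry) {
    const auto dot = entry.find('.');
    if (dot == std::string_view::npos || dot == 0 || dot + 1 == entry.size()) {
        throw std::invalid_argument(
            "expected <keyspace>.<table>: " + std::string(entry));
    }
    std::string keyspace(entry.substr(0, dot));
    std::string table(entry.substr(dot + 1));
    if (keyspace == "alternator") {
        keyspace = "alternator_" + table;  // real keyspace of Alternator table
    }
    return {std::move(keyspace), std::move(table)};
}
```

Expanding at parse time keeps the lookup path unchanged: later checks only ever see real keyspace names.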
Piotr Szymaniak
9646ee05bd audit/alternator: Refactor in preparation for auditing Alternator
Prepare the audit API for auditing Alternator.
The API provides externally callable `inspect()` functions
for both CQL and Alternator.
Both variants unpack their parameters and then call a common
`maybe_log()`, which in turn calls `log()` when the conditions
are met.
Also, (const) references are now favoured over raw pointers.

The Alternator audit_info subclass (audit_info_alternator) carries an
optional consistency level — only data read/write operations have a
meaningful CL, while DDL and metadata queries store an empty string
in the audit table and syslog (matching the existing write_login
behavior). The storage helpers are updated accordingly.

Add a will_log(category, keyspace, table) method that checks whether
an operation should be audited (category check AND keyspace/table
filtering) without requiring a constructed audit_info object.
should_log() delegates to will_log().
2026-04-15 11:46:44 +02:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Marcin Maliszkiewicz
19af46d83a audit: replace batch dynamic_cast with static_cast
Since we already know it's a batch, we can use
static_cast now.
2026-01-26 18:14:38 +01:00
Marcin Maliszkiewicz
6f32290756 audit: eliminate dynamic_cast to batch_statement in inspect
The dynamic_cast is costly and unnecessary: a simple
bool flag is enough for this check. The cast burns around
300 CPU instructions on a hot request path.
2026-01-26 10:18:38 +01:00
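The flag-instead-of-dynamic_cast pattern from 02d59a0529, 6f32290756, and 19af46d83a might be sketched as below. The classes are simplified, hypothetical stand-ins (only audit_info, batch_statement, and the batch flag are named in the log), and entries_to_log is invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Simplified stand-in: audit_info remembers whether it describes a batch.
class audit_info {
    std::string _query;
    bool _batch;
public:
    audit_info(std::string query, bool batch)
        : _query(std::move(query)), _batch(batch) {}
    bool batch() const { return _batch; }
    const std::string& query() const { return _query; }
};

struct statement {
    virtual ~statement() = default;
    audit_info info;
    explicit statement(audit_info i) : info(std::move(i)) {}
};

struct batch_statement : statement {
    int inner_count;
    explicit batch_statement(int n)
        : statement(audit_info("BATCH", true)), inner_count(n) {}
};

// Hypothetical helper: number of audit entries a statement produces.
int entries_to_log(const statement& stmt) {
    if (stmt.info.batch()) {
        // The flag is set only by batch_statement, so the static_cast is
        // safe; debug builds double-check it with a dynamic_cast.
        assert(dynamic_cast<const batch_statement*>(&stmt) != nullptr);
        return static_cast<const batch_statement&>(stmt).inner_count;
    }
    return 1;
}
```

The design depends on the invariant that the flag always agrees with the dynamic type; the assert documents that invariant without costing anything in release builds.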
Marcin Maliszkiewicz
a93ad3838f audit: cql: remove create_no_audit_info
We don't need a special guard value: it is
only filled in for batch statements, for
which we can simply ignore the value.

Not having a special value allows us to return
fast when audit is not enabled.
2026-01-26 10:18:38 +01:00
Marcin Maliszkiewicz
02d59a0529 audit: add batch bool to audit_info class
In the following commit we'll use this field
instead of a costly dynamic_cast when emitting
the audit log.
2026-01-26 10:18:38 +01:00
Dario Mirovic
afca230890 audit: write out to both table and syslog
This patch adds support for multiple audit log outputs.
If only one audit log output is enabled, the behavior does not change.
If multiple audit log outputs are enabled, then the
`audit_composite_storage_helper` class is used. It has a collection
of `storage_helper` objects.

Fixes #26022
2025-11-10 00:31:30 +01:00
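The composite arrangement described in afca230890 resembles the sketch below. Only the names storage_helper, create_storage_helper, and the idea of a composite helper come from the log; the synchronous write() signature, recording_helper, and the use of shared_ptr are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified interface; the real helpers are asynchronous.
struct storage_helper {
    virtual ~storage_helper() = default;
    virtual void write(const std::string& entry) = 0;
};

// Hypothetical in-memory helper, useful for demonstrating fan-out.
struct recording_helper : storage_helper {
    std::vector<std::string> entries;
    void write(const std::string& entry) override { entries.push_back(entry); }
};

// Forwards every entry to each helper it owns.
class composite_storage_helper : public storage_helper {
    std::vector<std::shared_ptr<storage_helper>> _helpers;
public:
    explicit composite_storage_helper(
        std::vector<std::shared_ptr<storage_helper>> helpers)
        : _helpers(std::move(helpers)) {}

    void write(const std::string& entry) override {
        for (auto& helper : _helpers) {
            assert(helper != nullptr);
            helper->write(entry);
        }
    }
};

// With exactly one output the helper is used directly, so single-output
// behavior does not change; otherwise the outputs are wrapped.
std::shared_ptr<storage_helper>
create_storage_helper(std::vector<std::shared_ptr<storage_helper>> enabled) {
    if (enabled.size() == 1) {
        return enabled.front();
    }
    return std::make_shared<composite_storage_helper>(std::move(enabled));
}
```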
Dario Mirovic
c3a673d37f audit: move storage helper creation from audit::start to audit::audit
Extract storage helper creation into `create_storage_helper` function.
Call this function from `audit::audit`. It will be called per shard inside
the `sharded<audit>::start` method.

Refs #26022
2025-11-06 03:05:43 +01:00
Dario Mirovic
28c1c0f78d audit: fix formatting in audit::start_audit
Refs #26022
2025-11-06 03:05:17 +01:00
Dario Mirovic
549e6307ec audit: unify create_audit and start_audit
There is no need to have `create_audit` separate from `start_audit`.
`create_audit` just stores the passed parameters, while `start_audit`
does the actual initialization and startup work.

Refs #26022
2025-11-06 03:05:06 +01:00
Dario Mirovic
666364f651 audit: introduce debug level logs on happy path
The audit component defines an `audit` logger, which it uses only for `error`
and `info` logs about `audit` module initialization and errors during audit
log writing.
This change introduces `debug` level logs on the happy path of audit log writes.

Ref: scylladb/scylladb#23773
2025-06-27 16:27:27 +02:00
Andrzej Jackowski
5651cc49ed audit: make categories, tables, and keyspaces liveupdatable
This change:
 - Set liveness::LiveUpdate for audit_categories, audit_tables,
   and audit_keyspaces
 - Keep const reference to db::config in audit, so current config values
   can be obtained by audit implementation
 - Implement function audit::update_config to parse the given string,
   update audit data structures when needed, and log the changes.
 - Add observers to call audit::update_config when categories,
   tables, or keyspaces configuration changes

Fixes scylladb/scylla-enterprise#1789
2025-01-27 11:37:13 +01:00
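The observer wiring from 5651cc49ed can be approximated as follows. This is a sketch under stated assumptions: observable_value is a hypothetical stand-in for the real config value type, parse_keyspaces and audit_sketch are invented names, and only the keyspaces setting is modelled.

```cpp
#include <functional>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical config value that notifies observers when it changes.
template <typename T>
class observable_value {
    T _value;
    std::vector<std::function<void(const T&)>> _observers;
public:
    explicit observable_value(T value) : _value(std::move(value)) {}
    const T& get() const { return _value; }
    void observe(std::function<void(const T&)> callback) {
        _observers.push_back(std::move(callback));
    }
    void set(T value) {
        _value = std::move(value);
        for (auto& callback : _observers) {
            callback(_value);
        }
    }
};

// Parses a comma-separated list, trimming spaces and skipping empty items.
std::set<std::string> parse_keyspaces(const std::string& csv) {
    std::set<std::string> result;
    std::istringstream in(csv);
    std::string item;
    while (std::getline(in, item, ',')) {
        const auto first = item.find_first_not_of(" \t");
        if (first == std::string::npos) {
            continue;
        }
        const auto last = item.find_last_not_of(" \t");
        result.insert(item.substr(first, last - first + 1));
    }
    return result;
}

// Re-parses its keyspace set whenever the config value changes.
// The object must outlive the config value's callbacks, so it is non-copyable.
class audit_sketch {
    std::set<std::string> _keyspaces;
public:
    explicit audit_sketch(observable_value<std::string>& config)
        : _keyspaces(parse_keyspaces(config.get())) {
        config.observe([this](const std::string& value) {
            _keyspaces = parse_keyspaces(value);
        });
    }
    audit_sketch(const audit_sketch&) = delete;
    audit_sketch& operator=(const audit_sketch&) = delete;

    bool audits(const std::string& keyspace) const {
        return _keyspaces.count(keyspace) != 0;
    }
};
```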
Andrzej Jackowski
5d4eb5d2dc audit: move static parsing functions above audit constructors
This change:
 - Swap static function and audit constructors in audit.cc

This is a preparatory commit for enabling liveupdate of audit
categories, tables, and keyspaces. It allows future use of static
parsing functions in audit constructor.
2025-01-27 11:35:35 +01:00
Andrzej Jackowski
609d7b2725 audit: move statement_category to string conversion to static function
This change:
 - Move audit_info::category_string to a new static function
 - Start using the new function in audit_info::category_string

This is a preparatory commit for enabling liveupdate of audit
categories, tables, and keyspaces. The newly created static function
will be required for proper logging of audit categories.
2025-01-27 11:35:35 +01:00
Andrzej Jackowski
99b4a79df0 audit: start audit even with empty categories/tables/keyspaces
This change:
 - Remove code that prevented audit from starting if audit_categories,
   audit_tables, and audit_keyspaces are not configured

This is a preparatory commit for enabling liveupdate of audit
categories, tables, and keyspaces. Without this change, audit is
not started for some categories/tables/keyspaces settings, which is
unwanted behavior if a customer can change the audit configuration
via liveupdate.

This commit has performance implications if audit sink is set (meaning
"audit"="table" or "audit"="syslog" in the config) but categories,
tables, and keyspaces are not set to audit anything. Before this commit,
audit was not started, so some operations (like creating audit_info or
lookup in empty collections) were omitted.
2025-01-27 11:35:35 +01:00
Paweł Zakrzewski
384641194a audit: Add the audit subsystem
This change introduces a new audit subsystem that allows tracking and logging of database operations for security and compliance purposes. Key features include:

- Configurable audit logging to either syslog or a dedicated system table (audit.audit_log)
- Selective auditing based on:
  - Operation categories (QUERY, DML, DDL, DCL, AUTH, ADMIN)
  - Specific keyspaces
  - Specific tables
- New configuration options:
  - audit: Controls audit destination (none/syslog/table)
  - audit_categories: Comma-separated list of operation categories to audit
  - audit_tables: Specific tables to audit
  - audit_keyspaces: Specific keyspaces to audit
  - audit_unix_socket_path: Path for syslog socket
  - audit_syslog_write_buffer_size: Buffer size for syslog writes

The audit logs capture details including:
- Operation timestamp
- Node and client IP addresses
- Operation category and query
- Username
- Success/failure status
- Affected keyspace and table names
2025-01-15 11:10:35 +01:00