79 Commits

Author SHA1 Message Date
Eliran Sinvani
e0c7178e75 query_processor: remove default internal query caching behavior
When executing internal queries, it is important that the developer
decides whether to cache the query internally, since internal
queries are cached indefinitely. It is equally important that the
programmer is aware of whether caching will happen.
The code contained two "groups" of `query_processor::execute_internal`,
one group has caching by default and the other doesn't.
Here we add overloads to eliminate default values for caching behaviour,
forcing an explicit parameter for the caching values.
All the call sites were changed to reflect the original caching default
that was there.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2022-05-01 08:33:55 +03:00
Eliran Sinvani
38b7ebf526 query_processor: make execute_internal caching parameter more verbose
`execute_internal` has a parameter to indicate if caching a prepared
statement is needed for a specific call. However this parameter was a
boolean, so it was easy to miss its meaning in the various call sites.
This replaces the parameter type with a more verbose one so it is clear
from the call site what decision was made.
2022-05-01 08:33:55 +03:00
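To illustrate the two `execute_internal` commits above (e0c7178e75 and 38b7ebf526), here is a minimal sketch of a call that takes an explicit, named caching argument with no default. The enum `cache_internal` and the toy `execute_internal` below are hypothetical stand-ins; the real `query_processor` signatures are not shown in this log.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the "more verbose" parameter type; the actual
// type used by query_processor is not shown in this log.
enum class cache_internal { no, yes };

// Toy replacement for execute_internal: it only reports the caching decision.
// There is deliberately no default value for `cache`, so every call site has
// to state its choice.
std::string execute_internal(std::string_view query, cache_internal cache) {
    std::string result(query);
    result += (cache == cache_internal::yes) ? " [cached]" : " [not cached]";
    return result;
}
```

A call such as `execute_internal("SELECT ...", cache_internal::yes)` reads unambiguously, whereas a bare `true` in the same position would not say what it controls.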
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Avi Kivity
d768e9fac5 cql3, related: switch to data_dictionary
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.

data_dictionary::database::real_database() is called from several
places, for these reasons:

 - calling yet-to-be-converted code
 - callers with a legitimate need to access data (e.g. system_keyspace)
   but with the ::database accessor removed from query_processor.
   We'll need to find another way to supply system_keyspace with
   data access.
 - to gain access to the wasm engine for testing whether user
   defined functions compile. We'll have to find another way to
   do this as well.

The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.

Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
2021-12-15 13:54:23 +02:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
e0749d6264 treewide: some random header cleanups
Eliminate unused includes and replace some more includes
with forward declarations where appropriate.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-06-06 19:18:49 +03:00
Piotr Sarna
c5214eb096 treewide: remove timeout config from query options
Timeout config is now stored in each connection, so there's no point
in tracking it inside each query as well. This patch removes
timeout_config from query_options and follows by removing now
unnecessary parameters of many functions and constructors.
2021-02-25 17:20:27 +01:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
the `count` function was often used in various ways.

`contains` not only expresses the intent of the code better but also
does it in a more unified way.

This commit replaces all the occurrences of `count` with
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Rafael Ávila de Espíndola
a4916ce553 auth: Turn DEFAULT_USER_NAME into a std::string_view variable
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-08-04 16:40:00 -07:00
Rafael Ávila de Espíndola
61de1fe752 auth: Turn SALTED_HASH into a std::string_view variable
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-08-04 16:40:00 -07:00
Rafael Ávila de Espíndola
cb4c3e45d5 auth: Turn meta::roles_table::qualified_name into a std::string_view variable
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-08-04 16:40:00 -07:00
Rafael Ávila de Espíndola
27c2b3de30 auth: Turn password_authenticator_name into a std::string_view variable
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-08-04 16:40:00 -07:00
Rafael Ávila de Espíndola
400212e81f auth: Convert sstring variables in common.hh to constexpr std::string_view
This converts the following variables:
DEFAULT_SUPERUSER_NAME AUTH_KS USERS_CF AUTH_PACKAGE_NAME

Since they are now constexpr they will not be part of any
initialization order problems.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-07-03 12:35:58 -07:00
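As a rough sketch of why the `constexpr` conversion in 400212e81f sidesteps initialization order problems: a `constexpr std::string_view` is constant-initialized, so it can even be inspected at compile time. The namespace `example_auth` and the string values below are illustrative assumptions; only the variable names come from the commit message.

```cpp
#include <string_view>

// Hypothetical namespace and values; only the names AUTH_KS and USERS_CF are
// taken from the commit message.
namespace example_auth {
constexpr std::string_view AUTH_KS = "system_auth";
constexpr std::string_view USERS_CF = "users";
}

// Constant-initialized values are usable at compile time, so no dynamic
// initializer in another translation unit can observe them half-built.
static_assert(example_auth::AUTH_KS.size() == 11);
```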
Avi Kivity
88ade3110f treewide: replace calls to engine().some_api() with some_api()
This removes the need to include reactor.hh, a source of compile
time bloat.

In some places, the call is qualified with seastar:: in order
to resolve ambiguities with a local name.

Includes are adjusted to make everything compile. We end up
having 14 translation units including reactor.hh, primarily for
deprecated things like reactor::at_exit().

Ref #1
2020-04-05 12:46:04 +03:00
Rafael Ávila de Espíndola
eca0ac5772 everywhere: Update for deprecated apply functions
Now apply is only for tuples, for varargs use invoke.

This depends on the seastar changes adding invoke.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200324163809.93648-1-espindola@scylladb.com>
2020-03-25 08:49:53 +02:00
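The tuple-versus-varargs split described in eca0ac5772 mirrors the one in the C++17 standard library. The sketch below uses `std::apply` and `std::invoke` purely as an analogy; it does not reproduce the Seastar functions, whose exact signatures are not shown in this log.

```cpp
#include <functional>
#include <tuple>

int add(int a, int b) { return a + b; }

// invoke: the arguments are passed individually (varargs style).
int call_with_varargs() { return std::invoke(add, 2, 3); }

// apply: the arguments are packed in a tuple and unpacked for the call.
int call_with_tuple() { return std::apply(add, std::make_tuple(2, 3)); }
```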
Rafael Ávila de Espíndola
c29f8caafc auth: Return a string_view from authenticator::qualified_java_name
This gives more flexibility to the implementations as they now don't
need to construct a sstring.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-28 11:32:36 -08:00
Rafael Ávila de Espíndola
e670dfc0cd auth: Fix static initialization order problem
A static constructor was used to initialize update_row_query. That
constructor would call meta::roles_table::qualified_name() which would
access AUTH_KS which is also initialized by a static constructor in
another file, so the construction order is not guaranteed.

This change turns update_row_query into a function with a static local
variable in it. The static local is initialized at first use, fixing
the problem.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200227163916.19761-1-espindola@scylladb.com>
2020-02-28 07:57:13 +02:00
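The fix in e670dfc0cd can be sketched as follows. This is a simplified illustration, not the actual code: `auth_keyspace()` is a hypothetical stand-in for the value owned by another file, and the query text is made up for the example.

```cpp
#include <string>

// Hypothetical stand-in for a value that another translation unit would
// otherwise initialize with a static constructor.
const std::string& auth_keyspace() {
    static const std::string ks = "system_auth";
    return ks;
}

// Before: a namespace-scope object such as
//     static const std::string update_row_query = ...;
// is constructed at an unspecified point relative to statics in other files.
//
// After: the value lives in a function-local static, which is constructed
// the first time the function is called (the query text is illustrative).
const std::string& update_row_query() {
    static const std::string query =
        "UPDATE " + auth_keyspace() + ".roles SET salted_hash = ? WHERE role = ?";
    return query;
}
```

Every call returns a reference to the same object, so callers pay the construction cost once.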
Konstantin Osipov
93db4d748c query_processor: fold one execute_internal() into another.
All internal execution always uses query text as a key in the
cache of internal prepared statements. There is no need
to publish API for executing an internal prepared statement object.

The folded execute_internal() calls an internal prepare() and then
internal execute().
execute_internal(cache=true) does exactly that.
2020-02-12 16:44:12 +03:00
Avi Kivity
632c7c303a Merge "auth: Restructure SASL code" from Jesse
"
This series restructures the SASL code that was previously internal
to the `password_authenticator` so that it can be used in other contexts.
"

* 'jhk/restructure_sasl/v1' of https://github.com/hakuch/scylla:
  auth: Rename SASL challenge class for "PLAIN"
  auth: Make a ctor `explicit`
  auth: Move `sasl_challenge` to its own file
  auth: Decouple SASL code from its parent class
2019-02-28 10:19:41 +02:00
Jesse Haber-Kucharsky
f2d92f81e8 auth: Report a more specific error with bad creds
Without this change, the resulting error message for an invalid password
is "authentication failed".

With this change, we report "Username and/or password are incorrect".

Fixes #4285

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <32d00be8af5075ee10d2c14f85b76843a9adac10.1551306914.git.jhaberku@scylladb.com>
2019-02-28 09:53:57 +02:00
Jesse Haber-Kucharsky
3d883e8cf2 auth: Rename SASL challenge class for "PLAIN" 2019-02-27 18:36:58 -05:00
Jesse Haber-Kucharsky
dc41f1098b auth: Move sasl_challenge to its own file
This will allow authenticators other than
`password_authenticator` to make use of the PLAIN SASL
authentication code.
2019-02-27 18:36:52 -05:00
Jesse Haber-Kucharsky
2d59fa6be9 auth: Decouple SASL code from its parent class
This way, we can (in the future) use this implementation of the SASL
"PLAIN" mechanism in contexts other than `password_authenticator`.
2019-02-27 18:11:31 -05:00
Avi Kivity
da9628c6dc auth: password_authenticator: protect against NULL salted_hash
In case salted_hash was NULL, we'd access uninitialized memory when dereferencing
the optional in get_as<>().

Protect against that by using get_opt() and failing authentication if we see a NULL.

Fixes #4168.

Tests: unit (release)
Branches: 3.0, 2.3
Message-Id: <20190211173820.8053-1-avi@scylladb.com>
2019-02-11 18:54:03 +01:00
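The guard described in da9628c6dc amounts to checking the optional before dereferencing it. A minimal sketch, assuming a hypothetical `role_row` type and a plain string comparison in place of real password verification:

```cpp
#include <optional>
#include <string>

// Hypothetical row type: the salted_hash column may be NULL, which is
// modelled here as an empty optional.
struct role_row {
    std::optional<std::string> salted_hash;
};

// Toy check: a real implementation would verify a password against the hash;
// a plain comparison stands in for that step.
bool authenticate(const role_row& row, const std::string& candidate_hash) {
    if (!row.salted_hash) {
        return false; // NULL column: fail authentication, never dereference
    }
    return *row.salted_hash == candidate_hash;
}
```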
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Avi Kivity
c3ef99f84f schema_tables: remove #include of database.hh
Distribute in source files (and one header - table_helper.hh) that need it.
2019-01-05 15:43:07 +02:00
Avi Kivity
30745eeb72 query_processor: replace sharded<database> with the local shard
query_processor uses storage_proxy to access data, and the local
database object to access replicated metadata. While it seems strange
that the database object is not used to access data, it is logical
when you consider that a sharded<database> only contains this node's
data, not the cluster data.

Take advantage of this to replace sharded<database> with a single database
shard.
2018-12-29 11:02:15 +02:00
Piotr Sarna
7b0a3fbf8a auth: add abort_source to waiting for schema agreement
When the auth service is requested to stop during bootstrap,
it might have still not reached schema agreement.
Currently, waiting for this agreement is done in an infinite loop,
without taking abort_source into account.
This patch introduces checking if abort was requested
and breaking the loop in such case, so auth service can terminate.

Tests:
unit (release)
dtest (bootstrap_test.py:TestBootstrap.shutdown_wiped_node_cannot_join_test)
Message-Id: <1b7ded14b7c42254f02b5d2e10791eb767aae7fc.1543914769.git.sarna@scylladb.com>
2018-12-04 10:41:09 +00:00
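The loop change in 7b0a3fbf8a can be pictured with the sketch below. It is an assumption-laden simplification: a `std::atomic<bool>` flag stands in for Seastar's `abort_source`, the `agreed` callback is hypothetical, and the real code would wait between checks rather than spin.

```cpp
#include <atomic>
#include <functional>

// Returns true once agreement is reached, or false if an abort was requested
// first. `agreed` and `abort_requested` are hypothetical stand-ins.
bool wait_for_schema_agreement(const std::function<bool()>& agreed,
                               const std::atomic<bool>& abort_requested) {
    while (!agreed()) {
        if (abort_requested.load()) {
            return false; // break out so the service can terminate
        }
    }
    return true;
}
```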
Avi Kivity
eb74fe784d auth: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Jesse Haber-Kucharsky
9d27045c76 auth: Shorten random_device instance life-span
On Fedora 28, creating an instance of `std::random_device` opens a file
descriptor for `/dev/urandom` (observed via `strace`).

By declaring static thread-local instances of `std::random_device`,
these descriptors will be open (barring optimization by the compiler)
for the entire duration of the Scylla process's life.

However, the `std::random_device` instance is only necessary for
initializing the `RandomNumberEngine` for generating salts. With this
change, the file-descriptor is closed immediately after the engine is
initialized.

I considered generalizing this pattern of initialization into a
function, but with only two uses (and simple ones) I think this would
only obscure things.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Tests: unit (release)
Message-Id: <f1b985d99f66e5e64d714fd0f087e235b71557d2.1536697368.git.jhaberku@scylladb.com>
2018-09-12 12:14:21 +01:00
Jesse Haber-Kucharsky
52d3ff057a auth: Allow different random engines for salt
This makes the function useable in more contexts due to
flexibility (including in tests), since the state is not captured and
the characteristics of salt generation can be customized to the caller's
needs.
2018-08-13 13:24:45 -04:00
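A sketch of what accepting a caller-supplied engine (52d3ff057a) might look like. The function name `generate_salt` appears in this log, but the signature, alphabet, and default length below are assumptions made for illustration.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical salt generator: the caller supplies any random number engine,
// so no generator state is captured inside the function.
template <typename RandomNumberEngine>
std::string generate_salt(RandomNumberEngine& engine, std::size_t length = 16) {
    // 64 printable characters; the trailing '\0' is excluded from the range.
    static constexpr char alphabet[] =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ./";
    std::uniform_int_distribution<std::size_t> pick(0, sizeof(alphabet) - 2);
    std::string salt;
    salt.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        salt.push_back(alphabet[pick(engine)]);
    }
    return salt;
}
```

A test can pass a fixed-seed `std::mt19937` to get reproducible salts, while production code can pass an engine seeded from `std::random_device`.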
Jesse Haber-Kucharsky
fd60d61ebf auth: Split out test for best supported scheme
The `generate_salt` function invokes this function internally now.

This change means that `generate_salt` is now thread-safe and therefore
does not have to be invoked by a single thread only when starting the
`password_authenticator`.

This further means that `generate_salt` does not need to be part of the
public interface of the module, and can be moved to the implementation
file.
2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky
adf058bd1f auth: Rename function to use full words 2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky
b272d622f8 auth: Move password stuff to its own namespace
For clarity and nicer function names.
2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky
2a40bcb281 auth: Move password handling to its own files
While the `password_authenticator` is a complex component with lots of
dependencies, password hashing and checking itself is a process with
limited logical state and dependencies, which makes it easy to isolate
and test.
2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky
03cf57db62 auth: Construct std::random_device instances once
`std::random_device` has a lot of implementation-specific behavior, and
as a result we cannot assume much about its performance characteristics.

We initialize thread-specific static instances of `std::random_device`
once so that we don't have the overhead of invoking the ctor during
every invocation of `gensalt`.
2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky
fce10f2c6e auth: Don't use unsupported hashing algorithms
In previous versions of Fedora, the `crypt_r` function returned
`nullptr` when a requested hashing algorithm was not supported.

This is consistent with the documentation of the function in its man
page.

As of Fedora 28, the function's behavior changes so that the encrypted
text is not `nullptr` on error, but instead the string "*0".

The info pages for `crypt_r` clarify somewhat (and contradict the man
pages):

    Some implementations return `NULL` on failure, and others return an
    _invalid_ hashed passphrase, which will begin with a `*` and will
    not be the same as SALT.

Because of this change of behavior, users running Scylla on a Fedora 28
machine which was upgraded from a previous release would not be able to
authenticate: an unsupported hashing algorithm would be selected,
producing encrypted text that did not match the entry in the table.

With this change, unsupported algorithms are correctly detected and
users should be able to continue to authenticate themselves.

Fixes #3637.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <bcd708f3ec195870fa2b0d147c8910fb63db7e0e.1533322594.git.jhaberku@scylladb.com>
2018-08-05 08:57:36 +03:00
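The detection described in fce10f2c6e has to cover both failure conventions quoted from the info pages. A minimal sketch, with `crypt_failed` as a hypothetical helper name that does not call `crypt_r` itself:

```cpp
// Returns true when `hashed` looks like a failure result: either a null
// pointer (older behavior) or an invalid hash beginning with '*', such as
// "*0" (behavior observed as of Fedora 28).
bool crypt_failed(const char* hashed) {
    return hashed == nullptr || hashed[0] == '*';
}
```

A caller probing for a supported scheme would treat either outcome as "algorithm not available" and move on to the next candidate.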
Jesse Haber-Kucharsky
e664f9b0c6 Use finite time-outs for internal auth. queries 2018-07-31 11:38:16 -04:00
Nadav Har'El
25bd139508 cross-tree: clean up use of std::random_device()
std::random_device() uses the relatively slow /dev/urandom, and we rarely if
ever intend to use it directly - we normally want to use it to seed a faster
random_engine (a pseudo-random number generator).

In many places in the code, we first created a random_device variable, and then
using it created a random_engine variable. However, this practice created the
risk of a programmer accidentally using the random_device object, instead of the
random_engine object, because both have the same API; This hurts performance.

This risk materialized in just two places in the code, utils/uuid.cc and
gms/gossiper.cc. A patch for uuid.cc was sent previously by Pawel and is
not included in this patch, and the fix for gossiper.{cc,hh} is included here.

To avoid risking the same mistake in the future, this patch switches across the
code to an idiom where the random_device object is not *named*, so cannot be
accidentally used. We use the following idiom:

   std::default_random_engine _engine{std::random_device{}()};

Here std::random_device{}() creates the random device (/dev/urandom) and pulls
a random integer from it. It then uses this seed to create the random_engine
(the pseudo-random number generator). The std::random_device{} object is
temporary and unnamed, and cannot be unintentionally used directly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180726154958.4405-1-nyh@scylladb.com>
2018-07-26 16:54:58 +01:00
Avi Kivity
187ebdbe46 auth: fix possible use of disengaged optional in has_salted_hash()
untyped_result_set_row's cell data type is bytes_opt, and the
get_blob() accessor accesses the value assuming it's engaged
(relying on the caller to call has()).

has_salted_hash() calls get_blob() without calling has() beforehand,
potentially triggering undefined behavior.

Fix by using get_or() instead, which also simplifies the caller.

I observed failures in Jenkins in this area. It's hard to be sure
this is the root cause, since the failures triggered an internal
consistency assertion in asan rather than an asan report. However,
the error is hard to reproduce and the fix makes sense even if it
doesn't prevent the error.

See #3480 for the asan error.

Fixes #3480 (hopefully).
Message-Id: <20180602181919.29204-1-avi@scylladb.com>
2018-06-02 19:46:32 +01:00
Avi Kivity
a99e820bb9 query_processor: require clients to specify timeout configuration
Remove implicit timeouts and replace with caller-specified timeouts.
This allows removing the ambiguity about what timeout a statement is
executed with, and allows removing cql_statement::execute_internal(),
which mostly overrode timeouts and consistency levels.

Timeout selection is now as follows:

  query_processor::*_internal: infinite timeout, CL=ONE
  query_processor::process(), execute(): user-specified consistency level and timeout

All callers were adjusted to specify an infinite timeout. This can be
further adjusted later to use the "other" timeout for DCL and the
read or write timeout (as needed) for authentication in the normal
query path.

Note that infinite timeouts don't mean that the query will hang; as
soon as the failure detector decides that the node is down, RPC
responses will terminate with a failure and the query will fail.
2018-05-14 09:41:06 +03:00
Jesse Haber-Kucharsky
cd0553ca6a auth: Query custom options from the authenticator
None of the `authenticator` implementations we have support custom
options, but we should support this operation to support the relevant
CQL statements.
2018-05-09 21:12:50 -04:00
Jesse Haber-Kucharsky
00f7bc676d auth: Remove ordering dependence
If `auth::password_authenticator` also creates `system_auth.roles` and
we fix the existence check for the default superuser in
`auth::standard_role_manager` to only search for the columns that it
owns (instead of the column itself), then both modules' initialization
are independent of one another.

Fixes #3319.
2018-03-25 22:38:11 -04:00
Jesse Haber-Kucharsky
881656cea4 auth: Wait for schema agreement
Some modules of `auth` create a default superuser if it does not already
exist.

The existence check is through a SELECT query with quorum consistency
level. If the schema for the applicable tables has not yet propagated to
a peer node at the time that it processes this query, then the
`storage_proxy` will print an error message to the log and the query
will be retried.

Eventually, the schema will propagate and the default superuser will be
created. However, the error message in the log causes integration tests
to fail (and is somewhat annoying).

Now, prior to querying for existing data, we wait for all gossip peers
to have the same schema version as we do.

Fixes #2852.
2018-03-25 22:38:08 -04:00
Jesse Haber-Kucharsky
9117a689cf auth: Fix const correctness
This patch came about because of an important (and obvious, in
hindsight) realization: instances of the authorizer, role manager, and
authenticator are clients for access-control state and not the state
itself. This is reflected directly in Scylla: `auth::service` is
sharded across cores and this is possible because each instance queries
and modifies the same global state.

To give more examples, the value of an instance of `std::vector<int>` is
the structure of the container and its contents. The value of `int
file_descriptor` is an identifier for state maintained elsewhere.

Having watched an excellent talk by Herb Sutter [1] and having read an
informative blog post [2], it's clear that a member function marked
`const` communicates that the observable state of the instance is not
modified.

Thus, the member functions of the role-manager, authenticator, and
authorizer clients should not be marked `const` only if the state of the
client itself is observably changed. By this principle, member functions
which do not change the state of the client, but which mutate the global
state the client is associated with (for example, by creating a role)
are marked `const`.

The `start` (and `stop`) functions of the client have the dual role of
initializing (finalizing) both the local client state and the
external state; they are not marked `const`.

[1] https://herbsutter.com/2013/01/01/video-you-dont-know-const-and-mutable/

[2] http://talesofcpp.fusionfenix.com/post-2/episode-one-to-be-or-not-to-be-const
2018-03-14 01:32:43 -04:00
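The principle in 9117a689cf can be shown with a small sketch. The types `role_store` and `role_manager_client` are hypothetical and much simpler than the real `auth` classes; they only demonstrate which member functions end up `const`.

```cpp
#include <set>
#include <string>

// Hypothetical external state, standing in for access-control data that
// lives outside any one client instance.
struct role_store {
    std::set<std::string> roles;
};

// Hypothetical client for that state.
class role_manager_client {
    role_store& _store;     // reference to shared state, not owned
    bool _started = false;  // local client state
public:
    explicit role_manager_client(role_store& store) : _store(store) {}

    // Changes the client's own observable state, so it is not const.
    void start() { _started = true; }

    // Mutates only the external state the client refers to, so it is const.
    void create_role(const std::string& name) const { _store.roles.insert(name); }

    bool started() const { return _started; }
};
```

Even a `const role_manager_client` can create roles, because doing so leaves the client object itself unchanged.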
Jesse Haber-Kucharsky
fbc97626c4 auth: Migrate legacy data on boot
This change allows for seamless migration of the legacy users metadata
to the new role-based metadata tables. This process is summarized in
`docs/migrating-from-users-to-roles.md`.

In general, if any nondefault metadata exists in the new tables, then
no migration happens. If, in this case, legacy metadata still exists
then a warning is written to the log.

If no nondefault metadata exists in the new tables and the legacy tables
exist, then each node will copy the data from the legacy tables to the
new tables, performing transformations as necessary. An informational
message is written to the log when the migration process starts, and
when the process ends. During the process of copying, data is
overwritten so that multiple nodes racing to migrate data do not
conflict.

Since Apache Cassandra's auth. schema uses the same table for managing
roles and authentication information, some useful functions in
`roles-metadata.hh` have been added to avoid code duplication.

Because a superuser should be able to drop the legacy users tables from
`system_auth` once the cluster has migrated to roles and is functioning
correctly, we remove the restriction on altering anything in the
"system_auth" keyspace. Individual tables in `system_auth` are still
protected later in the function.

When a cluster is upgrading from one that does not support roles to one
that does, some nodes will be running old code which accesses old
metadata and some will be running new code which accesses new metadata.

With the help of the gossiper `feature` mechanism, clients connecting to
upgraded nodes will be notified (through code in the relevant CQL
statements) that modifications are not allowed until the entire cluster
has upgraded.
2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky
cf5f6aa4c5 auth: Fix fragile variable life-times
According to the Seastar convention, a parameter passed to a function
taking a reference parameter must live for the duration of the execution
of the returned future.

When possible, variables are statically allocated. When this is not
possible, we use `do_with`.
2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky
45631604b0 auth: Use string_view for parameters 2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky
c4f686c10f auth: Put definitions inside namespace 2018-02-14 14:15:59 -05:00