scylladb

Author	SHA1	Message	Date
Avi Kivity	eb74fe784d	auth: convert sprint() to format() sprint() recently became more strict, throwing on sprint("%s", 5). Replace with the more modern format(). Mechanically converted with https://github.com/avikivity/unsprint.	2018-11-01 13:16:17 +00:00
Duarte Nunes	e46ef6723b	Merge seastar upstream * seastar d152f2d...c1e0e5d (6): > scripts: perftune.py: properly merge parameters from the command line and the configuration file > fmt: update to 5.2.1 > io_queue: only increment statistics when request is admitted > Adds `read_first_line.cc` and `read_first_line.hh` to CMake. > fstream: remove default extent allocation hint > core/semaphore: Change the access of semaphore_units main ctor Due to a compile-time fight between fmt and boost::multiprecision, a lexical_cast was added to mediate. sprint("%s", var) no longer accepts numeric values, so some sprint()s were converted to format() calls. Since more may be lurking we'll need to remove all sprint() calls. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-10-25 12:53:30 +03:00
Jesse Haber-Kucharsky	9d27045c76	auth: Shorten `random_device` instance life-span On Fedora 28, creating an instance of `std::random_device` opens a file descriptor for `/dev/urandom` (observed via `strace`). By declaring static thread-local instances of `std::random_device`, these descriptors will be open (barring optimization by the compiler) for the entire duration of the Scylla process's life. However, the `std::random_device` instance is only necessary for initializing the `RandomNumberEngine` for generating salts. With this change, the file-descriptor is closed immediately after the engine is initialized. I considered generalizing this pattern of initialization into a function, but with only two uses (and simple ones) I think this would only obscure things. Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Tests: unit (release) Message-Id: <f1b985d99f66e5e64d714fd0f087e235b71557d2.1536697368.git.jhaberku@scylladb.com>	2018-09-12 12:14:21 +01:00
Jesse Haber-Kucharsky	682805b22c	auth: Use finite time-out for all QUORUM reads Commit `e664f9b0c6` transitioned internal CQL queries in the auth. sub-system to be executed with finite time-outs instead of infinite ones. It should have also modified the functions in `auth/roles-metadata.cc` to have finite time-outs. This change fixes some previously failing dtests, particularly around repair. Without this change, the QUORUM query fails to terminate when the necessary consistency level cannot be achieved. Fixes #3736. Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <e244dc3e731b4019f3be72c52a91f23ee4bb68d1.1536163859.git.jhaberku@scylladb.com>	2018-09-05 21:55:26 +03:00
Jesse Haber-Kucharsky	b95bbb2e72	auth: Clean up implementation comments	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	9519a03351	auth: Remove unnecessary local variable The variable could be declared `const`, but removing it outright seems more clear and this way we don't have to come up with a name.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	52d3ff057a	auth: Allow different random engines for salt This makes the function useable in more contexts due to flexibility (including in tests), since the state is not captured and the characteristics of salt generation can be customized to the caller's needs.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	836fd954e1	auth: Correct modulo bias in salt generation Instead of reducing the large value via `%`, which can produce non-uniformly distributed values when the range is small, we specify the range in the distribution, which is uniform by construction.	2018-08-13 13:24:45 -04:00
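The modulo-bias fix described in 836fd954e1 (together with the engine-agnostic signature from 52d3ff057a) can be sketched as follows. This is a minimal illustration and not the code from the Scylla tree: the function name `random_salt_chars` and the alphabet are hypothetical.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper, not taken from the Scylla source. Draws `length`
// characters from a fixed alphabet so that every character is equally likely.
// The engine is a template parameter so callers (including tests) can pass
// any RandomNumberEngine, for example a deterministically seeded one.
template <typename RandomNumberEngine>
std::string random_salt_chars(RandomNumberEngine& engine, std::size_t length) {
    static const std::string alphabet =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ./";

    // Biased alternative: alphabet[engine() % alphabet.size()] favors low
    // indices whenever the engine's output range is not an exact multiple
    // of alphabet.size(). Giving the range to the distribution avoids this.
    std::uniform_int_distribution<std::size_t> index(0, alphabet.size() - 1);

    std::string result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        result.push_back(alphabet[index(engine)]);
    }
    return result;
}
```

Passing a seeded `std::mt19937` makes the output reproducible in a unit test, which is the kind of flexibility the commit message refers to.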
Jesse Haber-Kucharsky	fe58a0b207	auth: Extract random byte generation for salt	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	fd60d61ebf	auth: Split out test for best supported scheme The `generate_salt` function invokes this function internally now. This change means that `generate_salt` is now thread-safe and therefore does not have to be invoked by a single thread only when starting the `password_authenticator`. This further means that `generate_salt` does not need to be part of the public interface of the module, and can be moved to the implementation file.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	adf058bd1f	auth: Rename function to use full words	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	9b8cbb8542	auth: Add domain-specific exception for passwords	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	dbea3f5a01	auth: Document passwords interface	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	b272d622f8	auth: Move password stuff to its own namespace For clarity and nicer function names.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	de01aaf181	auth: Identify password hashing errors correctly See `fce10f2c6e` for reference.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	2a40bcb281	auth: Move password handling to its own files While the `password_authenticator` is a complex component with lots of dependencies, password hashing and checking itself is a process with limited logical state and dependencies, which makes it easy to isolate and test.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	03cf57db62	auth: Construct `std::random_device` instances once `std::random_device` has a lot of implementation-specific behavior, and as a result we cannot assume much about its performance characteristics. We initialize thread-specific static instances of `std::random_device` once so that we don't have the overhead of invoking the ctor during every invocation of `gensalt`.	2018-08-13 13:24:45 -04:00
Jesse Haber-Kucharsky	fce10f2c6e	auth: Don't use unsupported hashing algorithms In previous versions of Fedora, the `crypt_r` function returned `nullptr` when a requested hashing algorithm was not supported. This is consistent with the documentation of the function in its man page. As of Fedora 28, the function's behavior changes so that the encrypted text is not `nullptr` on error, but instead the string "*0". The info pages for `crypt_r` clarify somewhat (and contradict the man pages): Some implementations return `NULL` on failure, and others return an _invalid_ hashed passphrase, which will begin with a `*` and will not be the same as SALT. Because of this change of behavior, users running Scylla on a Fedora 28 machine which was upgraded from a previous release would not be able to authenticate: an unsupported hashing algorithm would be selected, producing encrypted text that did not match the entry in the table. With this change, unsupported algorithms are correctly detected and users should be able to continue to authenticate themselves. Fixes #3637. Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com> Message-Id: <bcd708f3ec195870fa2b0d147c8910fb63db7e0e.1533322594.git.jhaberku@scylladb.com>	2018-08-05 08:57:36 +03:00
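The two failure conventions mentioned in fce10f2c6e can be handled with one check on whatever `crypt_r` returned. The sketch below is an assumption about how such a check might look, not the code from the commit; the name `is_usable_hash` is hypothetical. It relies on the crypt(3) documentation, which says an invalid hashed passphrase begins with `*`.

```cpp
#include <string_view>

// Hypothetical helper, not taken from the Scylla source. `result` is the
// pointer returned by crypt_r() (or crypt()) for a given salt/scheme.
bool is_usable_hash(const char* result) {
    // Older behavior: a null pointer signals an unsupported algorithm.
    if (result == nullptr) {
        return false;
    }
    // Newer behavior: an invalid hashed passphrase such as "*0" is returned.
    // Valid hashes never begin with '*'.
    const std::string_view hashed(result);
    return !hashed.empty() && hashed.front() != '*';
}
```

A caller probing for the best supported scheme could try each candidate salt prefix in order of preference and keep the first one for which this check passes.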
Jesse Haber-Kucharsky	e664f9b0c6	Use finite time-outs for internal auth. queries	2018-07-31 11:38:16 -04:00
Nadav Har'El	25bd139508	cross-tree: clean up use of std::random_device() std::random_device() uses the relatively slow /dev/urandom, and we rarely if ever intend to use it directly - we normally want to use it to seed a faster random_engine (a pseudo-random number generator). In many places in the code, we first created a random_device variable, and then using it created a random_engine variable. However, this practice created the risk of a programmer accidentally using the random_device object, instead of the random_engine object, because both have the same API; this hurts performance. This risk materialized in just two places in the code, utils/uuid.cc and gms/gossiper.cc. A patch for uuid.cc was sent previously by Pawel and is not included in this patch, and the fix for gossiper.{cc,hh} is included here. To avoid risking the same mistake in the future, this patch switches across the code to an idiom where the random_device object is not named, so cannot be accidentally used. We use the following idiom: std::default_random_engine _engine{std::random_device{}()}; Here std::random_device{}() creates the random device (/dev/urandom) and pulls a random integer from it. It then uses this seed to create the random_engine (the pseudo-random number generator). The std::random_device{} object is temporary and unnamed, and cannot be unintentionally used directly. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20180726154958.4405-1-nyh@scylladb.com>	2018-07-26 16:54:58 +01:00
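The idiom quoted in 25bd139508 can be wrapped in a small function to show the lifetime of the temporary. This is only an illustrative sketch; `make_seeded_engine` and `roll_die` are hypothetical names that do not appear in the commit.

```cpp
#include <random>

// The unnamed std::random_device is destroyed at the end of the
// full-expression, so only the seeded pseudo-random engine remains and the
// slow device cannot be used by accident afterwards.
inline std::default_random_engine make_seeded_engine() {
    return std::default_random_engine{std::random_device{}()};
}

// Hypothetical consumer: all subsequent draws go through the fast engine.
inline int roll_die(std::default_random_engine& engine) {
    std::uniform_int_distribution<int> die(1, 6);
    return die(engine);
}
```

The same shape also matches the goal of 9d27045c76: the device (and any file descriptor it holds) lives only for as long as it takes to seed the engine.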
Avi Kivity	187ebdbe46	auth: fix possible use of disengaged optional in has_salted_hash() untyped_result_set_row's cell data type is bytes_opt, and the get_blob() accessor accesses the value assuming it's engaged (relying on the caller to call has()). has_salted_hash() calls get_blob() without calling has() beforehand, potentially triggering undefined behavior. Fix by using get_or() instead, which also simplifies the caller. I observed failures in Jenkins in this area. It's hard to be sure this is the root cause, since the failures triggered an internal consistency assertion in asan rather than an asan report. However, the error is hard to reproduce and the fix makes sense even if it doesn't prevent the error. See #3480 for the asan error. Fixes #3480 (hopefully). Message-Id: <20180602181919.29204-1-avi@scylladb.com>	2018-06-02 19:46:32 +01:00
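The hazard fixed in 187ebdbe46 has a direct analogue with `std::optional`. The sketch below uses `std::optional<std::string>` as a stand-in for `bytes_opt` and `value_or` as a stand-in for the row's `get_or()` accessor; both substitutions are assumptions made for illustration, and the real Scylla types differ.

```cpp
#include <optional>
#include <string>

// Stand-in for Scylla's bytes_opt (an assumption for this sketch).
using bytes_opt = std::optional<std::string>;

// Unsafe shape, shown only as a comment: dereferencing a disengaged optional
// is undefined behavior.
//   bool has_salted_hash(const bytes_opt& cell) { return !cell->empty(); }

// Safe shape: fall back to an empty value when the cell is absent, so the
// caller needs no separate has() check.
bool has_salted_hash(const bytes_opt& cell) {
    return !cell.value_or(std::string{}).empty();
}
```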
Avi Kivity	a99e820bb9	query_processor: require clients to specify timeout configuration Remove implicit timeouts and replace with caller-specified timeouts. This allows removing the ambiguity about what timeout a statement is executed with, and allows removing cql_statement::execute_internal(), which mostly overrode timeouts and consistency levels. Timeout selection is now as follows: query_processor::*_internal: infinite timeout, CL=ONE query_processor::process(), execute(): user-specified consistency level and timeout All callers were adjusted to specify an infinite timeout. This can be further adjusted later to use the "other" timeout for DCL and the read or write timeout (as needed) for authentication in the normal query path. Note that infinite timeouts don't mean that the query will hang; as soon as the failure detector decides that the node is down, RPC responses will terminate with a failure and the query will fail.	2018-05-14 09:41:06 +03:00
Jesse Haber-Kucharsky	cd0553ca6a	auth: Query custom options from the `authenticator` None of the `authenticator` implementations we have support custom options, but we should support this operation to support the relevant CQL statements.	2018-05-09 21:12:50 -04:00
Jesse Haber-Kucharsky	e149e48609	auth: Add type alias for custom auth. options	2018-05-09 21:12:47 -04:00
Tomasz Grabiec	52c61df930	Relax includes To avoid unnecessary recompilations. Message-Id: <1522168295-994-1-git-send-email-tgrabiec@scylladb.com>	2018-03-28 10:49:07 +03:00
Jesse Haber-Kucharsky	af24637565	auth: Increase delay before background tasks start I've observed failures due to "missing" the peer nodes by about 1 second. Adding 5 seconds to the existing delay should cover most false negative test results. Fixes #3320.	2018-03-26 00:52:55 -04:00
Jesse Haber-Kucharsky	00f7bc676d	auth: Remove ordering dependence If `auth::password_authenticator` also creates `system_auth.roles` and we fix the existence check for the default superuser in `auth::standard_role_manager` to only search for the columns that it owns (instead of the column itself), then both modules' initialization are independent of one another. Fixes #3319.	2018-03-25 22:38:11 -04:00
Jesse Haber-Kucharsky	968c61c296	auth: Don't warn on rescheduled task Apache Cassandra also prints at the `info` level. This change prevents tasks which we expect to be rescheduled from failing tests and scaring users. A good example of the importance of this change is when queries with a quorum consistency level (for the default superuser) fail because a quorum is not available. We will try again in this case, and this should not cause integration tests to fail.	2018-03-25 22:38:11 -04:00
Jesse Haber-Kucharsky	881656cea4	auth: Wait for schema agreement Some modules of `auth` create a default superuser if it does not already exist. The existence check is through a SELECT query with quorum consistency level. If the schema for the applicable tables has not yet propagated to a peer node at the time that it processes this query, then the `storage_proxy` will print an error message to the log and the query will be retried. Eventually, the schema will propagate and the default superuser will be created. However, the error message in the log causes integration tests to fail (and is somewhat annoying). Now, prior to querying for existing data, we wait for all gossip peers to have the same schema version as we do. Fixes #2852.	2018-03-25 22:38:08 -04:00
Jesse Haber-Kucharsky	6a360c2d17	auth: Grant all permissions to object creator When a table, keyspace, or role is created, the creator now is automatically granted all applicable permissions on the object. This behavior is consistent with Apache Cassandra. Fixes #3216.	2018-03-14 01:54:31 -04:00
Jesse Haber-Kucharsky	c502fe24ce	auth: Unify handling for unsupported errors Instead of some functions in `allow_all_authorizer` throwing exceptions and others being silently pass-through, we consistently return exception futures with `auth::unsupported_authorization_operation`. These errors are converted to `invalid_request_exception` in the CQL error and ignored where appropriate in the auth subsystem.	2018-03-14 01:54:28 -04:00
Jesse Haber-Kucharsky	97235445d3	auth: Fix life-time issue with parameter	2018-03-14 01:32:53 -04:00
Jesse Haber-Kucharsky	9117a689cf	auth: Fix `const` correctness This patch came about because of an important (and obvious, in hindsight) realization: instances of the authorizer, role manager, and authenticator are clients for access-control state and not the state itself. This is reflected directly in Scylla: `auth::service` is sharded across cores and this is possible because each instance queries and modifies the same global state. To give more examples, the value of an instance of `std::vector<int>` is the structure of the container and its contents. The value of `int file_descriptor` is an identifier for state maintained elsewhere. Having watched an excellent talk by Herb Sutter [1] and having read an informative blog post [2], it's clear that a member function marked `const` communicates that the observable state of the instance is not modified. Thus, the member functions of the role-manager, authenticator, and authorizer clients should be left non-`const` only if the state of the client itself is observably changed. By this principle, member functions which do not change the state of the client, but which mutate the global state the client is associated with (for example, by creating a role) are marked `const`. The `start` (and `stop`) functions of the client have the dual role of initializing (finalizing) both the local client state and the external state; they are not marked `const`. [1] https://herbsutter.com/2013/01/01/video-you-dont-know-const-and-mutable/ [2] http://talesofcpp.fusionfenix.com/post-2/episode-one-to-be-or-not-to-be-const	2018-03-14 01:32:43 -04:00
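The principle in 9117a689cf, that a client handle is `const` as long as its own observable state does not change, can be illustrated with a small stand-alone sketch. The types `role_store` and `role_manager_client` are hypothetical and only loosely modeled on the description; they are not Scylla's classes.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the shared access-control state that lives outside the client.
struct role_store {
    std::map<std::string, bool> roles;  // role name -> is superuser
};

// Hypothetical client: it identifies external state rather than owning it.
class role_manager_client {
    std::shared_ptr<role_store> _store;
    bool _started = false;

public:
    explicit role_manager_client(std::shared_ptr<role_store> store)
        : _store(std::move(store)) {}

    // Changes the client's own observable state, so it is not const.
    void start() { _started = true; }
    bool started() const { return _started; }

    // Mutates the external state only; the client itself is unchanged,
    // so the member function is const.
    void create_role(const std::string& name, bool is_superuser) const {
        _store->roles[name] = is_superuser;
    }

    bool exists(const std::string& name) const {
        return _store->roles.count(name) != 0;
    }
};
```

With this shape, a `const role_manager_client` can still create roles, in the same way that a `const int file_descriptor` can still be written to.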
Avi Kivity	d973445a94	Merge "sstable/schema extensions" from Calle " Adds extension points to schema/sstables to enable hooking in stuff, like, say, something that modifies how sstable disk io works. (Cough, cough, encryption) Extensions are processed as property keywords in CQL. To add an extension, a "module" must register it into the extensions object on boot time. To avoid globals (and yet don't), extensions are reachable from config (and thus from db). Table/view tables already contain an extension element, so we utilize this to persist config. schema_tables tables/views from mutations now require a "context" object (currently only extensions, but abstracted for easier further changes). Because of how schemas currently operate, there is a super lame workaround to allow "schema_registry" access to config and by extension extensions. DB, upon instantiation, calls a thread local global "init" in schema_registry and registers the config. It, in turn, can then call table_from_mutations as required. Includes the (modified) patch to encapsulate compression into objects, mainly because it is nice to encapsulate, and isolate a little. 
" * 'calle/extensions-v5' of github.com:scylladb/seastar-dev: extensions: Small unit test sstables: Process extensions on file open sstables::types: Add optional extensions attribute to scylla metadata sstables::disk_types: Add hash and comparator(sstring) to disk_string schema_tables: Load/save extensions table cql: Add schema extensions processing to properties schema_tables: Require context object in schema load path schema_tables: Add opaque context object config_file_impl: Remove ostream operators main/init: Formalize configurables + add extensions to init call db::config: Add extensions as a config sub-object db::extensions: Configuration object to store various extensions cql3::statements::property_definitions: Use std::variant instead of any sstables: Add extension type for wrapping file io schema: Add opaque type to represent extensions sstables::compress/compress: Make compression a virtual object	2018-02-26 17:15:29 +02:00
Jesse Haber-Kucharsky	a83af20311	auth: Add alias for set of role names This shortens some type names considerably.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	39a44e3494	auth: Revoke permissions on dropped role resources Previously, when a table or keyspace was dropped, the authorizer (through a `migration_listener`) automatically dropped all permissions granted on that resource. Likewise, when a role is granted permissions and the role is dropped, all permissions granted to the role are dropped. In this change, we now treat role resources just like table and keyspace resources: if a permission is granted on a role (like "GRANT AUTHORIZE ON ROLE qa TO phil") and the "qa" role is dropped, then all permissions on the "qa" role resource are also dropped.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	e6d9d53eca	auth: Move definition to corresponding .cc file	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	fbc97626c4	auth: Migrate legacy data on boot This change allows for seamless migration of the legacy users metadata to the new role-based metadata tables. This process is summarized in `docs/migrating-from-users-to-roles.md`. In general, if any nondefault metadata exists in the new tables, then no migration happens. If, in this case, legacy metadata still exists then a warning is written to the log. If no nondefault metadata exists in the new tables and the legacy tables exist, then each node will copy the data from the legacy tables to the new tables, performing transformations as necessary. An informational message is written to the log when the migration process starts, and when the process ends. During the process of copying, data is overwritten so that multiple nodes racing to migrate data do not conflict. Since Apache Cassandra's auth. schema uses the same table for managing roles and authentication information, some useful functions in `roles-metadata.hh` have been added to avoid code duplication. Because a superuser should be able to drop the legacy users tables from `system_auth` once the cluster has migrated to roles and is functioning correctly, we remove the restriction on altering anything in the "system_auth" keyspace. Individual tables in `system_auth` are still protected later in the function. When a cluster is upgrading from one that does not support roles to one that does, some nodes will be running old code which accesses old metadata and some will be running new code which accesses new metadata. With the help of the gossiper `feature` mechanism, clients connecting to upgraded nodes will be notified (through code in the relevant CQL statements) that modifications are not allowed until the entire cluster has upgraded.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	8be0165713	auth: Check protected resources of the role-manager A new function `auth::service::is_protected` checks the protected-resource set of all access-control modules (including the role-manager).	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	f9f03bc2e1	cql3: Fix error handling for GRANT and REVOKE This change gets rid of duplicated code for checking if the grantee or revokee exist by moving this functionality to the auth. service.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	e18adbcb3e	auth: Remove unnecessary `sstring` allocation The authorizer now accepts parameters by `string_view`.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	5be16247cc	auth: Decouple authorization and role management Access control in Scylla consists of three main modules: authentication, authorization, and role-management. Each of these modules is intended to be interchangeable with alternative implementations. The `auth::service` class composes these modules together to perform all access-control functionality, including caching. This architecture implies two main properties of the individual access-control modules: - Independence of modules. An implementation of authentication should have no dependence or knowledge of authorization or role-management, for example. - Simplicity of implementing the interface. Functionality that is common to all implementations should not have to be duplicated in each implementation. The abstract interface for a module should capture only the differences between particular implementations. Previously, the authorization interface depended on an instance of `auth::service` for certain operations, since it required aggregation over all the roles granted to a particular role or required checking if a given role had superuser. This change decouples authorization entirely from role-management: the authorizer now manages only permissions granted directly to a role, and not those inherited through other roles. When a query needs to be authorized, `auth::service::get_permissions` first uses the role manager to check if the role has superuser. Then, it aggregates calls to `auth::authorizer::authorize` for each role granted to the role (again, from the role-manager) to determine the sum-total permission set. This information is cached for future queries. 
This structure allows for easier error handling and management (something I hope to improve in the future for both the authorizer and authenticator interfaces), easier system testing, easier implementation of the abstract interfaces, and clearer system boundaries (so the code is easier to grok). Some authorizers, like the "TransitionalAuthorizer", grant permissions to anonymous users. Therefore, we could not unconditionally authorize an empty permission set in `auth::service` for anonymous users. To account for this, the interface of the authorizer has changed to accept an optional name in `authorize`. One additional notable change to the authorizer is the `auth::authorizer::list`: previously, the filtering happened at the CQL query layer and depended on the roles granted to the role in question. I've changed the function to simply query for all roles and I do the filtering in `auth::system` in-memory with the STL. This was necessary to allow the authorizer to be decoupled from role-management. This function is only called for LIST PERMISSIONS (so performance is not a concern), and it significantly reduces demand on the implementation. Finally, we unconditionally create a user in `cql_test_env` since authorization requires its existence.	2018-02-14 14:15:59 -05:00
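The aggregation that 5be16247cc describes for `auth::service::get_permissions` can be sketched synchronously as below. This is a simplified model under stated assumptions: the real code is asynchronous and cached, the structs here are hypothetical stand-ins, and treating "has superuser" as "any granted role is a superuser" is an assumption of this sketch.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>

using permission_set = std::set<std::string>;
using role_set = std::set<std::string>;

// Stand-in role manager: knows role membership and superuser status only.
struct role_manager {
    std::map<std::string, role_set> granted;  // role -> roles granted to it, itself included
    std::set<std::string> superusers;

    role_set query_granted(const std::string& role) const {
        const auto it = granted.find(role);
        if (it == granted.end()) {
            return role_set{role};
        }
        return it->second;
    }
    bool is_superuser(const std::string& role) const {
        return superusers.count(role) != 0;
    }
};

// Stand-in authorizer: knows only permissions granted directly to a role.
struct authorizer {
    std::map<std::string, permission_set> direct;

    permission_set authorize(const std::string& role) const {
        const auto it = direct.find(role);
        if (it == direct.end()) {
            return permission_set{};
        }
        return it->second;
    }
};

// The service composes the two modules; neither module knows about the other.
permission_set get_permissions(const role_manager& roles, const authorizer& auth,
                               const std::string& role,
                               const permission_set& all_permissions) {
    const role_set granted = roles.query_granted(role);
    const bool superuser = std::any_of(
        granted.begin(), granted.end(),
        [&](const std::string& r) { return roles.is_superuser(r); });
    if (superuser) {
        return all_permissions;
    }
    permission_set result;
    for (const auto& r : granted) {
        const permission_set direct = auth.authorize(r);
        result.insert(direct.begin(), direct.end());  // union of direct grants
    }
    return result;
}
```

Because the authorizer only answers for direct grants, an alternative authorizer implementation needs no knowledge of role inheritance, which is the decoupling the commit message argues for.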
Jesse Haber-Kucharsky	0ac7d9922d	auth: Add code to expand a resource family This will be useful for the next change, where it is used for refactoring LIST PERMISSIONS.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	13ba128967	auth: Change error messages to pass dtests The fixed dtests which only failed due to differences in wording and grammar for error messages are: - altering_nonexistent_user_throws_exception_test - cant_create_existing_user_test - dropping_nonexistent_user_throws_exception_test - users_cant_alter_their_superuser_status_test	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	ce3be07556	auth: Move resource existence checks Previously, a "data" auth. resource knew how to check its own existence by accessing a global variable. This patch accomplishes two things: it adds existence checking to all kinds of resources, and moves these checks outside of `auth::resource` itself and into `auth::service` (so that global variables are no longer accessed).	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	cf5f6aa4c5	auth: Fix fragile variable life-times According to the Seastar convention, a parameter passed to a function taking a reference parameter must live for the duration of the execution of the returned future. When possible, variables are statically allocated. When this is not possible, we use `do_with`.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	357f3afb60	auth: Remove outdated "TODO" Authorization never happens at this level of the stack, though it formerly did.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	b1d9d0e4ff	auth: Reorder authorizer args for consistency	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	c1504cd4ff	auth: Pass `resource` by const ref. This has the dual benefit of not enforcing copying on implementations of the abstract interface and also limiting unnecessary copies. As usual with Seastar, we follow the convention that a reference parameter to a function is assumed valid for the duration of the `future` that is returned. `do_with` helps here. By adding some constants for root resources, we can avoid using `seastar::do_with` at some call-sites involving `resource` instances.	2018-02-14 14:15:59 -05:00
Jesse Haber-Kucharsky	45631604b0	auth: Use `string_view` for parameters	2018-02-14 14:15:59 -05:00


179 Commits