scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 21:17:01 +00:00

Author	SHA1	Message	Date
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Avi Kivity	c8cb3d6ff5	Merge "Materialized views: bug fixes and unit tests" from Duarte "This series fixes bugs related to materialized views, most pertaining to column filtering in the where clause." * 'materialized-views/bug-fixes/v1' of https://github.com/duarten/scylla: tests/view_schema_test: Add more test cases tests/cql_assertions: Add assertion for row set equality single_column_relation: Correctly print IN relation statement_restrictions: Allow filtering regular columns for views statement_restrictions: Relax clustering restrictions for views statement_restrictions: Relax partition restrictions for views cql3/statements: Prevent setting default ttl on view cql3/restrictions: Complete implementation of is_satisfied_by() db/view: Re-implement clustering_prefix_matches() db/view: Re-implement partition_key_matches() db/view: Generate regular tombstone for base deletions db/view: Consider cell liveness when generating updates db/view: Don't generate view updates for static rows	2017-05-20 13:52:56 +03:00
Paweł Dziepak	c560cf9d9d	Merge "fixes and improvements in the permissions cache implementation" from Vlad "There are numerous issues in the current implementation of permissions cache starting from the logical errors and bugs and ending with the suboptimal implementation described in the issue #2262." * 'permissions_cache_fixes-v4' of github.com:scylladb/seastar-dev: utils::loading_cache: avoid the reads storm when the key is not in the cache utils::loading_cache: cleanup utils::loading_cache: align the constrains in the constructor with the parameters description utils::loading_cache: refresh in the background auth::auth: add operator<<() for a permission_cache key auth::auth::permissions_cache: use the values from the configuration - don't try to be smart db::config: define a saner default value for permissions_validity_in_ms	2017-05-18 13:33:05 +01:00
Vlad Zolotarov	ea1cfabe28	db::config: define a saner default value for permissions_validity_in_ms It makes little sense to have the same value for permissions_update_interval_in_ms and permissions_validity_in_ms. This may cause the values to be invalidated only because of some minor delays in the timer scheduling. It makes a lot more sense to make the permissions_update_interval_in_ms value smaller than permissions_validity_in_ms. This way we would minimize the chances of "false invalidation" due to some small delays in the timer scheduling. In addition, 2s seems to be too small a value for permissions_validity_in_ms since our default read_request_timeout_in_ms is 5s. This means that a single system_auth read failure would guarantee that the following queries are going to read system_auth data in the foreground. Setting it to 10s would allow a second read attempt before we enforce the foreground read. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-17 12:03:56 -04:00
Calle Wilund	29b20d410a	schema_tables: Remove "class" attribute from strategy options Not 100% proper, but in line with how we still store the info. Ensures (helps at least) to keep schema loaded from tables and schema from builder comparable. Fixes schema_changes_test error. Message-Id: <1495030581-2138-2-git-send-email-calle@scylladb.com>	2017-05-17 17:56:11 +03:00
Duarte Nunes	983af595e9	database: Read existing base mutations When generating updates for a materialized view we need to read the existing base row, to be able to determine the primary key of the view row the new base update will supplant, in case the view includes a base non-primary key column in its own primary key. That old view row will be tombstoned or updated, if it exists, depending on the difference between the new base row and the existing one, if any. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	8a77bfe35b	db/view: Calculate clustering ranges for MV read-before-write query Introduce the calculate_affected_clustering_ranges() function to calculate the smallest subset of affected clustering ranges that we need to query for. The update_requires_read_before_write() function checks whether a view is potentially affected by the base update. The patch also cleans up the may_be_affected_by() function. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	ec681060a8	db/view: Replace entry if cells don't match If a base table regular column is part of the view's pk, and if that column changes, we should replace the entry, by deleting the row(s) with the old value and inserting a new one. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	bad0edb23b	db/view: Re-implement clustering_prefix_matches() This patch implements clustering_prefix_matches() in terms of abstract_restriction::is_satisfied_by() instead of ranges, which supports filtering just a subset of the clustering columns. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	b0d1ea76a2	db/view: Re-implement partition_key_matches() This patch implements partition_key_matches() in terms of abstract_restriction::is_satisfied_by() instead of ranges, which supports filtering just a component of a compound partition key. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	38be85a21d	db/view: Generate regular tombstone for base deletions Instead of shadowable tombstones, which only apply to updates. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	1fd8b8e723	db/view: Consider cell liveness when generating updates This patch ensures we take into account the liveness of the base's regular column in the view's pk when generating view updates. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	c421da6825	db/view: Don't generate view updates for static rows Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:19 +02:00
Duarte Nunes	f41a5e554d	view_info: Store base regular col in the view's PK as column_id This patch stores the base_non_pk_column_in_view column as column_id, which is more convenient, and it also stores a two-level optional to encode both lazy initialization and the absence of such a column. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-17 10:33:18 +02:00
Calle Wilund	c8f92536c1	legacy_schema_migrator: Actually truncate legacy schema tables on finish	2017-05-10 16:44:48 +00:00
Calle Wilund	6c8b5fc09d	schema_tables: Use v3 schema tables and formats Switches system/schema_* for system_schema/*, updates schema/schema builder and uses to hold/expect v3 style info (i.e. types & dropped).	2017-05-10 16:44:48 +00:00
Calle Wilund	f9b83e299e	type_parser: Origin expects empty string -> bytes_type	2017-05-10 16:44:48 +00:00
Calle Wilund	0e6ae8dec2	schema: rename column accessors to be in line with origin More pointedly: Expose columns as is (currently all_columns_in_select_order), expose name->column mapping more appropriately named. Renaming like this is not strictly necessary, but there is a point to trying to keep nomenclature similar-ish with origin, esp. when select order columns need to become filtered (spoiler alert).	2017-05-10 16:44:48 +00:00
Calle Wilund	b1c5447ab5	cql3_type_parser: Resolve from cql3 names/expressions Cassandra 3 uses cql names for column/field types, thus we need to parse these out-of-line, and resolve more akin to the cql parser. Also wrap building user types similarly to origin, using a "builder" wrapper, and usage graph resolving.	2017-05-10 16:44:47 +00:00
Calle Wilund	3964055d98	legacy_schema_migrator: Add schema table converter Initial. Does not actually write anything.	2017-05-10 16:44:47 +00:00
Calle Wilund	8066efb710	system_keyspace: Add getter/setter for built index status Even though we have none.	2017-05-09 13:48:55 +00:00
Calle Wilund	061ef16562	system_tables/schema_tables: Remove special format case of "execute_cql" Having a variadic parameter being used in implicit sprint is not very readable + makes it less intuitive when suddenly system keyspace becomes more than one -> multiple sprints in the chain -> more confusion or more execution paths. It's not that horrible with some spread out sprint:s	2017-05-09 13:48:55 +00:00
Calle Wilund	27fdc5cfef	schema_tables/system_tables: Add v3 tables to "ALL" and handle in init I.e. deal with more than one keyspace in system_keyspace::make	2017-05-09 13:48:55 +00:00
Calle Wilund	815aa8ba9f	schema_tables: Add schema definitions for v3 tables	2017-05-09 13:48:55 +00:00
Calle Wilund	4378dca6e1	schema_tables: Hide/abstract schema keyspace name	2017-05-09 13:48:55 +00:00
Calle Wilund	2fb36e3bf8	system_keyspace: Add query overloads with named keyspace	2017-05-09 13:48:55 +00:00
Calle Wilund	32909d4c84	system_keyspace: Add v3+legacy schema definitions	2017-05-09 13:48:55 +00:00
Avi Kivity	8c5c5d3004	Merge "CQL front-end for secondary indices" from Pekka "This patch series adds CQL front-end support for secondary indices. You can now execute CREATE INDEX and DROP INDEX statements, which will update the newly added "Indexes" system table. However, the indexes are not actually backed up by anything nor are they available for CQL queries. The feature is hidden behind a new cluster feature flag and enabled only with the "--experimental" flag." * 'penberg/cql-2i/v2' of github.com:cloudius-systems/seastar-dev: (34 commits) schema: Kill index_type enum schema: Kill index_info class cql3/statements/create_index_statement: Use database::existing_index_names() in validation cql3/statements: Use secondary index manager in alter_table_statement class index: Add secondary_index_manager thrift/handler: Use index_metadata db/schema_tables: Index persistence schema: Add all_indices() to schema class schema: Remove add_default_index_names() from schema_builder class db/schema_tables: Add system table for indices cql3/Cgl.g: DROP INDEX cql3/statements: Add drop_index_statement class database: Add find_indexed_table() to database class cql3: Return change event from announce_migration() cql3/statements: Multiple index targets for CREATE INDEX cql3/statements: Use index_metadata in create_index_statement class cql3/statements: Use feature flag in create_index_statement class service/storage_service: Add feature flag for secondary indices database: Add get_available_index_name() to database class schema: Add get_default_index_name() to index_metadata class ...	2017-05-08 17:04:40 +03:00
Pekka Enberg	11474ed4c6	db/schema_tables: Index persistence	2017-05-08 10:03:28 +03:00
Avi Kivity	9e67bd5aac	Merge " Add partial range deletion support" from Duarte "This series introduces partial support for range deletions. This allows deletion operations such as delete from cf where p=1 and c > 0 and c <= 3. This series only adds support for single-column range restrictions. We enforce that both range bounds be specified, because we can't represent infinite bounds in the current sstable format. Such bounds are represented as a prefix with no components, with the bound_kind informing whether they are a bottom or top bound. We're currently unable to serialize an infinite bound in such a way that it would be correctly interpreted by Cassandra 2.2.x. A serialized bound is a composite with a (<length><value><EOC>)+ format. While we could technically represent the bottom bound, the top bound, if written as a single component with 0 bytes in size and some EOC, would always sort before other values. The same would happen if represented as an empty (no components) composite, because in Cassandra 2.2.x those always have EOC = NONE. This limitation should stay in place until we can properly represent range tombstones in the storage format." * 'range-deletions/v2' of https://github.com/duarten/scylla: mutation: Set cell using clustering_key_prefix mutation_partition: Harmonize apply_delete overloads prefix_compound_view_wrapper: Add is_full and is_empty functions tests/cql_query_test: Add range deletion tests cql3: Partially support ranged deletions single_column_primary_key_restrictions: Implement has_bound() modification_statement: Use statement_restrictions for where clause statement_restrictions: Expose primary key restrictions to_string: Add missing include	2017-05-07 19:27:09 +03:00
Avi Kivity	5278e1a14d	commitlog: handle noexcept conflict between unlink and function object ::unlink is declared as noexcept, but the function object it is passed into is not. gcc 7 warns, so wrap ::unlink in a lambda to make it happy.	2017-05-05 17:02:30 +03:00
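The lambda workaround described in 5278e1a14d can be illustrated with a minimal sketch. This is not the commitlog code: `apply_to_path` and `remove_wrapped` are hypothetical names invented for the example, and the only real API used is POSIX `::unlink`. The idea is that the template deduces a closure type rather than the type of a function that may be declared noexcept.

```cpp
#include <string>
#include <unistd.h>
#include <utility>

// Hypothetical generic helper (an assumption, not a Scylla function):
// calls `func` with a C string path and returns its result.
template <typename Func>
int apply_to_path(Func&& func, const std::string& path) {
    return std::forward<Func>(func)(path.c_str());
}

// Passing ::unlink directly would make Func depend on the function's own
// type, including any noexcept specification it carries. Wrapping it in a
// lambda means Func is a plain closure type instead.
int remove_wrapped(const std::string& path) {
    return apply_to_path([] (const char* p) { return ::unlink(p); }, path);
}
```

The wrapper keeps the behaviour of `::unlink` unchanged: it still returns -1 and sets `errno` on failure.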
Avi Kivity	d542cdddf6	thrift: change generated code namespace org::apache::cassandra (the generated namespace name) gets confused with apache::cassandra (the thrift runtime library namespace), either due to changes in gcc 7 or in thrift 0.10. Either way, the problem is fixed by changing the generated namespace to plain cassandra.	2017-05-05 05:26:20 +03:00
Duarte Nunes	9e88b60ef5	mutation: Set cell using clustering_key_prefix Change the clustering key argument in mutation::set_cell from exploded_clustering_prefix to clustering_key_prefix, which allows for some overall code simplification and fewer copies. This mostly affects the cql3 layer. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-04 15:59:50 +02:00
Pekka Enberg	8b943c0ceb	db/schema_tables: Add system table for indices	2017-05-04 14:59:12 +03:00
Paweł Dziepak	24f4dcf9e4	db: make virtual dirty soft limit configurable Message-Id: <20170428150005.28454-1-pdziepak@scylladb.com>	2017-04-30 19:17:22 +03:00
Vlad Zolotarov	d5b76d5198	type_parser: catch exceptions by reference and not by value Found by PVS-Studio static analyzer: Type slicing. An exception should be caught by reference rather than by value. Fixes #2288 Reported-by: Phillip Khandeliants Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-04-26 15:12:15 -04:00
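The type slicing issue named in d5b76d5198 can be shown with a small, self-contained sketch. The exception types below are hypothetical and are not the ones used in type_parser; the sketch only contrasts catching by value with catching by reference. Some compilers warn about the by-value handler, which is the point of the example.

```cpp
#include <string>

// Hypothetical exception hierarchy, invented for this illustration.
struct base_error {
    virtual ~base_error() = default;
    virtual std::string what() const { return "base_error"; }
};

struct parse_error : base_error {
    std::string what() const override { return "parse_error"; }
};

// Catching by value copies only the base_error part of the thrown object,
// so the derived override is lost (the object is sliced).
std::string caught_by_value() {
    try {
        throw parse_error();
    } catch (base_error e) {
        return e.what();
    }
}

// Catching by reference binds to the original object and keeps its
// dynamic type, so the derived override is called.
std::string caught_by_reference() {
    try {
        throw parse_error();
    } catch (const base_error& e) {
        return e.what();
    }
}
```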
Duarte Nunes	4e693383f7	mutation_partion: Use row_tombstone This patch replaces the current row tombstone representation by a row_tombstone. The intent of the patch is thus to reify the idea of shadowable tombstones, that up until now we considered all materialized view row tombstones to be. We need to distinguish shadowable from non-shadowable row tombstones to support scenarios such as, when inserting to a table with a materialized view: 1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1 2. delete from base using timestamp 2 where p = 3 3. insert into base (p, v1) values (3, 1) using timestamp 3 These should yield a view row where v2 is definitely null, but with the current implementation, v2 will pop back with its value v2=3@TS=1, even though it's dead in the base row. This is because the row tombstone inserted at 2) is a shadowable one. This patch only addresses the memory representation of such row_tombstones. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-04-25 11:46:33 +02:00
Avi Kivity	944047f039	read_repair_decision: fix operator<<(std::ostream&, ...) Argument-dependent lookup requires that the operator be declared in the same namespace as the class; move it there. While at it, de-static it, it only causes bloat.	2017-04-22 21:09:41 +03:00
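The argument-dependent lookup rule that 944047f039 relies on can be sketched as follows. The namespaces, the enumerator names, and the `to_string` helper are assumptions made for the illustration and may not match the Scylla sources. The operator is found from a template in an unrelated namespace only because it is declared next to the type.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace db {

// Enumerator names are assumed for the example.
enum class read_repair_decision { NONE, GLOBAL, DC_LOCAL };

// Declared in the same namespace as the type, so argument-dependent
// lookup can find it from any calling context. It is inline rather
// than static, so there is one definition program-wide.
inline std::ostream& operator<<(std::ostream& out, read_repair_decision d) {
    switch (d) {
    case read_repair_decision::NONE: return out << "NONE";
    case read_repair_decision::GLOBAL: return out << "GLOBAL";
    case read_repair_decision::DC_LOCAL: return out << "DC_LOCAL";
    }
    return out << "UNKNOWN";
}

} // namespace db

namespace util {

// Hypothetical generic helper living in an unrelated namespace. The
// unqualified `os << v` resolves db::operator<< through ADL.
template <typename T>
std::string to_string(const T& v) {
    std::ostringstream os;
    os << v;
    return os.str();
}

} // namespace util
```

Because the type is a scoped enum, there is no implicit conversion to an integer, so the stream insertion in `util::to_string` compiles only when the operator is reachable through lookup.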
Vlad Zolotarov	c26799c9b0	config: enforce the 'stop' value for commit_failure_policy/disk_failure_policy Fixes #2246 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1491246164-26612-1-git-send-email-vladz@scylladb.com>	2017-04-04 16:46:36 +03:00
Tomasz Grabiec	2c775bbb6e	config: Allow specifying source when setting value So that is_set() will be true for that option. Needed in tests which set some config options in higher layer and then lower layers detects if option was set or not before applying its default.	2017-03-28 18:34:55 +02:00
Calle Wilund	b12b65db92	commitlog/replayer: Bugfix: minimum rp broken, and cl reader offset too The previous fix removed the additional insertion of "min rp" per source shard based on whether we had processed existing CF:s or not (i.e. if a CF does not exist as sstable at all, we must tag it as zero-rp, and make whole shard for it start at same zero). This is bad in itself, because it can cause data loss. It does not cause crashing however. But it did uncover another, old old lingering bug, namely the commitlog reader initiating its stream wrongly when reading from an actual offset (i.e. not processing the whole file). We opened the file stream from the file offset, then tried to read the file header and magic number from there -> boom, error. Also, rp-to-file mapping was potentially suboptimal due to using bucket iterator instead of actual range. I.e. three fixes: * Reinstate min position guarding for unencountered CF:s * Fix stream creating in CL reader * Fix segment map iterator use. v2: * Fix typo Message-Id: <1490611637-12220-1-git-send-email-calle@scylladb.com>	2017-03-28 10:32:28 +02:00
Calle Wilund	c3a510a08d	commitlog_replayer: Do proper const-lookup of min positions for shards Fixes #2173 Per-shard min positions can be unset if we never collected any sstable/truncation info for it, yet replay segments of that id. Wrap the lookups to handle "missing data -> default", which should have been there in the first place. Message-Id: <1490185101-12482-1-git-send-email-calle@scylladb.com>	2017-03-22 17:57:09 +02:00
Duarte Nunes	be12a2bf0a	db/schema_tables: Atomically publish base and view changes This patch ensures that the schema merging atomically publishes schema changes. In particular, it ensures that when a base schema and a subset of its views are modified together (i.e., upon an alter table or alter type statement), then they are published together as well, without any deferring in-between. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-03-15 16:35:07 +01:00
Duarte Nunes	bfb8a3c172	materialized views: Replace db::view::view class The write path uses a base schema at a particular version, and we want it to use the materialized views at the corresponding version. To achieve this, we need to map the state currently in db::view::view to a particular schema version, which this patch does by introducing the view_info class to hold the state previously in db::view::view, and by having a view schema directly point to it. The changes in the patch are thus: 1) Introduce view_info to hold the extra view state; 2) Point to the view_info from the schema; 3) Make the functions in the now stateless db::view::view non-member; 4) Remove the db::view::view class. All changes are structural and don't affect current behavior. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-03-15 15:50:05 +01:00
Calle Wilund	078589c508	commitlog_replayer: Make replay parallel per shard Fixes #2098 Replay previously did all segments in parallel on shard 0, which caused heavy memory load. To reduce this and spread footprint across shards, instead do X segments per shard, sequential per shard. v2: * Fixed whitespace errors Message-Id: <1489503382-830-1-git-send-email-calle@scylladb.com>	2017-03-15 13:07:17 +02:00
Duarte Nunes	16bcf8d085	db/schema_tables: Avoid copying keyspace name This patch changes a lambda argument type so the keyspace name is passed by reference instead of copying it, in read_schema_for_keyspaces(). Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170309213134.10331-1-duarte@scylladb.com>	2017-03-10 11:03:56 +02:00
Gleb Natapov	d34f3a0440	batchlog: introduce batch_size_fail_threshold_in_kb option Add batch_size_fail_threshold_in_kb to prevent a huge batch from being applied and causing trouble. Also do not warn or fail if only one partition is affected. Fixes: #2128 Message-Id: <20170309111247.GE8197@scylladb.com>	2017-03-09 12:20:17 +01:00
Paweł Dziepak	6db6d25f66	Merge "Avoid losing changes to keyspace parameters of system_auth and tracing keyspaces" from Tomek "If a node is bootstrapped with auto_bootstrap disabled, it will not wait for schema sync before creating global keyspaces for auth and tracing. When such schema changes are then reconciled with schema on other nodes, they may overwrite changes made by the user before the node was started, because they will have higher timestamp. To prevent that, let's use minimum timestamp so that default schema always loses to manual modifications. This is what Cassandra does. Fixes #2129." * tag 'tgrabiec/prevent-keyspace-metadata-loss-v1' of github.com:scylladb/seastar-dev: db: Create default auth and tracing keyspaces using lowest timestamp migration_manager: Append actual keyspace mutations with schema notifications	2017-03-08 10:59:47 +00:00
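The reasoning in 6db6d25f66 rests on last-write-wins reconciliation. The following is a minimal sketch under the assumption that reconciliation simply keeps the value with the higher timestamp; `versioned_value`, `reconcile` and `min_timestamp` are hypothetical names, not Scylla's actual types.

```cpp
#include <cstdint>
#include <limits>
#include <string>

using timestamp_type = std::int64_t;

// Hypothetical stand-in for a keyspace parameter with its write timestamp.
struct versioned_value {
    std::string value;
    timestamp_type timestamp;
};

// Assumed for the sketch: the lowest representable timestamp.
constexpr timestamp_type min_timestamp =
    std::numeric_limits<timestamp_type>::min();

// Last-write-wins: keep whichever side carries the higher timestamp.
versioned_value reconcile(const versioned_value& a, const versioned_value& b) {
    return a.timestamp >= b.timestamp ? a : b;
}
```

With this rule, a default written at `min_timestamp` cannot displace a user-made change, regardless of the order in which the two are merged.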
Tomasz Grabiec	06d4ad1bdd	migration_manager: Append actual keyspace mutations with schema notifications There is a workaround for notification race, which attaches keyspace mutations to other schema changes in case the target node missed the keyspace creation. Currently that generated keyspace mutations on the spot instead of using the ones stored in schema tables. Those mutations would have current timestamp, as if the keyspace has been just modified. This is problematic because this may generate an overwrite of keyspace parameters with newer timestamp but with stale values, if the node is not up to date with keyspace metadata. That's especially the case when booting up a node without enabling auto_bootstrap. In such case the node will not wait for schema sync before creating auth tables. Such table creation will attach potentially out of date mutations for keyspace metadata, which may overwrite changes made to keyspace parameters earlier in the cluster. Refs #2129.	2017-03-07 19:19:15 +01:00
Avi Kivity	439b38f5ab	Merge "Improvements to counter implementation" from Paweł "This series adds various optimisations to counter implementation (nothing extreme, mostly just avoiding unnecessary operations) as well as some missing features such as tracing and dropping timed out queries. Performance was tested using: perf-simple-query -c4 --counters --duration 60 The following results are medians. before after diff write 18640.41 33156.81 +77.9% read 58002.32 62733.93 +8.2%" * tag 'pdziepak/optimise-counters/v3' of github.com:cloudius-systems/seastar-dev: (30 commits) cell_locker: add metrics for lock acquisition storage_proxy: count counter updates for which the node was a leader storage_proxy: use counter-specific timeout for writes storage_proxy: transform counter timeouts to mutation_write_timeout_exception db: avoid allocations in do_apply_counter_update() tests/counters: add test for apply reversability counters: attempt to apply in place atomic_cell: add COUNTER_IN_PLACE_REVERT flag counters: add equality operators counters: implement decrement operators for shard_iterator counters: allow using both views and mutable_views atomic_cell: introduce atomic_cell_mutable_view managed_bytes: add cast to mutable_view bytes: add bytes_mutable_view utils: introduce mutable_view db: add more tracing events for counter writes db: propagate tracing state for counter writes tests/cell_locker: add test for timing out lock acquisition counter_cell_locker: allow setting timeouts db: propagate timeout for counter writes ...	2017-03-07 11:48:13 +02:00


844 Commits