scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Kamil Braun	d8f8908d34	types: introduce user_type_impl::idx_of_field method. Each field of a user type has its index inside the type. This method allows to find it easily, which is needed in a bunch of places.	2019-10-25 10:42:58 +02:00
Kamil Braun	c77643a345	cql3: make cql3_type::_frozen protected. Add is_frozen() method. No one modifies _frozen from the outside. Moving the field to `protected` makes it harder to introduce bugs.	2019-10-25 10:42:58 +02:00
Kamil Braun	d83ebe1092	collection_mutation: move collection_type_impl::difference to collection_mutation.hh.	2019-10-25 10:42:58 +02:00
Kamil Braun	7e3bbe548c	collection_mutation: move collection_type_impl::merge to collection_mutation.hh.	2019-10-25 10:42:58 +02:00
Kamil Braun	a41277a7cd	collection_mutation: move collection_type_impl::last_update to collection_mutation_view	2019-10-25 10:42:58 +02:00
Kamil Braun	30802f5814	collection_mutation: move collection_type_impl::is_any_live to collection_mutation_view	2019-10-25 10:42:58 +02:00
Kamil Braun	e16ba76c2e	collection_mutation: move collection_type_impl::is_empty to collection_mutation_view.	2019-10-25 10:42:58 +02:00
Kamil Braun	bbdb438d89	collection_mutation: easier (de)serialization of collection_mutation(s). `collection_type_impl::serialize_mutation_form` became `collection_mutation(_view)_description::serialize`. Previously callers had to cast their data_type down to collection_type to use serialize_mutation_form. Now it's done inside `serialize`. In the future `serialize` will be generalized to handle UDTs. `collection_type_impl::deserialize_mutation_form` became a free standing function `deserialize_collection_mutation` with similar benefits. Actually, no one needs to call this function manually because of the next paragraph. A common pattern consisting of linearizing data inside a `collection_mutation_view` followed by calling `deserialize_mutation_form` has been abstracted out as a `with_deserialized` method inside collection_mutation_view. serialize_mutation_form_only_live was removed, because it hadn't been used anywhere.	2019-10-25 10:42:58 +02:00
Kamil Braun	e4101679e4	collection_mutation: generalize constructor of collection_mutation to abstract_type. The constructor doesn't use anything specific to collection_type_impl. In the future it will also handle non-frozen user types.	2019-10-25 10:42:58 +02:00
Kamil Braun	b1d16c1601	types: move collection_type_impl::mutation(_view) out of collection_type_impl. collection_type_impl::mutation became collection_mutation_description. collection_type_impl::mutation_view became collection_mutation_view_description. These classes now reside inside collection_mutation.hh. Additional documentation has been written for these classes. Related function implementations were moved to collection_mutation.cc. This makes it easier to generalize these classes to non-frozen UDTs in future commits. The new names (together with documentation) better describe their purpose.	2019-10-25 10:19:45 +02:00
Kamil Braun	c0d3e6c773	atomic_cell: move collection_mutation(_view) to a new file. The classes 'collection_mutation' and 'collection_mutation_view' were moved to a separate header, collection_mutation.hh. Implementations of functions that operate on these classes, including some methods of collection_type_impl, were moved to a separate compilation unit, collection_mutation.cc. This makes it easier to modify these structures in future commits in order to generalize them for non-frozen User Defined Types. Some additional documentation has been written for collection_mutation.	2019-10-25 10:19:45 +02:00
Kamil Braun	c90ea1056b	Remove mutation_partition_applier. It had been replaced by partition_builder in commit `dc290f0af7`.	2019-10-25 10:19:45 +02:00
Nadav Har'El	8bffb800e1	alternator: Use system_auth.roles for alternator authorization Merged patch series from Piotr Sarna: This series couples system_auth.roles with authorization routines in alternator. The `salted_hash` field, which is every user's hashed password, is used as a secret key for the signature generation in alternator. This series also adds related expiration verifications for alternator signatures. It also comes with more test cases and docs updates. Tests: alternator(local, remote), manual Piotr Sarna (11): alternator: add extracting key from system_auth.roles alternator: futurize verify_signature function alternator: move the api handler to a separate function alternator: use keys from system_auth.roles for authorization alternator: add key cache to authorization alternator-test: add a wrong password test alternator: verify that the signature has not expired alternator: add additional datestamp verification alternator-test: add tests for expired signatures docs: update alternator entry for authorization alternator-test: add authorization to README alternator-test/conftest.py \| 2 +- alternator-test/test_authorization.py \| 44 ++++++++- alternator-test/test_describe_endpoints.py \| 2 +- alternator/auth.hh \| 15 ++- alternator/server.hh \| 10 +- alternator/auth.cc \| 62 +++++++++++- alternator/server.cc \| 106 ++++++++++++--------- alternator-test/README.md \| 28 ++++++ docs/alternator/alternator.md \| 7 +- 9 files changed, 221 insertions(+), 55 deletions(-)	2019-10-23 20:51:08 +03:00
Tomasz Grabiec	e621db591e	Merge "Fix TTL serialization breakage" from Avi Commit `93270dd` changed gc_clock to be 64-bit, to fix the Y2038 problem. While 64-bit tombstone::deletion_time is serialized in a compatible way, TTLs (gc_clock::duration) were not. This patchset reverts TTL serialization to the 32-bit serialization format, and also allows opting-in to the 64-bit format in case a cluster was installed with the broken code. Only Scylla 3.1.0 is vulnerable. Fixes #4855 Tests: unit (dev)	2019-10-23 18:23:26 +02:00
Tomasz Grabiec	71720be4f7	Merge "storage_service: Reject nodetool cleanup when there is pending ranges" from Asias From Shlomi: 4 node cluster Node A, B, C, D (Node A: seed) cassandra-stress write n=10000000 -pop seq=1..10000000 -node <seed-node> cassandra-stress read duration=10h -pop seq=1..10000000 -node <seed-node> while read is progressing Node D: nodetool decommission Node A: nodetool status node - wait for UL Node A: nodetool cleanup (while decommission progresses) I get the error on c-s once decommission ends java.io.IOException: Operation x0 on key(s) [383633374d31504b5030]: Data returned was not validated The problem is when a node gets new ranges, e.g, the bootstrapping node, the existing nodes after a node is removed or decommissioned, nodetool cleanup will remove data within the new ranges which the node just gets from other nodes. To fix, we should reject the nodetool cleanup when there is pending ranges on that node. Note, rejecting nodetool cleanup is not a full protection because new ranges can be assigned to the node while cleanup is still in progress. However, it is a good start to reject until we have full protection solution. Refs: #5045	2019-10-23 17:45:41 +02:00
Avi Kivity	2970578677	config: add configuration option for 3.1.0 heritage clusters Scylla 3.1.0 broke the serialization format for TTLs. Later versions corrected it, but if a cluster was originally installed as 3.1.0, it will use the broken serialization forever. This configuration option allows upgrades from 3.1.0 to succeed, by enabling the broken format even for later versions.	2019-10-23 18:36:35 +03:00
Avi Kivity	bf4c319399	gc_clock, serialization: define new serialization for gc_clock::duration (aka TTLs) Scylla 3.1.0 inadvertently changed the serialization format of TTLs (internally represented as gc_clock::duration) from 32-bit to 64-bit, as part of preparation for Y2038 (which comes earlier for TTLed cells). This breaks mutations transported in a mixed cluster. To fix this, we revert back to the 32-bit format, unless we're in a 3.1.0- heritage cluster, in which case we use the 64-bit format. Overflow of a TTL is not a concern, since TTLs are capped to 20 years by the TTL layer. An assertion is added to verify this. This patch only defines a variable to indicate we're in a 3.1.0 heritage cluster, but a way to set it is left to a later patch.	2019-10-23 18:36:33 +03:00
Avi Kivity	771e028c1a	Update seastar submodule * seastar 6bcb17c964...2963970f6b (4): > Merge "IPv6 scope support and network interface impl" from Calle > noncopyable_function: do not copy uninitialized data > Merge "Move smp and smp queue out of reactor" from Asias > Consolidate posix socket implementations	2019-10-23 16:43:02 +03:00
Piotr Sarna	472e3cb4e1	alternator-test: add authorization to README The README paragraph informs about turning on authorization with: alternator-enforce-authorization: true and has a short note on how to set up the secret key for tests.	2019-10-23 15:05:39 +02:00
Piotr Sarna	280eb28324	docs: update alternator entry for authorization The document now mentions that secret keys are extracted from the system_auth.roles table.	2019-10-23 15:05:39 +02:00
Piotr Sarna	ebb0af3500	alternator-test: add tests for expired signatures The first test case ensures that expired signatures are not accepted, while the second one checks that signatures with dates that reach out too far into the future are also refused.	2019-10-23 15:05:39 +02:00
Piotr Sarna	a0a33ae4f3	alternator: add additional datestamp verification The authorization signature contains both a full obligatory date header and a shortened datestamp - an additional verification step ensures that the shortened stamp matches the full date.	2019-10-23 15:05:39 +02:00
Piotr Sarna	718cba10a1	alternator: verify that the signature has not expired AWS signatures have a 15min expiration policy. For compatibility, the same policy is applied for alternator requests. The policy also ensures that signatures expanding more than 15 minutes into the future are treated as unsafe and thus not accepted.	2019-10-23 15:05:39 +02:00
Piotr Sarna	e90c4a8130	alternator-test: add a wrong password test The additional test case submits a request as a user that is expected to exist (in the local setup), but the provided password is incorrect. It also updates test_wrong_key_access so it uses an empty string for trying to authenticate as a nonexistent user - in order to cover more corner cases.	2019-10-23 15:05:39 +02:00
Piotr Sarna	524b03dea5	alternator: add key cache to authorization In order to avoid fetching keys from system_auth.roles system table on every request, a cache layer is introduced. And in order not to reinvent the wheel, the existing implementation of loading_cache with max size 1024 and a 1 minute timeout is used.	2019-10-23 15:05:39 +02:00
Piotr Sarna	6dee7737d7	alternator: use keys from system_auth.roles for authorization Instead of having a hardcoded secret key, the server now verifies an actual key extracted from system_auth.roles system table. This commit comes with a test update - instead of 'whatever':'whatever', the credentials used for a local run are 'alternator':'secret_pass', which matches the initial contents of system_auth.roles table, which acts as a key store. Fixes #5046	2019-10-23 15:05:39 +02:00
Piotr Sarna	388b492040	alternator: move the api handler to a separate function The lambda used for handling the api request has grown a little bit too large, so it's moved to a separate method. Along with it, the callbacks are now remembered inside the class itself.	2019-10-23 15:05:39 +02:00
Piotr Sarna	a93cf12668	alternator: futurize verify_signature function The verify_signature utility will later be coupled with Scylla authorization. In order to prepare for that, it is first transformed into a function that returns future<>, and it also becomes a member of class server. The reason for it becoming a member function is that it will make it easier to implement a server-local key cache.	2019-10-23 15:05:39 +02:00
Piotr Sarna	dc310baa2d	alternator: add extracting key from system_auth.roles As a first step towards coupling alternator authorization with Scylla authorization, a helper function for extracting the key (salted_hash) belonging to the user is added.	2019-10-23 15:05:39 +02:00
Asias He	f876580740	storage_service: Reject nodetool cleanup when there is pending ranges From Shlomi: 4 node cluster Node A, B, C, D (Node A: seed) cassandra-stress write n=10000000 -pop seq=1..10000000 -node <seed-node> cassandra-stress read duration=10h -pop seq=1..10000000 -node <seed-node> while read is progressing Node D: nodetool decommission Node A: nodetool status node - wait for UL Node A: nodetool cleanup (while decommission progresses) I get the error on c-s once decommission ends java.io.IOException: Operation x0 on key(s) [383633374d31504b5030]: Data returned was not validated The problem is when a node gets new ranges, e.g, the bootstrapping node, the existing nodes after a node is removed or decommissioned, nodetool cleanup will remove data within the new ranges which the node just gets from other nodes. To fix, we should reject the nodetool cleanup when there is pending ranges on that node. Note, rejecting nodetool cleanup is not a full protection because new ranges can be assigned to the node while cleanup is still in progress. However, it is a good start to reject until we have full protection solution. Refs: #5045	2019-10-23 19:20:36 +08:00
Asias He	a39c8d0ed0	Revert "storage_service: remove storage_service::_is_bootstrap_mode." It will be needed by "storage_service: Reject nodetool cleanup when there is pending ranges" This reverts commit `dbca327b46`.	2019-10-23 19:20:36 +08:00
Raphael S. Carvalho	fc120a840d	compaction: dont rely on undefined behavior when making garbage collected writer Argument evaluation order is UB, so it's not guaranteed that c->make_garbage_collected_sstable_writer() is called before compaction is moved to run(). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20191023052647.9066-1-raphaelsc@scylladb.com>	2019-10-23 11:04:51 +03:00
Tomasz Grabiec	dfac542466	Merge "extend multi-cell list & set type support" from Kostja Make it possible to compare multi-cell lists and sets serialized as maps with literal values and serialize them to network using a standard format (vector of values). This is a pre-requisite patch for column condition evaluation in light-weight transactions.	2019-10-23 07:39:57 +03:00
Nadav Har'El	774f8aa4b8	docs/debugging.md: add guide on how to debug cores Merged patch series from Botond Dénes: This series extends the existing docs/debugging.md with a detailed guide on how to debug Scylla coredumps. The intended target audience is developers who are debugging their first core, hence the level of details (hopefully enough). That said this should be just as useful for seasoned debuggers just quickly looking up some snippet they can't remember exactly. A Throubleshooting chapter is also added in this series for commonly-met problems. I decided to create this guide after myself having struggled for more than a day on just opening(!) a coredump that was produced on Ubuntu. As my main source, I used the How-to-debug-a-coredump page from the internal wiki which contains much useful information on debugging coredumps, however I found it to be missing some crucial information, as well as being very terse, thus being primarily useful for experienced debuggers who can fill in the blanks. The reason I'm not extending said wiki page is that I think this information should not be hidden in some internal wiki page. Also, docs/debugging.md now seems to be a much better base for such a document. This document was started as a comprehensive debugging manual for beginners (but not just). You will notice that the information on how to debug cores from CentOS/Redhat are quite sparse. This is because I have no experience with such cores, so for now the respective chapters are just stubs. I intend to complete them in the future after having gained the necessary experience and knowledge, however those being in possession of said knowledge are more than welcome to send a patch. :) Botond Dénes (4): docs/debugging.md: demote 'Starting GDB' and 'Using GDB' docs/debugging.md: fix formatting issues docs/debugging.md: add 'Debugging coredumps' subchapter docs/debugging.md: add 'Throubleshooting' subchapter docs/debugging.md \| 240 +++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 228 insertions(+), 12 deletions(-)	2019-10-23 07:39:57 +03:00
Rafael Ávila de Espíndola	b3372be679	install-dependencies: Add Lua Add lua as a dependency in preparation for UDF. This is the first patch since it has to go in before to allow for a frozen toolchain update. Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> [avi: update frozen toolchain image] Message-Id: <20191018231442.11864-2-espindola@scylladb.com>	2019-10-23 07:39:57 +03:00
Konstantin Osipov	a30c08e04e	lwt: support for multi-cell set & list value serialization	2019-10-22 17:40:42 +03:00
Piotr Jastrzebski	eb8ae06ced	cdc: Return db_context::builder by reference from its with_* functions. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2019-10-22 17:13:43 +03:00
Konstantin Osipov	605755e3f6	lwt: support for multi-cell map & list comparison with literal values Multi-cell lists and maps may be stored in different formats: as sorted vectors of pairs of values, when retrieved from storage, or as sorted vectors of values, when created from parser literals or supplied as parameter values. Implement a specialized compare for use when receiver and parameter representation don't match. Add helpers.	2019-10-22 17:07:33 +03:00
Raphael S. Carvalho	3b6583990d	sstables: Fix sluggish backlog controller with incremental compaction The problem is that backlog tracker is not being updated properly after incremental compaction. When replacing sstables earlier, we tell backlog tracker that we're done with exhausted sstables[1], but we don't tell it about the new, sealed sstables created that will replace the exhausted ones. [1]: exhausted sstable is one that can be replaced earlier by compaction. We need to notify backlog tracker about every sstable replacement which was triggered by incremental compaction. Otherwise, backlog for a table that enables incremental compaction will be lower than it actually should be. That's because new sstables being tracked as partial decrease the backlog, whereas the exhausted ones increase it. The formula for a table's backlog is basically: backlog(sstable set + compacting(1) - partial(2)) (1) compacting includes all compaction's input sstables, but the exhausted ones are removed from it (correct behavior). (2) partial includes all compaction's output sstables, but the ones that replaced the exhausted sstables aren't removed from it (incorrect behavior). This problem is fixed by making the backlog tracker fully aware of the early replacement, not only the exhausted sstables, but also the new sstables that replaced the exhausted ones. The new sstables need to be moved inside the tracker from partial state to the regular one. Fixes #5157. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20191016002838.23811-1-raphaelsc@scylladb.com>	2019-10-22 16:19:57 +03:00
Vladimir Davydov	6c6689f779	cql: refactor statement accounting Rather than passing a pointer to a cql_stats member corresponding to the statement type, pass a reference to a cql_stats object and use statement_type, which is already stored in modification_statement, for determining which counter to increment. This will allow us to account conditional statements, which will have a separate set of counters, right in modification_statement::execute() - all we'll need to do is add the new counters and bump them in case execute_with_condition is called. While we are at it, remove extra inclusions from statement_type.hh so as not to introduce any extra dependencies for cql_stats.hh users. Message-Id: <20191022092258.GC21588@esperanza>	2019-10-22 12:39:14 +03:00
Nadav Har'El	51fc6c7a8e	make static_row optional to reduce memory footprint Merged patch series from Avi Kivity: The static row can be rare: many tables don't have them, and tables that do will often have mutations without them (if the static row is rarely updated, it may be present in the cache and in readers, but absent in memtable mutations). However, it always consumes ~100 bytes of memory, even if it is not present, due to row's overhead. Change it to be optional by allocating it as an external object rather than inlined into mutation_partition. This adds overhead when the static row is present (17 bytes for the reference, back reference, and lsa allocator overhead). perf_simple_query appears to be marginally (2%) faster. Footprint is reduced by ~9% for a cache entry, 12% in memtables. More details are provided in the patch commitlog. Tests: unit (debug) Avi Kivity (4): managed_ref: add get() accessor managed_ref: add external_memory_usage() mutation_partition: introduce lazy_row mutation_partition: make static_row optional to reduce memory footprint cell_locking.hh \| 2 +- converting_mutation_partition_applier.hh \| 4 +- mutation_partition.hh \| 284 ++++++++++++++++++++++- partition_builder.hh \| 4 +- utils/managed_ref.hh \| 12 + flat_mutation_reader.cc \| 2 +- memtable.cc \| 2 +- mutation_partition.cc \| 45 +++- mutation_partition_serializer.cc \| 2 +- partition_version.cc \| 4 +- tests/multishard_mutation_query_test.cc \| 2 +- tests/mutation_source_test.cc \| 2 +- tests/mutation_test.cc \| 12 +- tests/sstable_mutation_test.cc \| 10 +- 14 files changed, 355 insertions(+), 32 deletions(-)	2019-10-22 12:25:15 +03:00
Avi Kivity	bc03b0fd47	Merge "Some refactoring of node startup code" from Kamil " The node startup code (in particular the functions storage_service::prepare_to_join and storage_service::join_token_ring) is complicated and hard to understand. This patch set aims to simplify it at least a bit by removing some dead code, moving code around so it's easier to understand and adding some comments that explain what the code does. I did it to help me prepare for implementing generation and gossiping of CDC streams. " * 'bootstrap-refactors' of https://github.com/kbr-/scylla: storage_service: more comments in join_token_ring db: remove system_keyspace::update_local_tokens db: improve documentation for update_tokens and get_saved_tokens in system_keyspace storage_service: remove storage_service::_is_bootstrap_mode. storage_service: simplify storage_service::bootstrap method storage_service: fix typo in handle_state_moving storage_service: remove unnecessary use of stringstream storage_service: remove redundant call to update_tokens during join_token_ring storage_service: remove storage_service::set_tokens method. storage_service: remove is_survey_mode storage_service::handle_state_normal: tokens_to_update* -> owned_tokens storage_service::handle_state_normal: remove local_tokens_to_remove db::system_keyspace::update_tokens: take tokens by const ref db::system_keyspace::prepare_tokens: make static, take tokens by const ref token_metadata::update_normal_tokens: take tokens by const ref	2019-10-22 12:11:11 +03:00
Asias He	0a52ecb6df	gossip: Fix max generation drift measure Assume n1 and n2 in a cluster with generation number g1, g2. The cluster runs for more than 1 year (MAX_GENERATION_DIFFERENCE). When n1 reboots with generation g1' which is time based, n2 will see g1' > g2 + MAX_GENERATION_DIFFERENCE and reject n1's gossip update. To fix, check the generation drift with generation value this node would get if this node were restarted. This is a backport of CASSANDRA-10969. Fixes #5164	2019-10-21 20:20:55 +02:00
Kamil Braun	f1c26bf5c9	storage_service: more comments in join_token_ring Explain why a call to update_normal_tokens is needed.	2019-10-21 11:11:03 +02:00
Kamil Braun	fb1e35f032	db: remove system_keyspace::update_local_tokens That was dead code.	2019-10-21 11:11:03 +02:00
Kamil Braun	1b0c8e5d99	db: improve documentation for update_tokens and get_saved_tokens in system_keyspace	2019-10-21 11:11:03 +02:00
Kamil Braun	dbca327b46	storage_service: remove storage_service::_is_bootstrap_mode. The flag did nothing. It was used in one place to check if there's a bug, but it can easily be proven by reading the code that the check would never pass.	2019-10-21 11:11:03 +02:00
Kamil Braun	b757a19f84	storage_service: simplify storage_service::bootstrap method The storage_service::bootstrap method took a parameter: tokens to bootstrap with. However, this method is only called in one place (join_token_ring) with only one parameter: _bootstrap_tokens. It doesn't make sense to call this method anywhere else with any other parameter. This commit also adds a comment explaining what the method does and moves it into the private section of storage_service.	2019-10-21 11:11:03 +02:00
Kamil Braun	84b41bd89b	storage_service: fix typo in handle_state_moving	2019-10-21 11:11:03 +02:00
Kamil Braun	2ff4f9b8f4	storage_service: remove unnecessary use of stringstream	2019-10-21 11:11:03 +02:00
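
The signature date checks described in commits `718cba10a1` (15-minute expiration, applied in both directions) and `a0a33ae4f3` (shortened datestamp must match the full date) can be sketched roughly as below. This is only an illustrative Python sketch, not the alternator implementation (which is C++): the function names, the `ValueError` convention, and the assumption that dates use the AWS Signature Version 4 layout (`YYYYMMDDTHHMMSSZ` for the full date, `YYYYMMDD` for the datestamp) are ours.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the same 15-minute window is applied to the past and the future,
# as the commit message for 718cba10a1 describes.
MAX_SKEW = timedelta(minutes=15)


def parse_full_date(value: str) -> datetime:
    """Parse a full timestamp such as 20191023T130539Z as a UTC datetime."""
    parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def check_signature_dates(full_date: str, datestamp: str, now: datetime) -> None:
    """Raise ValueError if the dates disagree or fall outside the window.

    Hypothetical helper: `full_date` stands for the obligatory date header,
    `datestamp` for the shortened date embedded in the signature.
    """
    # The shortened datestamp must be the date part of the full date.
    if len(datestamp) != 8 or not full_date.startswith(datestamp):
        raise ValueError("datestamp does not match the full date")

    signed_at = parse_full_date(full_date)
    if now - signed_at > MAX_SKEW:
        raise ValueError("signature expired")
    if signed_at - now > MAX_SKEW:
        raise ValueError("signature is dated too far in the future")


if __name__ == "__main__":
    now = datetime(2019, 10, 23, 13, 0, 0, tzinfo=timezone.utc)
    check_signature_dates("20191023T125000Z", "20191023", now)  # accepted
    try:
        check_signature_dates("20191023T124000Z", "20191023", now)
    except ValueError as err:
        print("rejected:", err)
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to exercise in tests like the ones added in `ebb0af3500`.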

19914 Commits