scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 12:06:44 +00:00

Author	SHA1	Message	Date
Asias He	b89ced4635	streaming: Do not open rpc stream connection if reader has no data We can use the reader::peek() to check if the reader contains any data. If not, do not open the rpc stream connection. It helps to reduce the port usage. Refs: #4943	2019-10-08 10:31:02 +02:00
Konstantin Osipov	94006d77b1	lwt: add cas_contention_timeout_in_ms to config Make the default conform to the origin. Message-Id: <20191006154532.54856-3-kostja@scylladb.com>	2019-10-08 00:02:35 +02:00
Konstantin Osipov	383e17162a	lwt: implement query_options::check_serial_consistency() Both in a single-statement transaction and in a batch we expect that serial consistency is provided. Move the check to query_options class and make it available for reuse. Keep get_serial_consistency() around for use in transport/server.cc. Message-Id: <20191006154532.54856-2-kostja@scylladb.com>	2019-10-08 00:02:35 +02:00
Piotr Sarna	36a1905e98	storage_proxy: handle unstarted write cancelling When another node is reported to be down, view updates queued for it are cancelled, but some of them may already be initiated. Right now, cancelling such a write resulted in an exception, but on conceptual level it's not really an exception, since this behaviour is expected. Previous version of this patch was based on introducing a special exception type that was later handled specially, but it's not clear if it's a good direction. Instead, this patch simply makes this path non-exceptional, as was originally done by Nadav in the first version of the series that introduced handling unstarted write cancellations. Additionally, a message containing the information that a write is cancelled is logged with debug level.	2019-10-07 16:55:36 +03:00
Vladimir Davydov	e8bcb34ed4	api: drop /storage_proxy/metrics/cas_read/condition_not_met There's no such metric in Cassandra (although Cassandra's docs mistakenly say it exists). Having it would make no sense anyway so let's drop it. Message-Id: <b4f7a6ad278235c443cb8ea740bfa6399f8e4ee1.1570434332.git.vdavydov@scylladb.com>	2019-10-07 16:54:39 +03:00
Piotr Sarna	5ab134abef	alternator-test: update HTTPS section of README README.md has 3 fixes applied: - s/alternator_tls_port/alternator_https_port - conf directory is mentioned more explicitly - it now correctly states that the self-signed certificate warning is explicitly ignored in tests Message-Id: <e5767f7dbea260852fc2fa9b613e1bebf490cc78.1570444085.git.sarna@scylladb.com>	2019-10-07 14:51:16 +03:00
Avi Kivity	8ed6f94a16	Merge "Fix handling of schema alters and eviction in cache" from Tomasz " Fixes #5134, Eviction concurrent with preempted partition entry update after memtable flush may allow stale data to be populated into cache. Fixes #5135, Cache reads may miss some writes if schema alter followed by a read happened concurrently with preempted partition entry update. Fixes #5127, Cache populating read concurrent with schema alter may use the wrong schema version to interpret sstable data. Fixes #5128, Reads of multi-row partitions concurrent with memtable flush may fail or cause a node crash after schema alter. " * tag 'fix-cache-issues-with-schema-alter-and-eviction-v2' of github.com:tgrabiec/scylla: tests: row_cache: Introduce test_alter_then_preempted_update_then_memtable_read tests: row_cache_stress_test: Verify all entries are evictable at the end tests: row_cache_stress_test: Exercise single-partition reads tests: row_cache_stress_test: Add periodic schema alters tests: memtable_snapshot_source: Allow changing the schema tests: simple_schema: Prepare for schema altering row_cache: Record upgraded schema in memtable entries during update memtable: Extract memtable_entry::upgrade_schema() row_cache, mvcc: Prevent locked snapshots from being evicted row_cache: Make evict() not use invalidate_unwrapped() mvcc: Introduce partition_snapshot::touch() row_cache, mvcc: Do not upgrade schema of entries which are being updated row_cache: Use the correct schema version to populate the partition entry delegating_reader: Optimize fill_buffer() row_cache, memtable: Use upgrade_schema() flat_mutation_reader: Introduce upgrade_schema()	2019-10-07 14:43:36 +03:00
Nadav Har'El	f2f0f5eb0f	alternator: add https support Merged patch series from Piotr Sarna: This series adds HTTPS support for Alternator. The series comes with --https option added to alternator-test, which makes the test harness run all the tests with HTTPS instead of HTTP. All the tests pass, albeit with security warnings that a self-signed x509 certificate was used and it should not be trusted. Fixes #5042 Refs scylladb/seastar#685 Patches: docs: update alternator entry on HTTPS alternator-test: suppress the "Unverified HTTPS request" warning alternator-test: add HTTPS info to README.md alternator-test: add HTTPS to test_describe_endpoints alternator-test: add --https parameter alternator: add HTTPS support config: add alternator HTTPS port	2019-10-07 12:38:20 +03:00
Avi Kivity	969113f0c9	Update seastar submodule * seastar c21a7557f9...1f68be436f (6): > scheduling: Add per scheduling group data support > build: Include dpdk as a single object in libseastar.a > sharded: fix foreign_ptr's move assignment > build: Fix DPDK libraries linking in pkg-config file > http server: https using tls support > Make output_stream blurb Doxygen	2019-10-07 12:18:49 +03:00
Nadav Har'El	754add1688	alternator: fix Expected's BEGINS_WITH error handling The BEGINS_WITH condition in conditional updates (via Expected) requires that the given operand be either a string or a binary. Any other operand should result in a validation exception - not a failed condition as we generate now. This patch fixes the test for this case so it will succeed against Amazon DynamoDB (before this patch it fails - this failure was masked by a typo before commit `332ffa77ea`). The patch then fixes our code to handle this case correctly. Note that BEGINS_WITH handling of wrong types is now asymmetrical: A bad type in the operand is now handled differently from a bad type in the attribute's value. We add another check to the test to verify that this is the case. Fixes #5141 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20191006080553.4135-1-nyh@scylladb.com>	2019-10-06 17:16:55 +03:00
Tomasz Grabiec	020a537ade	tests: row_cache: Introduce test_alter_then_preempted_update_then_memtable_read	2019-10-04 11:38:13 +02:00
Tomasz Grabiec	ebedefac29	tests: row_cache_stress_test: Verify all entries are evictable at the end	2019-10-04 11:38:12 +02:00
Tomasz Grabiec	1b95f5bf60	tests: row_cache_stress_test: Exercise single-partition reads make_single_key_reader() currently doesn't actually create single-partition readers because it doesn't set mutation_reader::forwarding::no when it creates individual readers. The readers will default to mutation_reader::forwarding::yes and actually create scanning readers in preparation for fast-forwarding across partitions. Fix by passing mutation_reader::forwarding::no.	2019-10-04 11:38:12 +02:00
Tomasz Grabiec	81dd17da4e	tests: row_cache_stress_test: Add periodic schema alters Reproduces #5127.	2019-10-03 22:03:29 +02:00
Tomasz Grabiec	2fc144e1a8	tests: memtable_snapshot_source: Allow changing the schema	2019-10-03 22:03:29 +02:00
Tomasz Grabiec	22dde90dba	tests: simple_schema: Prepare for schema altering Currently, methods of simple_schema assume that table's schema doesn't change. Accessors like get_value() assume that rows were generated using simple_schema::_s. Because of that, the column_definition& for the "v" column is cached in the instance. That column_definition& cannot be used to access objects created with a different schema version. To allow using simple_schema after schema changes, column_definition& caching is now tagged with the table schema version of origin. Methods which access schema-dependent objects, like get_value(), are now accepting schema& corresponding to the objects. Also, it's now possible to tell simple_schema to use a different schema version in its generator methods.	2019-10-03 22:03:29 +02:00
Tomasz Grabiec	e6afc89735	row_cache: Record upgraded schema in memtable entries during update Cache update may defer in the middle of moving of partition entry from a flushed memtable to the cache. If the schema was changed since the entry was written, it upgrades the schema of the partition_entry first but doesn't update the schema_ptr in memtable_entry. The entry is removed from the memtable afterward. If a memtable reader encounters such an entry, it will try to upgrade it assuming it's still at the old schema. That is undefined behavior in general, which may include: - read failures due to bad_alloc, if fixed-size cells are interpreted as variable-sized cells, and we misinterpret a value for a huge size - wrong read results - node crash This doesn't result in a permanent corruption, restarting the node should help. It's the more likely to happen the more rows there are in a partition. It's unlikely to happen with single-row partitions. Introduced in `70c7277`. Fixes #5128.	2019-10-03 22:03:29 +02:00
Tomasz Grabiec	ea461a3884	memtable: Extract memtable_entry::upgrade_schema()	2019-10-03 22:03:29 +02:00
Tomasz Grabiec	90d6c0b9a2	row_cache, mvcc: Prevent locked snapshots from being evicted If the whole partition entry is evicted while being updated from the memtable, a subsequent read may populate the partition using the old version of data if it attempts to do it before cache update advances past that partition. Partial eviction is not affected because populating reads will notice that there is a newer snapshot corresponding to the updater. This can happen only in OOM situations where the whole cache gets evicted. Affects only tables with multi-row partitions, which are the only ones that can experience the update of partition entry being preempted. Introduced in `70c7277`. Fixes #5134.	2019-10-03 22:03:29 +02:00
Tomasz Grabiec	57a93513bd	row_cache: Make evict() not use invalidate_unwrapped() invalidate_unwrapped() calls cache_entry::evict(), which cannot be called concurrently with cache update. invalidate() serializes it properly by calling do_update(), but evict() doesn't. The purpose of evict() is to stress eviction in tests, which can happen concurrently with cache update. Switch it to use memory reclaimer, so that it's both correct and more realistic. evict() is used only in tests.	2019-10-03 22:03:28 +02:00
Tomasz Grabiec	c88a4e8f47	mvcc: Introduce partition_snapshot::touch()	2019-10-03 22:03:28 +02:00
Tomasz Grabiec	25e2f87a37	row_cache, mvcc: Do not upgrade schema of entries which are being updated When a read enters a partition entry in the cache, it first upgrades it to the current schema of the cache. The same happens when an entry is updated after a memtable flush. Upgrading the entry is currently performed by squashing all versions and replacing them with a single upgraded version. That has a side effect of detaching all snapshots from the partition entry. Partition entry update on memtable flush is writing into a snapshot. If that snapshot is detached by a schema upgrade, the entry will be missing writes from the memtable which fall into continuous ranges in that entry which have not yet been updated. This can happen only if the update of the entry is preempted and the schema was altered during that, and a read hit that partition before the update went past it. Affects only tables with multi-row partitions, which are the only ones that can experience the update of partition entry being preempted. The problem is fixed by locking updated entries and not upgrading schema of locked entries. cache_entry::read() is prepared for this, and will upgrade on-the-fly to the cache's schema. Fixes #5135	2019-10-03 22:03:28 +02:00
Tomasz Grabiec	0675088818	row_cache: Use the correct schema version to populate the partition entry The sstable reader which populates the partition entry in the cache is using the schema of the partition entry snapshot, which will be the schema of the cache at the time the partition was entered. If there was a schema change after the cache reader entered the partition but before it created the sstable reader, the cache populating reader will interpret sstable fragments using the wrong schema version. That is more likely if partitions have many rows, and the front of the partition is populated. With single-row partitions that's unlikely to happen. That is undefined behavior in general, which may include: - read failures due to bad_alloc, if fixed-size cells are interpreted as variable-sized cells, and we misinterpret a value for a huge size - wrong read results - node crash This doesn't result in a permanent corruption, restarting the node should help. Fixes #5127.	2019-10-03 22:03:28 +02:00
Tomasz Grabiec	10992a8846	delegating_reader: Optimize fill_buffer() Use move_buffer_content_to() which is faster than fill_buffer_from() because it doesn't involve popping and pushing the fragments across buffers. We save on size estimation costs.	2019-10-03 22:03:28 +02:00
Piotr Sarna	07ac3ea632	docs: update alternator entry on HTTPS The HTTPS entry is updated - it's now supported, but still misses the same features as HTTP - CRC headers, etc.	2019-10-03 19:10:30 +02:00
Piotr Sarna	b63077a8dc	alternator-test: suppress the "Unverified HTTPS request" warning Running with --https and a self-signed certificate results in a flood of expected warnings that the connection is not to be trusted. These warnings are silenced, as users running a local test with --https usually use self-signed certificates.	2019-10-03 19:10:30 +02:00
Piotr Sarna	e65fd490da	alternator-test: add HTTPS info to README.md A short paragraph about running tests with `--https` and configuring the cluster to work correctly with this parameter is added to README.md.	2019-10-03 19:10:30 +02:00
Piotr Sarna	0d28d7f528	alternator-test: add HTTPS to test_describe_endpoints The test_describe_endpoints test spawns another client connection to the cluster, so it needs to be HTTPS-aware in order to work properly with --https parameter.	2019-10-03 19:10:30 +02:00
Piotr Sarna	9fd77ed81d	alternator-test: add --https parameter Running with --https parameter will result in sending the requests via HTTPS instead of HTTP. By default, port 8043 is used for a local cluster. Before running pytest --https, make sure that Scylla was properly configured to initialize an HTTPS alternator server by providing the alternator_tls_port parameter. The HTTPS-based connection runs with verification disabled, otherwise it would not work with self-signed certificates, which are useful for tests.	2019-10-03 19:10:30 +02:00
Piotr Sarna	e1b0537149	alternator: add HTTPS support By providing a server based on a TLS socket, it's now possible to serve HTTPS requests in alternator. The HTTPS server is enabled by setting its port in scylla.yaml: alternator_tls_port=XXXX. Alternator TLS relies on the existing TLS configuration, which is provided by certificate, keyfile, truststore, priority_string options. Fixes #5042	2019-10-03 19:10:30 +02:00
Piotr Sarna	b42eb8b80a	config: add alternator HTTPS port The config variable will be used to set up a TLS-based server for serving alternator HTTPS requests.	2019-10-03 19:10:29 +02:00
Nadav Har'El	9d4e71bbc6	alternator-test: fix misleading xfail message The test test_update_expression_function_nesting() fails because DynamoDB doesn't allow an expression like list_append(list_append(:val1, :val2), :val3) but Alternator doesn't check for this (and supports this expression). The "xfail" message was outdated, suggesting that the test fails because the "SET" expression isn't supported - but it is. So replace the message by a more accurate one. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190915104708.30471-1-nyh@scylladb.com>	2019-10-03 18:45:03 +03:00
Nadav Har'El	9747019e7b	alternator: implement additional Expected operators Merged patch set from Dejan Mircevski implementing some of the missing operators for Expected: NE, IN, NULL and NOT_NULL. Patches: alternator: Factor out Expected operand checks alternator: Implement NOT_NULL operator in Expected alternator: Implement NULL operator in Expected alternator: Fix expected_1_null testcase alternator: Implement IN operator in Expected alternator: Implement NE operator in Expected alternator: Factor out common code in Expected	2019-10-03 18:12:38 +03:00
Konstantin Osipov	25ffd36d21	lwt: prepare the expression tree for IF condition evaluation Frozen empty lists/maps/sets are not equal to null value, while multi-cell empty lists/maps/sets are equal to null values. Return a NULL value for an empty multi-cell set or list if we know the receiver is not frozen - this makes it easy to compare the parameter with the receiver. Add a test case for inserting an empty list or set - the result is indistinguishable from NULL value. Message-Id: <20191003092157.92294-2-kostja@scylladb.com>	2019-10-03 14:56:25 +02:00
Avi Kivity	3cb081eb84	Merge " hinted handoff: fix races during shutdown and draining" from Vlad " Fix races that may lead to use-after-free events and file system level exceptions during shutdown and drain. The root cause of use-after-free events in question is that space_watchdog blocks on end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as it's accessed even if the corresponding end_point_hints_manager instance is destroyed in the context of manager::drain_for(). File system exceptions may occur when space_watchdog attempts to scan a directory while it's being deleted from the drain_for() context. In case of such an exception new hints generation is going to be blocked - including for materialized views, till the next space_watchdog round (in 1s). Issues that are fixed are #4685 and #4836. Tested as follows: 1) Patched the code in order to trigger the race with (a lot) higher probability and ran a slightly modified hinted handoff replace dtest with a debug binary 100 times. A side effect of this testing was the discovery of #4836. 2) Using the same patch as above tested that there are no crashes and nodes survive stop/start sequences (they did not without this series) in the context of all hinted handoff dtests. Ran the whole set of tests with dev binary 10 times. " * 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla: hinted handoff: fix a race on a directory removal between space_watchdog and drain_for() hinted handoff: make taking file_update_mutex safe db::hints::manager::drain_for(): fix alignment db::hints::manager: serialize calls to drain_for() db::hints: cosmetics: indentation and missing method qualifier	2019-10-03 14:38:00 +03:00
Tomasz Grabiec	aad1307b14	row_cache, memtable: Use upgrade_schema()	2019-10-03 13:28:33 +02:00
Tomasz Grabiec	3177732b35	flat_mutation_reader: Introduce upgrade_schema()	2019-10-03 13:28:33 +02:00
Asias He	a9b95f5f01	repair: Fix tracker::start and tracker::done in case of error The operation after gate.enter() in tracker::start() can fail and throw, we should call gate.leave() in such case to avoid unbalanced enter and leave calls. tracker::done() has similar issue too. Fix it by removing the gate enter and leave logic in tracker start and done. A helper tracker::run() is introduced to take care of the gate and repair status. In addition, the error log is improved. It now logs exceptions on all shards in the summary. e.g., [shard 0] repair - repair id 1 failed: std::runtime_error ({shard 0: std::runtime_error (error0), shard 1: std::runtime_error (error1)}) Fixes #5074	2019-10-03 13:33:02 +03:00
Botond Dénes	00b432b61d	querier_cache: correctly account entries evicted on insertion in the population Currently, the population stat is not increased for entries that are evicted immediately on insert, however the code that does the eviction still decreases the population stat, leading to an imbalance and in some cases the underflow of the population stat. To fix, unconditionally increase the population stat upon inserting an entry, regardless of whether it is immediately evicted or not. Fixes: #5123 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20191001153215.82997-1-bdenes@scylladb.com>	2019-10-03 11:49:44 +03:00
Dejan Mircevski	ac98385d04	alternator: Factor out Expected operand checks Put all AttributeValueList size verification under verify_operand_count(), rather than have some cases invoke verify_operand_count() while others verify it in check_*() functions. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 17:11:58 -04:00
Dejan Mircevski	de18b3240b	alternator: Implement NOT_NULL operator in Expected Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 16:23:59 -04:00
Dejan Mircevski	75960639a4	alternator: Implement NULL operator in Expected Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 16:19:14 -04:00
Dejan Mircevski	e4fd5f3ef0	alternator: Fix expected_1_null testcase Testcase "For NULL, AttributeValueList must be empty" accidentally used NOT_NULL instead of NULL. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 16:19:14 -04:00
Dejan Mircevski	b7ac510581	alternator: Implement IN operator in Expected Add check_IN() and a switch case that invokes it. Reactivate IN tests. Add a testcase for non-scalar attribute values. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 16:17:38 -04:00
Dejan Mircevski	56efa55a06	alternator: Implement NE operator in Expected Recognize "NE" as a new operator type, add check_NE() function, invoke it in verify_expected_one(), and reactivate NE tests. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 14:47:13 -04:00
Dejan Mircevski	af0462d127	alternator: Factor out common code in Expected Operand-count verification will be repeated a lot as more operators are implemented, so factor it out into verify_operand_count(). Also move `got` null checks to check_* functions, which reduces duplication at call sites. Signed-off-by: Dejan Mircevski <dejan@scylladb.com>	2019-10-02 14:36:57 -04:00
Konstantin Osipov	e8c13efb41	lwt: move mutation hashers to mutation.hh Prepare mutation hashers for reuse in CAS implementation. Message-Id: <20190930202409.40561-2-kostja@scylladb.com>	2019-10-01 19:49:31 +02:00
Konstantin Osipov	6cde985946	lwt: remove code that no longer serves as a reference Remove ifdef'ed Java code, since LWT implementation is based on the current state of the origin. Message-Id: <20190930201022.40240-2-kostja@scylladb.com>	2019-10-01 19:46:15 +02:00
Konstantin Osipov	4d214b624b	lwt: ensure enum_set::of is constexpr. This allows using it to initialize const static members. Message-Id: <20190930200530.40063-2-kostja@scylladb.com>	2019-10-01 19:45:56 +02:00
Tomasz Grabiec	3b9bf9d448	Merge "storage_proxy: replace variadic futures with structs" from Avi Seastar variadic futures are deprecated, so replace with structs to avoid nasty deprecation warnings.	2019-10-01 19:32:55 +02:00
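
The accounting fix described in commit 00b432b61d (querier_cache population stat) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual Scylla code: the class `QuerierCacheSketch`, its `max_entries` limit, and the FIFO eviction order are hypothetical and chosen purely to show why the counter must be incremented unconditionally on insert when eviction always decrements it.

```python
class QuerierCacheSketch:
    """Toy bounded cache with a population statistic (all names hypothetical)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = []
        self.population = 0  # stands in for the "population" stat in the commit message

    def _evict_one(self):
        # Eviction always decrements the stat, as the commit message describes.
        self.entries.pop(0)
        self.population -= 1

    def insert(self, entry):
        # The fix: count the entry unconditionally, even if it is evicted right away.
        # Skipping this increment for immediately evicted entries would let the
        # decrement in _evict_one() drive the counter below zero.
        self.entries.append(entry)
        self.population += 1
        while len(self.entries) > self.max_entries:
            self._evict_one()


cache = QuerierCacheSketch(max_entries=0)  # every insert is evicted immediately
for i in range(3):
    cache.insert(i)
print(cache.population)
```

With a limit of zero every inserted entry is evicted at once, and the counter stays balanced because each eviction is paired with an earlier increment.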

19767 Commits