There was a bug in `expr::search_and_replace`:
it didn't preserve the `order` field of `binary_operator`.
The `order` field is used to mark relations created
using the SCYLLA_CLUSTERING_BOUND.
It is a CQL feature used for internal queries inside Scylla.
It means that we should handle the restriction as a raw
clustering bound, not as an expression in the CQL language.
Losing the SCYLLA_CLUSTERING_BOUND marker could cause issues:
the database could end up selecting the wrong clustering ranges.
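A minimal, self-contained sketch of the bug class (illustrative types, not the actual Scylla `expr` code): a tree-rewriting helper that rebuilds a node must copy every field, including `order`, or the marker is silently dropped.
```
#include <string>
#include <utility>

// Illustrative stand-ins for the real expression types.
enum class comparison_order { cql, clustering };    // clustering == SCYLLA_CLUSTERING_BOUND
struct expression { std::string repr; };

struct binary_operator {
    expression lhs;
    std::string op;
    expression rhs;
    comparison_order order = comparison_order::cql; // defaults to a plain CQL relation
};

// A search_and_replace-style rebuild of a node from its transformed children.
binary_operator rebuild(const binary_operator& bo, expression new_lhs, expression new_rhs) {
    // A buggy rebuild would return {std::move(new_lhs), bo.op, std::move(new_rhs)}:
    // `order` would fall back to its default and the clustering-bound marker would be lost.
    return binary_operator{std::move(new_lhs), bo.op, std::move(new_rhs), bo.order};
}
```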
Fixes: #13055
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Closes #13056
(cherry picked from commit aa604bd935)
EOF is only guaranteed to be set if one tried to read past the end of the
file. So when checking for EOF, also try to read some more. This
should force the EOF flag into a correct value. We can then check that
the read yielded 0 bytes.
This should ensure that `validate_checksums()` will not falsely declare
the validation to have failed.
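A rough illustration of the idea with a plain `std::istream` (the actual code uses Seastar's streams; the names here are assumptions):
```
#include <istream>

// eof() only becomes reliable once a read has actually hit the end of the
// stream, so attempt one more read before trusting it: peek() tries to read a
// character without consuming it and sets eofbit when nothing is left, which
// is the "read yielded 0 bytes" check in stream form.
bool at_eof(std::istream& in) {
    if (in.eof()) {
        return true;
    }
    in.peek();
    return in.eof();
}
```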
Fixes: #11190
Closes #12696
(cherry picked from commit 693c22595a)
This commit makes the following changes to the docs landing page:
- Adds the ScyllaDB enterprise docs as one of three tiles.
- Modifies the three tiles to reflect the three flavors of ScyllaDB.
- Moves the "New to ScyllaDB? Start here!" under the page title.
- Renames "Our Products" to "Other Products" to list the products other
than ScyllaDB itself. In addition, the boxes are enlarged to large-4
to look better.
The major purpose of this commit is to expose the ScyllaDB
documentation.
docs: fix the link
(cherry picked from commit 27bb8c2302)
Closes #13086
This PR adds a note to the Alternator TTL section to specify in which Open Source and Enterprise versions the feature was promoted from experimental to non-experimental.
The challenge here is that OSS and Enterprise are (still) **documented together**, but they're **not in sync** in promoting the TTL feature: it's still experimental in 5.1 (released) but no longer experimental in 2022.2 (to be released soon).
We can take one of the following approaches:
a) Merge this PR with master and ask the 2022.2 users to refer to master.
b) Merge this PR with master and then backport to branch-5.1. If we choose this approach, it is necessary to backport https://github.com/scylladb/scylladb/pull/11997 beforehand to avoid conflicts.
I'd opt for a) because it makes more sense from the OSS perspective and helps us avoid mess and backporting.
Closes #12295
* github.com:scylladb/scylladb:
doc: fix the version in the comment on removing the note
doc: specify the versions where Alternator TTL is no longer experimental
(cherry picked from commit d5dee43be7)
It's known that reading large cells in reverse causes large allocations.
Source: https://github.com/scylladb/scylladb/issues/11642
The loading of position metadata is preliminary work for splitting large
partitions into fragments composing a run, so that such a run can later
be read efficiently using the position metadata.
The splitting is not turned on yet, anywhere. Therefore, we can
temporarily disable the loading, as a way to avoid regressions in
stable versions. Large allocations can cause stalls due to foreground
memory eviction kicking in.
The default values for position metadata say that the first and last
position include all clustering rows. They aren't used anywhere
other than by sstable_run to determine whether a run is disjoint at
the clustering level, and given that no splitting is done yet, it
does not really matter.
Unit tests relying on position metadata were adjusted to enable
the loading, such that they can still pass.
Fixes #11642.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #12979
(cherry picked from commit d73ffe7220)
Currently all consumed range tombstone changes are unconditionally
forwarded to the validator, even if they are shadowed by a higher-level
tombstone and/or purgeable. This can result in a situation where a range
tombstone change was seen by the validator but not passed to the
consumer. The validator expects the range tombstone change to be closed
by end-of-partition but the end fragment won't come as the tombstone was
dropped, resulting in a false-positive validation failure.
Fix by passing to the validator only the tombstones that are actually
passed to the consumer.
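A minimal sketch of the resulting control flow, using simplified stand-in types rather than the actual compaction code:
```
#include <utility>

struct range_tombstone_change {};       // stand-in for the real fragment type

// The validator must observe exactly the stream the consumer observes, so the
// validation hook runs only on the branch that actually forwards the change.
template <typename Validator, typename Consumer>
void forward_rtc(range_tombstone_change rtc, bool dropped,
                 Validator&& validate, Consumer&& consume) {
    if (dropped) {
        return;                         // shadowed/purgeable: neither side sees it
    }
    validate(rtc);                      // validate only what will be consumed
    consume(std::move(rtc));
}
```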
Fixes: #12575
Closes #12578
(cherry picked from commit e2c9cdb576)
Check the first fragment before dereferencing it: the fragment might be
empty, in which case move to the next one.
Found by running range scan tests with random schema and random data.
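A simplified sketch of the pattern, using standard containers rather than the actual fragmented-buffer types:
```
#include <deque>
#include <vector>

// Skip empty fragments before dereferencing the front one, instead of assuming
// the first fragment is non-empty.
char first_byte(std::deque<std::vector<char>>& fragments) {
    while (!fragments.empty() && fragments.front().empty()) {
        fragments.pop_front();          // empty fragment: move to the next one
    }
    return fragments.empty() ? '\0' : fragments.front().front();
}
```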
Fixes: #12821
Fixes: #12823
Fixes: #12708
Closes #12824
(cherry picked from commit ef548e654d)
After 5badf20c7a the applier fiber does not
stop after it gets an abort error from the state machine, which may trigger an
assertion because the previous batch is not applied. Fix it.
Fixes #12863
(cherry picked from commit 9bdef9158e)
We should never return a reference to a local variable,
so in this change a reference to a static variable is returned
instead. This should address the following warning from Clang 17:
```
/home/kefu/dev/scylladb/tools/schema_loader.cc:146:16: error: returning reference to local temporary object [-Werror,-Wreturn-stack-address]
return {};
^~
```
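A minimal sketch of the pattern (illustrative function, not the actual schema_loader code):
```
#include <string>

// Returning a brace-initialized temporary by const reference dangles; a
// function-local static of the same type lives for the program's lifetime,
// so returning a reference to it is safe.
const std::string& empty_name() {
    static const std::string empty{};
    return empty;
}
```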
Fixes #12875
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12876
(cherry picked from commit 6eab8720c4)
We currently configure only TimeoutStartSec, but that is probably not
enough to prevent the coredump timeout, since TimeoutStartSec is the maximum
waiting time for service startup, and there is another directive to
specify the maximum service running time (RuntimeMaxSec).
To fix the problem, we should specify RuntimeMaxSec and TimeoutSec (the
latter configures both TimeoutStartSec and TimeoutStopSec).
Fixes #5430
Closes #12757
(cherry picked from commit bf27fdeaa2)
Related https://github.com/scylladb/scylladb/issues/12658.
This fixes the bug in the upgrade guides for the released versions.
Closes #12679
* github.com:scylladb/scylladb:
doc: fix the service name in the upgrade guide for patch releases versions 2022
doc: fix the service name in the upgrade guide from 2021.1 to 2022.1
(cherry picked from commit 325246ab2a)
Both patches are important to fix inefficiencies when updating the backlog tracker, which can manifest as a reactor stall on a special event like a schema change.
No conflicts when backporting.
Regression since 1d9f53c881, which is present in branch 5.1 onwards.
Closes #12851
* github.com:scylladb/scylladb:
compaction: Fix inefficiency when updating LCS backlog tracker
table: Fix quadratic behavior when inserting sstables into tracker on schema change
The LCS backlog tracker uses the STCS tracker for L0. It turns out the LCS tracker
is calling the STCS tracker's replace_sstables() with empty arguments
even when *only* higher levels (> 0) had sstables replaced.
This unnecessary call to STCS tracker will cause it to recompute
the L0 backlog, yielding the same value as before.
As LCS has a fragment size of 0.16G on higher levels, we may be
updating the tracker multiple times during incremental compaction,
which operates on SSTables on higher levels.
The inefficiency is fixed by updating the STCS tracker only if an
L0 sstable is being added to or removed from the table.
This may be fixing a quadratic behavior during boot or refresh,
as new sstables are loaded one by one.
Higher levels have a substantially higher number of sstables;
therefore, updating the STCS tracker only when level 0 changes
significantly reduces the number of times the L0 backlog is recomputed.
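A hedged sketch of the check, with illustrative types rather than the real tracker API:
```
#include <vector>

struct sstable { int level = 0; };      // stand-in for the real sstable type

// Forward a replace to the L0 (STCS) tracker only if the replacement actually
// touches level 0; higher-level-only replacements would otherwise trigger a
// pointless recomputation of the L0 backlog.
bool touches_l0(const std::vector<sstable>& ssts) {
    for (const auto& s : ssts) {
        if (s.level == 0) {
            return true;
        }
    }
    return false;
}

void on_replace(const std::vector<sstable>& removed, const std::vector<sstable>& added) {
    if (touches_l0(removed) || touches_l0(added)) {
        // stcs_tracker.replace_sstables(removed, added);   // hypothetical call into the L0 tracker
    }
}
```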
Refs #12499.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #12676
(cherry picked from commit 1b2140e416)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Each time the backlog tracker is informed about a new or old sstable, it
will recompute the static part of the backlog, whose complexity is
proportional to the total number of sstables.
On schema change, we're calling backlog_tracker::replace_sstables()
for each existing sstable, therefore producing O(N^2) complexity.
Fixes #12499.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #12593
(cherry picked from commit 87ee547120)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The "cluster manager" used by the topology test suite uses a UNIX-domain
socket to communicate between the cluster manager and the individual tests.
The socket is currently located in the test directory but there is a
problem: in Linux the length of the path used as a UNIX-domain socket
address is limited to just a little over 100 bytes. In Jenkins runs, the
test directory names are very long, and we sometimes go over this length
limit, with the result that test.py fails to create this socket.
In this patch we simply put the socket in /tmp instead of the test
directory. We only need to make this change in one place - the cluster
manager, as it already passes the socket path to the individual tests
(using the "--manager-api" option).
Tested by cloning Scylla in a very long directory name.
A test like ./test.py --mode=dev test_concurrent_schema fails before
this patch, and passes with it.
Fixes #12622
Closes #12678
(cherry picked from commit 681a066923)
`ScyllaClusterManager` is used to run a sequence of test cases from
a single test file. Between two consecutive tests, if the previous test
left the cluster 'dirty', meaning the cluster cannot be reused, it would
free up space in the pool (using `steal`), stop the cluster, then get a
new cluster from the pool.
Between the `steal` and the `get`, a concurrent test run (with its own
instance of `ScyllaClusterManager`) would start, because there was free
space in the pool.
This resulted in undesirable behavior when we ran tests with
`--repeat X` for a large `X`: we would start with e.g. 4 concurrent
runs of a test file, because the pool size was 4. As soon as one of the
runs freed up space in the pool, we would start another concurrent run.
Soon we'd end up with 8 concurrent runs. Then 16 concurrent runs. And so
on. We would have a large number of concurrent runs, even though the
original 4 runs didn't finish yet. All of these concurrent runs would
compete waiting on the pool, and waiting for space in the pool would
take longer and longer (the duration is linear w.r.t. the number of
concurrent competing runs). Tests would then time out because they would
have to wait too long.
Fix that by using the new `replace_dirty` function introduced to the
pool. This function frees up space by returning a dirty cluster and then
immediately takes it away to be used for a new cluster. Thanks to this,
we will only have at most as many concurrent runs as the pool size. For
example with --repeat 8 and pool size 4, we would run 4 concurrent runs
and start the 5th run only when one of the original 4 runs finishes,
then the 6th run when a second run finishes and so on.
The fix is preceded by a refactor that replaces `steal` with `put(is_dirty=True)`
and a `destroy` function passed to the pool (now the pool is responsible
for stopping the cluster and releasing its IPs).
Fixes #11757
Closes #12549
* github.com:scylladb/scylladb:
test/pylib: scylla_cluster: ensure there's space in the cluster pool when running a sequence of tests
test/pylib: pool: introduce `replace_dirty`
test/pylib: pool: replace `steal` with `put(is_dirty=True)`
(cherry picked from commit 132af20057)
From reviews of https://github.com/scylladb/scylladb/pull/12569, avoid
using `async with` and access the `Pool` of clusters with
`get()`/`put()`.
Closes #12612
* github.com:scylladb/scylladb:
test.py: manual cluster handling for PythonSuite
test.py: stop cluster if PythonSuite fails to start
test.py: minor fix for failed PythonSuite test
(cherry picked from commit 5bc7f0732e)
If the after-test check fails (is_after_test_ok is False), discard the cluster and raise an exception so the context manager (pool) does not recycle it.
Ignore the exception re-raised by the context manager.
Fixes #12360
Closes #12569
* github.com:scylladb/scylladb:
test.py: handle broken clusters for Python suite
test.py: Pool discard method
(cherry picked from commit 54f174a1f4)
`ScyllaCluster.server_stop` had this piece of code:
```
server = self.running.pop(server_id)
if gracefully:
await server.stop_gracefully()
else:
await server.stop()
self.stopped[server_id] = server
```
We observed `stop_gracefully()` failing due to a server hanging during
shutdown. We then ended up in a state where neither `self.running` nor
`self.stopped` had this server. Later, when releasing the cluster and
its IPs, we would release that server's IP - but the server might have
still been running (all servers in `self.running` are killed before
releasing IPs, but this one wasn't in `self.running`).
Fix this by popping the server from `self.running` only after
`stop_gracefully`/`stop` finishes.
Make an analogous fix in `server_start`: put `server` into
`self.running` *before* we actually start it. If the start fails, the
server will be considered "running" even though it isn't necessarily,
but that is OK - if it isn't running, then trying to stop it later will
simply do nothing; if it is actually running, we will kill it (which we
should do) when cleaning up after the cluster; and we don't leak it.
Closes #12613
(cherry picked from commit a0ff33e777)
Don't use a range scan, which is very inefficient, to perform a query for checking CQL availability.
Improve logging when the wait for server startup times out. Provide details about the failure: whether we managed to obtain the Host ID of the server and whether we managed to establish a CQL connection.
Closes #12588
* github.com:scylladb/scylladb:
test/pylib: scylla_cluster: better logging for timeout on server startup
test/pylib: scylla_cluster: use less expensive query to check for CQL availability
(cherry picked from commit ccc2c6b5dd)
If an endpoint handler throws an exception, the details of the exception
are not returned to the client. Normally this is desirable so that
information is not leaked, but in this test framework we do want to
return the details to the client so it can log a useful error message.
Do it by wrapping every handler into a catch clause that returns
the exception message.
Also modify a bit how HTTPErrors are rendered so it's easier to discern
the actual body of the error from other details (such as the params used
to make the request etc.)
Before:
```
E test.pylib.rest_client.HTTPError: HTTP error 500: 500 Internal Server Error
E
E Server got itself in trouble, params None, json None, uri http+unix://api/cluster/before-test/test_stuff
```
After:
```
E test.pylib.rest_client.HTTPError: HTTP error 500, uri: http+unix://api/cluster/before-test/test_stuff, params: None, json: None, body:
E Failed to start server at host 127.155.129.1.
E Check the log files:
E /home/kbraun/dev/scylladb/testlog/test.py.dev.log
E /home/kbraun/dev/scylladb/testlog/dev/scylla-1.log
```
Closes #12563
(cherry picked from commit 2f84e820fd)
When we obtained a new cluster for a test case after the previous test
case left a dirty cluster, we would release the old cluster's used IP
addresses (`_before_test` function). However, we would not release the
last cluster's IP after the last test case. We would run out of IPs with
sufficiently many test files or `--repeat` runs. Fix this.
Also reorder the operations a bit: stop the cluster (and release its
IPs) before freeing up space in the cluster pool (i.e. call
`self.cluster.stop()` before `self.clusters.steal()`). This reduces
concurrency a bit - fewer Scyllas running at the same time, which is
good (the pool size gives a limit on the desired max number of
concurrently running clusters). Killing a cluster is quick so it won't
make a significant difference for the next guy waiting on the pool.
Closes #12564
(cherry picked from commit 3ed3966f13)
If a cluster fails to boot, it saves the exception in
`self.start_exception` variable; the exception will be rethrown when
a test tries to start using this cluster. As explained in `before_test`:
```
def before_test(self, name) -> None:
"""Check that the cluster is ready for a test. If
there was a start error, throw it here - the server is
running when it's added to the pool, which can't be attributed
to any specific test, throwing it here would stop a specific
test."""
```
It's arguable whether we should blame some random test for a failure
that it didn't cause, but nevertheless, there's a problem here: the
`start_exception` will be rethrown and the test will fail, but then the
cluster will be simply returned to the pool and the next test will
attempt to use it... and so on.
Prevent this by marking the cluster as dirty the first time we rethrow
the exception.
Closes #12560
(cherry picked from commit 147dd73996)
Commitlog O_DSYNC is intended to make Raft and schema writes durable
in the face of power loss. To make O_DSYNC performant, we preallocate
the commitlog segments, so that the commitlog writes only change file
data and not file metadata (which would require the filesystem to commit
its own log).
However, in tests, this causes each ScyllaDB instance to write 384MB
of commitlog segments. This overloads the disks and slows everything
down.
Fix this by disabling O_DSYNC (and therefore preallocation) during
the tests. They can't survive power loss, and run with
--unsafe-bypass-fsync anyway.
Closes #12542
(cherry picked from commit 9029b8dead)
There was a small chance that we called `timeout_src.request_abort()`
twice in the `with_timeout` function, first by timeout and then by
shutdown. `abort_source` fails on an assertion in this case. Fix this.
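One way to avoid calling `request_abort()` twice is to check whether an abort was already requested; a self-contained sketch with a simplified stand-in, not Seastar's real `abort_source`:
```
#include <cassert>

// Only one of the two completion paths (timeout vs. shutdown) may request the
// abort; checking abort_requested() first avoids the double-abort assertion.
struct abort_source_like {              // simplified stand-in, not Seastar's class
    bool aborted = false;
    bool abort_requested() const { return aborted; }
    void request_abort() {
        assert(!aborted);               // the real abort_source asserts on a second call
        aborted = true;
    }
};

void maybe_abort(abort_source_like& as) {
    if (!as.abort_requested()) {
        as.request_abort();
    }
}
```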
Fixes: #12512
Closes #12514
(cherry picked from commit 54170749b8)
Before this change, we returned the total memory managed by Seastar
in the "total" field in system.memory, but this value only reflects
the total memory managed by Seastar's allocator. If
`reserve_additional_memory` is set when starting app_template,
Seastar's memory subsystem just reserves a chunk of memory of the
specified size for the system, and takes the remaining memory. Since
f05d612da8, we set this value to 50MB for the wasmtime runtime. Hence
the test `TestRuntimeInfoTable.test_default_content` in dtest
fails. The test expects the size passed via the `--memory` option
to be identical to the value reported by system.memory's
"total" field.
After this change, the "total" field takes the reserved memory
for the wasm UDF into account. The "total" field should reflect the total
size of memory used by Scylla, no matter how we use a certain portion
of the allocated memory.
Fixes #12522
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12573
(cherry picked from commit 4a0134a097)
Currently reversed types match the default case (false), even though they
might be wrapping a tuple type. One user-visible effect of this is that
a schema which has a reversed<frozen<UDT>> clustering key component
will have this component incorrectly represented in the schema CQL dump:
the UDT will lose the frozen attribute. When attempting to recreate
this schema based on the dump, it will fail, as only frozen UDTs are
allowed in primary key components.
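An illustrative sketch of the predicate fix with a simplified type model (not the real abstract_type hierarchy): the check has to look through the reversed wrapper instead of falling into the default case.
```
// Simplified type model, not the real abstract_type hierarchy.
struct type_model {
    bool tuple_like = false;                 // true for tuple/UDT-style types
    const type_model* underlying = nullptr;  // set when this is reversed<underlying>

    bool is_tuple_like() const {
        if (underlying) {
            return underlying->is_tuple_like();  // look through the reversed<> wrapper
        }
        return tuple_like;                       // default case only for non-reversed types
    }
};
```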
Fixes: #12576
Closes #12579
(cherry picked from commit ebc100f74f)
Fixes #12601 (maybe?)
Sort the set of tables on ID. This should ensure we never
generate duplicates in a paged listing here. Can obviously miss things if they
are added between paged calls and end up with a "smaller" UUID/ARN, but that
is to be expected.
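A minimal sketch of the idea with illustrative types: sorting by ID gives the listing a stable order, so page boundaries never repeat entries.
```
#include <algorithm>
#include <string>
#include <vector>

struct table_ref { std::string id; std::string arn; };    // illustrative

// Sort the tables by ID so that "return everything after the last ID seen"
// yields each table at most once across pages.
void sort_for_paging(std::vector<table_ref>& tables) {
    std::sort(tables.begin(), tables.end(),
              [](const table_ref& a, const table_ref& b) { return a.id < b.id; });
}
```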
(cherry picked from commit da8adb4d26)
Since we're potentially searching the row_lock in parallel to acquiring
the read_lock on the partition, we're racing with row_locker::unlock
that may erase the _row_locks entry for the same clustering key, since
there is no lock to protect it up until the partition lock has been
acquired and the lock_partition future is resolved.
This change moves the code to search for or allocate the row lock
_after_ the partition lock has been acquired to make sure we're
synchronously starting the read/write lock function on it, without
yielding, to prevent this use-after-free.
This adds an allocation for copying the clustering key in advance
even if a row_lock entry already exists, which wasn't needed before.
It only slows us down (a bit) when there is contention and the lock
already existed when we want to go locking. In the fast path there
is no contention and then the code already had to create the lock
and copy the key. In any case, the penalty of copying the key once
is tiny compared to the rest of the work that view updates are doing.
This is required on top of 5007ded2c1 as
seen in https://github.com/scylladb/scylladb/issues/12632
which is closely related to #12168 but demonstrates a different race
causing use-after-free.
Fixes #12632
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 4b5e324ecb)
Before this change, we constructed an sstring from a comma expression,
which evaluates to the return value of `name.size()`, but what we
expect is `sstring(const char*, size_t)`.
In this change:
* instead of passing only the size of the string_view,
both its address and size are used
* a `std::string_view` is constructed instead of an sstring, for better
performance, as we don't need to perform a deep copy
The issue is reported by GCC-13:
```
In file included from cql3/selection/selectable.cc:11:
cql3/selection/field_selector.hh:83:60: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
auto sname = sstring(reinterpret_cast<const char*>(name.begin(), name.size()));
^~~~~~~~~~
```
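The fixed construction, as described above, casts the pointer only and passes the size separately; a self-contained sketch with a stand-in byte-range type:
```
#include <cstddef>
#include <string_view>

// Stand-in for the byte range used in the original code (begin()/size()).
struct bytes_view_like {
    const std::byte* ptr = nullptr;
    std::size_t len = 0;
    const std::byte* begin() const { return ptr; }
    std::size_t size() const { return len; }
};

// Before: sstring(reinterpret_cast<const char*>(name.begin(), name.size()))
//         -- the comma operator makes the cast apply to name.size() alone.
// After: cast the pointer, pass the size as the second argument, no deep copy.
std::string_view to_view(bytes_view_like name) {
    return std::string_view(reinterpret_cast<const char*>(name.begin()), name.size());
}
```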
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #12666
(cherry picked from commit 186ceea009)
Fixes #12739.
Currently, segment file removal first calls `f.remove_file()` and
does `total_size_on_disk -= f.known_size()` later.
However, `remove_file()` resets `known_size` to 0, so in effect
the freed space is not accounted for.
`total_size_on_disk` is not just a metric. It is also responsible
for deciding whether a segment should be recycled -- it is recycled
only if `total_size_on_disk - known_size < max_disk_size`.
Therefore this bug has dire performance consequences:
if `total_size_on_disk - known_size` ever exceeds `max_disk_size`,
the recycling of commitlog segments will stop permanently, because
`total_size_on_disk - known_size` will never go back below
`max_disk_size` due to the accounting bug. All new segments from this
point will be allocated from scratch.
The bug was uncovered by a QA performance test. It isn't easy to trigger --
it took the test 7 hours of constant high load to step into it.
However, the fact that the effect is permanent, and degrades the
performance of the cluster silently, makes the bug potentially quite severe.
The bug can be easily spotted with Prometheus as infinitely rising
`commitlog_total_size_on_disk` on the affected shards.
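A hedged sketch of the corrected ordering (simplified types, synchronous removal for illustration): read `known_size()` before the removal resets it.
```
#include <cstdint>
#include <filesystem>

struct segment_file {                       // illustrative stand-in
    std::filesystem::path path;
    uint64_t known = 0;
    uint64_t known_size() const { return known; }
    void remove_file() {
        std::filesystem::remove(path);
        known = 0;                          // this reset is why the old order lost the size
    }
};

void delete_segment(segment_file& f, uint64_t& total_size_on_disk) {
    auto freed = f.known_size();            // capture the size *before* removing the file
    f.remove_file();
    total_size_on_disk -= freed;            // the freed space is now actually accounted for
}
```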
Fixes #12645
Closes #12646
(cherry picked from commit fa7e904cd6)
New clusters that use a fresh conf/scylla.yaml will have `consistent_cluster_management: true`, which will enable Raft, unless the user explicitly turns it off before booting the cluster.
People using existing yaml files will continue without Raft, unless consistent_cluster_management is explicitly requested during/after upgrade.
Also update the docs: cluster creation and node addition procedures.
Fixes #12572.
Closes #12585
* github.com:scylladb/scylladb:
docs: mention `consistent_cluster_management` for creating cluster and adding node procedures
conf: enable `consistent_cluster_management` by default
(cherry picked from commit 5c886e59de)
Make it so that failures in `removenode`/`decommission` don't lead to reduced availability, and any leftovers in group 0 can be removed by `removenode`:
- In `removenode`, make the node a non-voter before removing it from the token ring. This removes the possibility of having a group 0 voting member which doesn't correspond to a token ring member. We can still be left with a non-voter, but that doesn't reduce the availability of group 0.
- As above but for `decommission`.
- Make it possible to remove group 0 members that don't correspond to token ring members from group 0 using `removenode`.
- Add an API to query the current group 0 configuration.
Fixes#11723.
Closes #12502
* github.com:scylladb/scylladb:
test: test_topology: test for removing garbage group 0 members
test/pylib: move some utility functions to util.py
db: system_keyspace: add a virtual table with raft configuration
db: system_keyspace: improve system.raft_snapshot_config schema
service: storage_service: better error handling in `decommission`
service: storage_service: fix indentation in removenode
service: storage_service: make `removenode` work for group 0 members which are not token ring members
service/raft: raft_group0: perform read_barrier in wait_for_raft
service: storage_service: make leaving node a non-voter before removing it from group 0 in decommission/removenode
test: test_raft_upgrade: remove test_raft_upgrade_with_node_remove
service/raft: raft_group0: link to Raft docs where appropriate
service/raft: raft_group0: more logging
service/raft: raft_group0: separate function for checking and waiting for Raft
This test is frequently failing due to a timeout when we try to restart
one of the nodes. The shutdown procedure apparently hangs when we try to
stop the `hints_manager` service, e.g.:
```
INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Asked to stop
INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Stopped
INFO 2023-01-13 03:18:02,946 [shard 0] hints_manager - Asked to stop
INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Asked to stop
INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Stopped
INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Asked to stop
INFO 2023-01-13 03:18:02,946 [shard 1] hints_manager - Stopped
INFO 2023-01-13 03:22:56,997 [shard 0] hints_manager - Stopped
```
Observe the 5-minute delay at the end.
There is a known issue about `hints_manager` stop hanging: #8079.
Now, for some reason, this is the only test case that is hitting this
issue. We don't completely understand why. There is one significant
difference between this test case and others: this is the only test case
which kills 2 (out of 3) servers in the cluster and then tries to
gracefully shutdown the last server. There's a hypothesis that the last
server gets stuck trying to send hints to the killed servers. We weren't
able to prove/falsify it yet. But if it's true, then this patch will:
- unblock next promotions,
- give us some important information when we see that the issue stops
appearing.
In this patch we shut down all servers gracefully instead of killing them,
like we do in the other test cases.
Closes #12548
The docs mention that method, but it doesn't exist. Instead, the
state_machine interface defines a plain .apply().
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #12541
Add a new virtual table `system.raft_state` that shows the currently
operating Raft configuration for each present group. The schema is the
same as `system.raft_snapshot_config` (the latter shows the config from
the last snapshot). In the future we plan to add more columns to this
table, showing more information (like the current leader and term),
hence the generic name.
Adding the table requires some plumbing of
`sharded<raft_group_registry>&` through function parameters to make it
accessible from `register_virtual_tables`, but it's mostly
straightforward.
Also added some APIs to `raft_group_registry` to list all groups and
find a given group (returning `nullptr` if one isn't found, not throwing
an exception).