scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 07:23:15 +00:00

Author	SHA1	Message	Date
Botond Dénes	581edc4e4e	reader_concurrency_semaphore: make inactive_read_handle a weak reference Having the handle keep an owning reference to the inactive read led to awkward situations, where the inactive read is destroyed during eviction in certain situations only (querier cache) and not in other cases. Although the users didn't notice anything from this, it led to very brittle code inside the reader concurrency semaphore. Among others, the inactive read destructor has to be open coded in evict() which already led to mistakes. This patch goes back to the weak pointer paradigm used a while ago, which is a much more natural fit for this. Inactive reads are still kept in an intrusive list in the semaphore but the handle now keeps a weak pointer to them. When destroyed the handle will destroy the inactive read if it is still alive. When evicting the inactive read, it will set the pointer in the handle to null.	2021-03-18 14:57:57 +02:00
Botond Dénes	cbc83b8b1b	reader_concurrency_semaphore: make evict() noexcept In the next patch it will be called from a destructor.	2021-03-18 14:57:57 +02:00
Botond Dénes	2d348e0211	reader_concurrency_semaphore: update out-of-date comments	2021-03-18 14:57:57 +02:00
Gleb Natapov	32d386d0d8	raft: fix use after free during logging in append_entries_reply() As the existing comment explains a progress can be deleted at the point of logging. The logging should only be done if the progress still exists. Message-Id: <YFDFVRQU1iVYhFdM@scylladb.com>	2021-03-17 09:59:22 +02:00
Dejan Mircevski	8db24fc03b	cql3/expr: Handle `IN ?` bound to null Previously, we crashed when the IN marker is bound to null. Throw invalid_request_exception instead. Fixes #8265 Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8287	2021-03-17 09:59:22 +02:00
Avi Kivity	1afd6fbe06	hashing: appending_hash: convert from enable_if to concepts A little simpler to understand. Closes #8288	2021-03-17 09:59:22 +02:00
Piotr Sarna	7961a28835	Merge 'storage_proxy: Include counter writes in... ... `writes_coordinator_outside_replica_set`' from Juliusz Stasiewicz With this change, coordinator prefers himself as the "counter leader", so if another endpoint is chosen as the leader, we know that coordinator was not a member of replica set. With this guarantee we can increment `scylla_storage_proxy_coordinator_writes_coordinator_outside_replica_set` metric after electing different leader (that metric used to neglect the counter updates). The motivation for this change is to have more reliable way of counting non-token-aware queries. Fixes #4337 Closes #8282 * github.com:scylladb/scylla: storage_proxy: Include counter writes in `writes_coordinator_outside_replica_set` counters: Favor coordinator as leader	2021-03-17 09:59:22 +02:00
Avi Kivity	972ea9900c	Merge 'commitlog: Make pre-allocation drop O_DSYNC while pre-filling' from Calle Wilund Refs #7794 Iff we need to pre-fill segment file in O_DSYNC mode, we should drop this for the pre-fill, to avoid issuing flushes until the file is filled. Done by temporarily closing, re-opening in "normal" mode, filling, then re-opening. Closes #8250 * github.com:scylladb/scylla: commitlog: Make pre-allocation drop O_DSYNC while pre-filling commitlog: coroutinize allocate_segment_ex	2021-03-17 09:59:22 +02:00
Dejan Mircevski	992d5c6184	cql3/expr: Improve column printing Before this change, we would print an expression like this: ((ColumnDefinition{name=c, type=org.apache.cassandra.db.marshal.Int32Type, kind=CLUSTERING_COLUMN, componentIndex=0, droppedAt=-9223372036854775808}) = 0000007b) Now, we print the same expression like this: (c = 0000007b) Tests: unit (dev) Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8285	2021-03-17 09:59:22 +02:00
Tomasz Grabiec	40121621f6	Merge "Kill some get_local_migration_manager() calls" from Pavel Emelyanov There are a bunch of such calls in schema altering statements and there's currently no way to obtain the migration manager for such statements, so a relatively big rework needed. The solution in this set is -- all statements' execute() methods are called with query processor as first argument (now the storage proxy is there), query processor references and provides migration manager for statements. Those statements that need proxy can already get it from the query processor. Afterwards table_helper and thrift code can also stop using the global migration manager instance, since they both have query processor in needed places. While patching them a couple of calls to global storage proxy also go away. The new query processor -> migration manager dependency fits into current start-stop sequence: the migration manager is started early, the query processor is started after it. On stop the query processor remains alive, but the migration manager stops. But since no code currently (should) call get_local_migration_manager() it will _not_ call the query_processor::get_migration_manager() either, so this dangling reference is ugly, but safe. Another option could be to make storage proxy reference migration manager, but this dependency doesn't look correct -- migration manager is higher-level service than the storage proxy is, it is migration manager who currently calls storage proxy, but not the vice versa. 
* xemul/br-kill-some-migration-managers-2: cql3: Get database directly from query processor thrift: Use query_processor::get_migration_manager() table_helper: Use query_processor::get_migration_manager() cql3: Use query_processor::get_migration_manager() (lambda captures cases) cql3: Use query_processor::get_migration_manager() (alter_type statement) cql3: Use query_processor::get_migration_manager() (trivial cases) query_processor: Keep migration manager onboard cql3: Pass query processor to announce_migration:s cql3: Switch to qp (almost) in schema-altering-stmt cql3: Change execute()'s 1st arg to query_processor	2021-03-17 09:59:22 +02:00
Raphael S. Carvalho	2065e2c912	partitioned_sstable_set: adjust select_sstable_runs() to work with compound set compound set will select runs from all of its managed sets, so let's adjust select_sstable_runs() to only return runs which belong to it. without this adjustment, selection of runs would fail because function would try to unconditionally retrieve the run which may live somewhere else. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210312042255.111060-3-raphaelsc@scylladb.com>	2021-03-17 09:59:22 +02:00
Raphael S. Carvalho	02b2df1ea9	sstable_set: move select_sstable_runs() into partitioned_sstable_set after compound set is introduced, select_sstable_runs() will no longer work because the sstable runs live in sstable_set, but they should actually live in the sstable_set being written to. Given that runs is a concept that belongs only to strategies which use partitioned_sstable_set, let's move the implementation of select_sstable_runs() to it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210312042255.111060-2-raphaelsc@scylladb.com>	2021-03-17 09:59:22 +02:00
Avi Kivity	11308c05f4	Update tools/jmx submodule * tools/jmx 15c1d4f...9c687b5 (1): > dist/redhat: add support SLES	2021-03-17 09:59:22 +02:00
Calle Wilund	a0745f9498	messaging_service: Enforce dc/rack membership iff required for non-tls connections When internode_encryption is "rack" or "dc", we should enforce incoming connections are from the appropriate address spaces iff answering on non-tls socket. This is implemented by having two protocol handlers. One for tls/full notls, and one for mixed (needs checking) connections. The latter will ask snitch if remote address is kosher, and refuse the connection otherwise. Note: requires seastar patches: "rpc: Make is possible for rpc server instance to refuse connection" "RPC: (client) retain local address and use on stream creation" Note that ip-level checks are not exhaustive. If a user is also using "require_client_auth" with dc/rack tls setting we should warn him that there is a possibility that someone could spoof himself past the authentication. Closes #8051	2021-03-17 09:59:22 +02:00
Avi Kivity	bcd41cb32d	Merge 'Support installing our rpm to SLES' from Takuya ASADA Basically SLES support is already done in `f20736d93d`, but it was for offline installer. This fixes a few more problems with installing our rpm on SLES. After this change, we can just install our rpm for both CentOS/RHEL and SLES in single image, like unified deb. SLES uses its own package manager called 'zypper', but it does support yum repositories, so no change is required for the repo. Closes #8277 * github.com:scylladb/scylla: scylla_coredump_setup: support SLES scylla_setup: use rpm to check package availability for SLES dist: install optional packages for SLES	2021-03-17 09:59:22 +02:00
Tomasz Grabiec	cc0bb92afe	Merge "raft: provide a ticker for each raft server" from Pavel Solodovnikov Automatically initialize and start a timer in `raft_services::add_server` for each raft server instance created. The patch set also changes several other things in order for tickers to work: 1. A bug in `raft_sys_table_storage` which caused an exception if `raft::server::start` is called without any persisted state. 2. `raft_services::add_server` now automatically calls `raft::server::start()` since a server instance should be started before any of its methods can be called. 3. Raft servers can now start with initial term = 0. There was an artificial restriction which is now lifted. 4. Raft schema state machine now returns a ready future instead of throwing "not implemented" exception in `abort()`. * github.com/ManManson/scylla.git/raft_services_tickers_v9_next_rebase: raft/raft_services: provide a ticker for each raft server raft/raft_services: switch from plain `throw` to `on_internal_error` raft/raft_services: start server instance automatically in `add_server` raft: return ready future instead of throwing in schema_raft_state_machine raft: allow raft server to start with initial term 0 raft/raft_sys_table_storage: fix loading term/vote and snapshot from empty state	2021-03-17 09:59:22 +02:00
Nadav Har'El	e344f74858	Merge 'logalloc: improve background reclaim shares management' from Avi Kivity The log structured allocator's background reclaimer tries to allocate CPU power proportional to memory demand, but a bug made that not happen. Fix the bug, add some logging, and future-proof the timer. Also, harden the test against overcommitted test machines. Fixes #8234. Test: logalloc_test(dev), 20 concurrent runs on 2 cores (1 hyperthread each) Closes #8281 * github.com:scylladb/scylla: test: logalloc_test: harden background reclain test against cpu overcommit logalloc: background reclaim: use default scheduling group for adjusting shares logalloc: background reclaim: log shares adjustment under trace level logalloc: background reclaim: fix shares not updated by periodic timer	2021-03-17 09:59:21 +02:00
Pavel Solodovnikov	aaea8c6c7d	raft/raft_services: provide a ticker for each raft server Automatically initialize a ticker for each raft server instance when `raft_services::add_server` is called. A ticker is a timer which regularly calls `raft::server::tick` in order to tick its raft protocol state machine. Note that the timer should start after the server calls its `start()` method, because otherwise it would crash since fsm is not initialized yet. Currently, the tick interval is hardcoded to be 100ms. Tests: unit(dev, debug) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-03-17 09:59:21 +02:00
Pavel Solodovnikov	1496a3559f	raft/raft_services: switch from plain `throw` to `on_internal_error` Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-03-17 09:59:21 +02:00
Pavel Solodovnikov	975c9a8021	raft/raft_services: start server instance automatically in `add_server` Raft server instance cannot be used in any way prior to calling the `start()` method, which initializes its internal state, e.g. raft protocol state machine. Otherwise, it will likely result in a crash. Also, properly stop the servers on shutdown via `raft_services::stop_servers()`. In case some exception happened inside `add_server`, the `init` function will de-initialize what it already initialized, i.e. raft rpc verbs. This is important since otherwise it would break further initialization process and, what is more important, will prevent raft rpc verbs deinitialization. This will cause a crash in `messaging_service` uninit procedure, because raft rpc handlers would still be initialized. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-03-17 09:59:21 +02:00
Pavel Solodovnikov	0b3dba07bd	raft: return ready future instead of throwing in schema_raft_state_machine The current implementation throws an exception, which will cause a crash when stopping scylla. This will be used in the next patch. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-03-17 09:59:21 +02:00
Pavel Solodovnikov	93c565a1bf	raft: allow raft server to start with initial term 0 Prior to the fix there was an assert to check in `raft::server_impl::start` that the initial term is not 0. This restriction is completely artificial and can be lifted without any problems, which will be described below. The only place that is dependent on this corner case is in `server_impl::io_fiber`. Whenever term or vote has changed, they will be both set in `fsm::get_output`. `io_fiber` checks whether it needs to persist term and vote by validating that the term field is set (by actually executing a `term != 0` condition). This particular check is based on an unobvious fact that the term will never be 0 in case `fsm::get_output` saves term and vote values, indicating that they need to be persisted. Vote and term can change independently of each other, so that checking only for term obscures what is happening and why even more. In either case term will never be 0, because: 1. If the term has changed, then it's naturally greater than 0, since it's a monotonically increasing value. 2. If the vote has changed, it means that we received a vote request message. In such case we have already updated our term to the requester's term. Switch to using an explicit optional in `fsm_output` so that a reader doesn't have to think about the motivation behind this `if` and just checks that the `term_and_vote` optional is engaged. Given the motivation described above, the corresponding assert(_fsm->get_current_term() != term_t(0)); in `server_impl::start` is removed. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-03-17 09:59:21 +02:00
Pavel Solodovnikov	ae5f26adec	raft/raft_sys_table_storage: fix loading term/vote and snapshot from empty state When a raft server is started for the first time and there isn't any persisted state yet, provide default return values for `load_term_and_vote` and `load_snapshot`. The code currently does not handle this corner case correctly and fails with an exception. Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>	2021-03-17 09:59:21 +02:00
Juliusz Stasiewicz	f77d0f5439	storage_proxy: Include counter writes in `writes_coordinator_outside_replica_set` Coordinator prefers himself as the "counter leader", so if another endpoint is chosen as the leader, we know that coordinator was not a member of replica set. We can use this information to increment relevant metric (which used to neglect the counters completely). Fixes #4337	2021-03-16 12:07:16 +01:00
Juliusz Stasiewicz	5689106b92	counters: Favor coordinator as leader This not only reduces internode traffic but is also needed for a later change in this PR: metrics for non-token-aware writes including counter updates.	2021-03-16 12:07:13 +01:00
Pavel Emelyanov	12e4269dce	cql3: Get database directly from query processor After previous patches some places in cql3 code take a long path to get database reference: query processor -> storage proxy -> database The query processor can provide the database reference by itself, so take this chance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:36:04 +03:00
Pavel Emelyanov	fb49550943	thrift: Use query_processor::get_migration_manager() Thrift needs migration manager to call announce_<something> on it and currently it grabs global migration manager instance. Since thrift handler has query processor reference onboard and the query processor can provide the migration manager reference, it's time to remove few more globals from thrift code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:35:59 +03:00
Pavel Emelyanov	6dc9a16b4e	table_helper: Use query_processor::get_migration_manager() After the migration manager can be obtained from the query processor the table helper can also benefit from it and not call for global migration manager instance any longer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:35:53 +03:00
Pavel Emelyanov	a9646dd779	cql3: Use query_processor::get_migration_manager() (lambda captures cases) There are few schema altering statements that need to have the query processor inside lambda continuations. Fortunately, they all are continuations of make_ready_future<>()s, so the query processor can be simply captured by reference and used. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:35:48 +03:00
Pavel Emelyanov	50e4eacd08	cql3: Use query_processor::get_migration_manager() (alter_type statement) This statement needs the query processor one step below the stack from its .announce_migration method. So here's the dedicated patch for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:35:43 +03:00
Pavel Emelyanov	464e58abf7	cql3: Use query_processor::get_migration_manager() (trivial cases) Most of the schema altering statements implementations can now stop calling for global migration manager instance and get it from the query processor. Here are the trivial cases when the query processor is just available at the place where it's needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:35:36 +03:00
Pavel Emelyanov	1de235f4da	query_processor: Keep migration manager onboard The query processor sits upper than the migration manager, in the services layering, it's started after and (will be) stopped before the migration manager. The migration manager is needed in schema altering statements which are called with query processor argument. They will later get the migration manager from the query processor. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:00:58 +03:00
Pavel Emelyanov	1e8f0963f9	cql3: Pass query processor to announce_migration:s Now when the only call to .announce_migration has the query processor at hands -- pass it to the real statements. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:00:33 +03:00
Pavel Emelyanov	470928dd94	cql3: Switch to qp (almost) in schema-altering-stmt The schema altering statements are all inherited from the same base class which declares a pure virtual .announce_migration() method. All the real statements are called with storage proxy argument, while they need the migration manager. So like in the previous patch -- replace storage proxy with query processor. While doing the replacement also get the database instance from the query processor, not from proxy. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:00:33 +03:00
Pavel Emelyanov	26c115f379	cql3: Change execute()'s 1st arg to query_processor Currently the statement's execute() method accepts storage proxy as the first argument. This is enough for all of them but schema altering ones, because the latter need to call migration manager's announce. To provide the migration manager to those who need it it's needed to have some higher-level service than the proxy. The query processor seems to be good candidate for it. Said that -- all the .execute()s now accept the query processor instead of the proxy and get the proxy itself from the query processor. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-15 19:00:33 +03:00
Avi Kivity	65fea203d2	test: logalloc_test: harden background reclain test against cpu overcommit Use thread CPU time instead of real time to avoid an overcommitted machine from not being able to supply enough CPU for the test.	2021-03-15 13:54:49 +02:00
Avi Kivity	290897ddbc	logalloc: background reclaim: use default scheduling group for adjusting shares If the shares are currently low, we might not get enough CPU time to adjust the shares in time. This is currently no-op, since Seastar runs the callback outside scheduling groups (and only uses the scheduling group for inherited continuations); but better be insulated against such details.	2021-03-15 13:54:49 +02:00
Avi Kivity	a87f6498c3	logalloc: background reclaim: log shares adjustment under trace level Useful when debugging, but too noisy at any other time.	2021-03-15 13:54:49 +02:00
Avi Kivity	ce1b1d6ec4	logalloc: background reclaim: fix shares not updated by periodic timer adjust_shares() thinks it needs to do nothing if the main loop is running, but in reality it can only avoid waking the main loop; it still needs to adjust the shares unconditionally. Otherwise, the background reclaim shares can get locked into a low value. Fix by splitting the conditional into two.	2021-03-15 13:54:37 +02:00
Tomasz Grabiec	bf6c4e0b24	Merge "raft: consolidate tests in raft directory" from Alejo Move boost tests to tests/raft and factor out common helpers. * alejo/raft-tests-reorg-5-rebase-next-2: raft: tests: move common helpers to header raft: tests: move boost tests to tests/raft	2021-03-15 11:59:16 +01:00
Takuya ASADA	e8cfd5114f	scylla_coredump_setup: support SLES SLES requires to install systemd-coredump package and enable systemd-coredump.socket to use systemd-coredump.	2021-03-15 19:19:56 +09:00
Takuya ASADA	13871ff1f8	scylla_setup: use rpm to check package availability for SLES Use rpm to check scylla packages installed on SLES.	2021-03-15 19:18:44 +09:00
Takuya ASADA	e3b5ffcf14	dist: install optional packages for SLES Support SUSE original package manager 'zypper' for pkg_install() function.	2021-03-15 19:17:48 +09:00
Alejo Sanchez	88063b6e3e	raft: tests: move common helpers to header Move common test helper functions and data structures to a common helpers.hh header. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-03-15 06:16:58 -04:00
Alejo Sanchez	6139ad6337	raft: tests: move boost tests to tests/raft Move raft boost tests to test/raft directory. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-03-15 06:16:58 -04:00
Calle Wilund	48ca01c3ab	commitlog: Make pre-allocation drop O_DSYNC while pre-filling Refs #7794 Iff we need to pre-fill segment file in O_DSYNC mode, we should drop this for the pre-fill, to avoid issuing flushes until the file is filled. Done by temporarily closing, re-opening in "normal" mode, filling, then re-opening. v2: * More comment v3: * Add missing flush v4: * comment v5: * Split coroutine and fix into separate patches	2021-03-15 09:35:45 +00:00
Calle Wilund	ae3b8e6fdf	commitlog: coroutinize allocate_segment_ex To make further changes here easier to write and read.	2021-03-15 09:35:37 +00:00
Avi Kivity	f326a2253c	Update tools/java submodule * tools/java 2c6110500c...fdc8fcc22c (1): > sstableloader: Use compound "where" restrictions for clustering	2021-03-15 11:19:22 +02:00
Raphael S. Carvalho	7171244844	compaction_manager: Fix performance of cleanup compaction due to unlimited parallelism Prior to `463d0ab`, only one table could be cleaned up at a time on a given shard. Since then, all tables belonging to a given keyspace are cleaned up in parallel. Cleanup serialization on each shard was enforced with a semaphore, which was incorrectly removed by the patch aforementioned. So space requirement for cleanup to succeed can be up to the size of keyspace, increasing the chances of node running out of space. Node could also run out of memory if there are tons of tables in the keyspace. Memory requirement is at least #_of_tables * 128k (not taking into account write behind, etc). With 5k tables, it's ~0.64G per shard. Also all tables being cleaned up in parallel will compete for the same disk and cpu bandwidth, so making them all much slower, and consequently the operation time is significantly higher. This problem was detected with cleanup, but scrub and upgrade go through the same rewrite procedure, so they're affected by exactly the same problem. Fixes #8247. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210312162223.149993-1-raphaelsc@scylladb.com>	2021-03-14 14:31:26 +02:00
Nadav Har'El	d73934372d	storage_service: correct missing exception in logging rebuild failure When failing to rebuild a node, we would print the error with the useless explanation "<no exception>". The problem was a typo in the logging command which used std::current_exception() - which wasn't relevant in that point - instead of "ep". Refs #8089 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210314113118.1690132-1-nyh@scylladb.com>	2021-03-14 14:11:11 +02:00
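
Commit ce1b1d6ec4 above describes its fix as "splitting the conditional into two": the shares must be adjusted unconditionally, and only the wake-up of the main loop may be skipped when that loop is already running. The following is a minimal sketch of that idea, not the actual logalloc code. The struct, its fields, and both function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the background reclaimer. None of these names
// are taken from the ScyllaDB sources.
struct reclaimer_sketch {
    bool main_loop_running = false; // true while the reclaim loop is active
    int shares = 1;                 // current scheduling shares
    int wakeups = 0;                // number of times the main loop was woken

    // Behaviour the commit message describes as buggy: a single conditional
    // guards both actions, so shares stay stale while the loop is running.
    void adjust_shares_before(int wanted) {
        if (!main_loop_running) {
            shares = wanted;
            ++wakeups;
            main_loop_running = true;
        }
    }

    // Behaviour the commit message describes as the fix: always update the
    // shares, and only skip the wake-up when the loop is already running.
    void adjust_shares_after(int wanted) {
        shares = wanted;
        if (!main_loop_running) {
            ++wakeups;
            main_loop_running = true;
        }
    }
};
```

With the loop already running, `adjust_shares_before(500)` leaves `shares` at its old value, which is how the shares "can get locked into a low value", while `adjust_shares_after(500)` updates them without issuing a redundant wake-up.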

25526 Commits