scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 00:13:31 +00:00

Author	SHA1	Message	Date
Nadav Har'El	c5f29fe3ea	configure.py: don't use deprecated mktemp() configure.py uses the deprecated Python function tempfile.mktemp(). Because this function is labeled a "security risk" it is also a magnet for automated security scanners... So let's replace it with the recommended tempfile.mkstemp() and avoid future complaints. The actual security implications of this mktemp() call are negligible to non-existent: First, it's just the build process (configure.py), not the build product itself. Second, the worst that an attacker (who needs to run on the build machine!) can do is to cause a compilation test in configure.py to fail because it can't write to its output file. Reported by @srikanthprathi Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220111121924.615173-1-nyh@scylladb.com>	2022-01-11 17:06:14 +02:00
Botond Dénes	97d74de8fc	Merge "flat_mutation_reader: clone evictable_reader & convert some others" from Michael Livshin " The first patch introduces evictable_reader_v2, and the second one further simplifies it. We clone instead of converting because there is at least one downstream (by way of multishard_combining_reader) use that is not itself straightforward to convert at the moment (multishard_mutation_query), and because evictable_reader instances cannot be {up,down}graded (since users also access the underlying buffers). This also means that shard_reader, reader_lifecycle_policy and multishard_combining_reader have to be cloned. " * tag 'clone-evictable-reader-to-v2/v3' of https://github.com/cmm/scylla: convert make_multishard_streaming_reader() to flat_mutation_reader_v2 convert table::make_streaming_reader() to flat_mutation_reader_v2 convert make_flat_multi_range_reader() to flat_mutation_reader_v2 view_update_generator: remove unneeded call to downgrade_to_v1() introduce multishard_combining_reader_v2 introduce shard_reader_v2 introduce the reader_lifecycle_policy_v2 abstract base evictable_reader_v2: further code simplifications introduce evictable_reader_v2 & friends	2022-01-11 17:01:08 +02:00
Botond Dénes	d21803c5d0	Merge "Remove global storage proxy from pagers code" from Pavel Emelyanov " The fix is in keeping shared proxy pointer on query_pager. tests: unit(dev) " * 'br-keep-proxy-on-pager-2' of https://github.com/xemul/scylla: pager: Use local proxy pointer pager: Keep shared pointer to proxy onboard	2022-01-11 17:01:08 +02:00
Nadav Har'El	9d0eaeb90a	test/scylla-gdb: enable test for "scylla fiber" After the rewrite of the test/scylla-gdb, the test for "scylla fiber" was disabled - and this patch brings it back. For the "scylla fiber" operation to do something interesting (and not just print an error message and seem to succeed...) it needs a real task pointer. The old code interrupted Scylla in a breakpoint and used get_local_tasks(), but in the new test framework we attach to Scylla while it's idle, so there are no ready tasks. So in this patch we use the find_vptrs() function to find a continuation from http_server::do_accept_one() - it has an interesting fiber of 5 continuations. After this patch all 33 tests in test/scylla-gdb/test_misc.py pass. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220110211813.581807-1-nyh@scylladb.com>	2022-01-11 17:01:08 +02:00
Avi Kivity	861cc1d304	Update seastar submodule * seastar 28fe4214e5...ae8d1c28a2 (3): > cross-tree: convert deprecated later() to yield() > future: deprecate later(), and add two alternatives > reactor: improve lowres_clock, lowres_system_clock granularity	2022-01-11 17:01:08 +02:00
Nadav Har'El	7f5ca5bf3f	Merge 'replica: move distributed_loader to replica module' from Avi Kivity distributed_loader is replica-side thing, so it belongs in the replica module ("distributed" refers to its ability to load sstables in their correct shards). So move it to the replica module. The change exposes a dependency on the construction order of static variables (which isn't defined), so we remove the dependency in the first two patches. Closes #9891 * github.com:scylladb/scylla: replica: move distributed_loader into replica module tracing: make sure keyspace and table names are available to static constructors auth: make sure keyspace and table names are available to static constructors	2022-01-11 17:01:08 +02:00
Pavel Emelyanov	4dd1c15b7b	Merge v3 of "Deglobalize repair tracker" from Benny This series gets rid of the global repair_tracker and thread-local node_ops_metrics instances. It does so by first making the repair_tracker sharded, with an instance per repair_service shard. Then, exposing the repair_service::repair_tracker and keeping a reference to the repair_service in repair_info. Then the node_ops_metrics instances are moved from thread-local global variables to class repair_service. The motivation for this series is twofold: 1. There is a global effort to get rid of global services and instantiate all services on the stack of main() or cql_test_env. 2. As part of https://github.com/scylladb/scylla/issues/9809, we would like to eventually use a generic job tracer for both repair and compaction, so this would be one of the preliminary steps to get there. Refs #9809 Test: unit(release) (including scylla-gdb) Dtest: repair_additional_test.py::TestRepairAdditional::{test_repair_disjoint_row_2nodes,test_repair_joint_row_3nodes_2_diff_shard_count} replace_address_test.py::TestReplaceAddress::test_serve_writes_during_bootstrap[rbo_enabled] (Still seeing https://github.com/scylladb/scylla/issues/9785 but nothing worse) * github.com:bhalevy/scylla.git deglobalize-repair-tracker-v4 repair: repair_tracker: get rid of _the_tracker repair: repair_service: move free abort_repair_node_ops function to repair_service repair_service: deglobalize node_ops_metrics repair: node_ops_metrics: fixup indentation repair: node_ops_metrics: declare in header file repair: repair_info: add check_in_shutdown method repair: use repair_info to get to the repair tracker repair: move tracker-dependent free functions to repair_service repair: tracker: mark get function const repair_service: add repair_tracker getter repair: make repair_tracker sharded repair: repair_tracker: get rid of unused abort_all_abort_source repair: repair_tracker: get rid of unused shutdown abort source	2022-01-11 17:01:08 +02:00
Nadav Har'El	261c4b80b5	Update tools/java submodule * tools/java 6249bfbe2f...b1e09c8b8f (1): > dist/debian:set either python (>=2.7) or python2	2022-01-11 17:01:08 +02:00
Michael Livshin	1f27e12dc6	convert make_multishard_streaming_reader() to flat_mutation_reader_v2 All changes are mechanical. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	be5118a7c9	convert table::make_streaming_reader() to flat_mutation_reader_v2 All changes are mechanical. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	221cd264db	convert make_flat_multi_range_reader() to flat_mutation_reader_v2 Mechanical changes and a resulting downgrade in one caller (which is itself converted later). Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	91d38ef2a9	view_update_generator: remove unneeded call to downgrade_to_v1() Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	7f0e228cbb	introduce multishard_combining_reader_v2 All changes are mechanical. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	4bc0deb7e9	introduce shard_reader_v2 Needed for multishard_combining_reader_v2 (see next commit), all changes are mechanical. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	6499361b6a	introduce the reader_lifecycle_policy_v2 abstract base Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	b053716e74	evictable_reader_v2: further code simplifications Almost all mechanical: not passing a `reader` parameter around when we know it's the `_reader` member, folding a short one-use method into its caller. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Michael Livshin	402dbd2ca7	introduce evictable_reader_v2 & friends Cloning instead of converting because there is at least one downstream (via multishard_combining_reader) use that is not straightforward to convert (multishard_mutation_query). The clone is mostly mechanical and much simpler than the original, because it does not have to deal with range tombstones when deciding if it is safe to pause the wrapped reader, and also does not have to trim any range tombstones. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-01-11 10:49:26 +02:00
Raphael S. Carvalho	49eeacff37	compaction_manager: make run_with_compaction_disabled() barrier out non-regular compactions run_with_compaction_disabled() is used to temporarily disable compaction for a table T. Not only regular compaction, but all types. Turns out it's stopping all types but it's only preventing new regular compactions from starting. So major for example can start even with compaction temporarily disabled. This is fixed by not allowing compaction of any type if disabled. This wasn't possible before as scrub incorrectly ran entirely with compaction disabled, so it wouldn't be able to start, but now it only disables compaction while retrieving its candidate list. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220107154942.59800-1-raphaelsc@scylladb.com>	2022-01-10 18:57:16 +02:00
Raphael S. Carvalho	1c23d1099a	Make population more resilient when reshape fails Reshape isn't mandatory for correctness, unlike resharding. So we can allow boot to continue even in face of reshape failure. Without this, boot will fail right away due to unhandled exception. This is intended to make population more resilient as any exception, even "benign" ones, may cause boot to fail. It's better to allow boot to continue from where it left off, as if there's an exception like io error, or OOM, population will be unable to complete anyway. This patch was written based on observation that dangling errors in interposer consumer used by compaction can cause a different exception to be triggered, like broken_promise, when user asked reshape to stop. This can no longer happen now, but better safe than sorry. So regular compaction can now pick on backlog once node is online. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220107130539.14899-1-raphaelsc@scylladb.com>	2022-01-10 18:57:16 +02:00
Avi Kivity	4392c20bd3	replica: move distributed_loader into replica module distributed_loader is replica-side thing, so it belongs in the replica module ("distributed" refers to its ability to load sstables in their correct shards). So move it to the replica module.	2022-01-10 15:25:28 +02:00
Avi Kivity	bfa4abaf6b	tracing: make sure keyspace and table names are available to static constructors Static constructors (specifically for the `system_keyspaces` global variable) need their dependencies to be already constructed when their own construction begins. Because tracing uses seastar::sstring, which is not constexpr, we must change it to std::string_view (which is). Change the type and perform the required adjustments. The definition is moved to the header file for simplicity.	2022-01-10 15:24:57 +02:00
Benny Halevy	50a361c280	repair: repair_tracker: get rid of _the_tracker the global _the_tracker pointer is no longer used, remove it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 12:03:57 +02:00
Benny Halevy	ceb08b9302	repair: repair_service: move free abort_repair_node_ops function to repair_service Do not depend on the_repair_tracker(). With that, the_repair_tracker() is no longer used and should be deleted. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:59:22 +02:00
Benny Halevy	6bd78eb9a6	repair_service: deglobalize node_ops_metrics Embed the node_ops_metrics instance in a sharded repair_service member. Test: curl -silent http://127.0.0.1:9180/metrics | grep node_ops | grep -v "^#" on a freshly started scylla instance. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:57:54 +02:00
Benny Halevy	a9c30f47fe	repair: node_ops_metrics: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:52:58 +02:00
Benny Halevy	91cee22792	repair: node_ops_metrics: declare in header file For de-globalizing its thread-local instance by placing a node_ops_metrics member in repair_service. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:52:54 +02:00
Benny Halevy	95176098d1	repair: repair_info: add check_in_shutdown method Replacing the free check_in_shutdown function. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:49:40 +02:00
Benny Halevy	abeca95093	repair: use repair_info to get to the repair tracker Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:41:10 +02:00
Benny Halevy	4db57267a6	repair: move tracker-dependent free functions to repair_service These functions are called from the api layer. Continue to hide the repair tracker from the caller but use the repair_service already available at the api layer to invoke the respective high-level methods without requiring `the_repair_tracker()`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:40:09 +02:00
Benny Halevy	6f7acc2029	repair: tracker: mark get function const Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:26:29 +02:00
Benny Halevy	861852214c	repair_service: add repair_tracker getter And rename the global repair_tracker getter to `the_repair_tracker` as the first step to get rid of it. repair_service methods now use the repair_service::repair_tracker method. The global getter was renamed to `the_repair_tracker()` temporarily while eliminating it in this series to help distinguish it from repair_service::repair_tracker(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:25:32 +02:00
Benny Halevy	2f9e701570	repair: make repair_tracker sharded Rather than keeping all shards' semaphore and repair_info:s on the tracker's single-shard instance, instantiate it on all shards, tracking the local repair jobs on its local shard. For now, until it's deglobalized, turn _the_tracker into static thread_local pointer. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 11:04:37 +02:00
Benny Halevy	415e67f3c2	repair: repair_tracker: get rid of unused abort_all_abort_source Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 10:57:10 +02:00
Benny Halevy	6650cb543b	repair: repair_tracker: get rid of unused shutdown abort source Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-10 10:54:57 +02:00
Pavel Emelyanov	281ce3cbc6	pager: Use local proxy pointer There are a few places that need the storage proxy and that use a global method to achieve it. Since the previous patch there's a pager-local non-null pointer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-10 07:58:57 +03:00
Pavel Emelyanov	095d93eaf8	pager: Keep shared pointer to proxy onboard Pagers are created by alternator and select statement, both have the proxy reference at hand. Next, the pager's unique_ptr is put on the lambda of its fetch_page() continuation and thus it survives the fetch_page execution and then gets destroyed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-01-10 07:58:57 +03:00
Avi Kivity	05fa3e07f4	Update seastar submodule * seastar 655078dfdb...28fe4214e5 (2): > program_options: avoid including boost/program_options.hpp when possible > smp: split smp_options out of smp.hh	2022-01-09 19:56:39 +02:00
Nadav Har'El	3cc058d193	sstables: add missing include of seastar/core/metrics.hh sstables/sstables.cc uses seastar::metrics but was missing an include of <seastar/core/metrics.hh>. It probably received this include through some other random included Seastar header (e.g., smp.hh). Now that we're reducing the unnecessary inclusions in Seastar (an ongoing effort of Seastar patches), it is no longer included implicitly, and we need to include it explicitly in sstables.cc. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220109162823.511781-1-nyh@scylladb.com>	2022-01-09 18:30:50 +02:00
Nadav Har'El	63bd0807b4	test/scylla-gdb: skip tests on aarch64 As already noted in commit `eac6fb8`, many of the scylla-gdb tests fail on aarch64 for various reasons. The solution used in that commit was to have test/scylla-gdb/run pretend to succeed - without testing anything - when not running on x86_64. This workaround was accidentally lost when scylla-gdb/run was recently rewritten. This patch brings this workaround back, but in a slightly different form - Instead of the run script not doing anything, the tests do get called, but the "gdb" fixture in test/scylla-gdb/conftest.py causes each individual test to be skipped. The benefit of this approach is that it can easily be improved in the future to only skip (or xfail) specific tests which are known to fail on aarch64, instead of all of them - as half of the tests do pass on aarch64. Fixes #9892. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220109152630.506088-1-nyh@scylladb.com>	2022-01-09 17:34:23 +02:00
Avi Kivity	57188de09e	Merge 'Make dc/rack encryption work for some cases where NAT hides endpoint ips' from Eliran Sinvani This is a consolidation of #9714 and #9709 PRs by @elcallio that were reviewed by @asias. The last comment on those was that they should be consolidated in order not to create a security degradation for ec2 setups. For some cases it is impossible to determine dc or rack association for nodes on outgoing connections. One example is when some IPs are hidden behind a NAT layer. In some cases this creates problems where one side of the connection is aware of the rack/dc association while the other isn't. The solution here is a two-stage one: 1. First add a gossip reverse lookup that will help us determine the rack/dc association for a broader (hopefully all) range of setups and NAT situations. 2. When this fails - be more strict about downgrading a node, which tries to ensure that both sides of the connection will at least downgrade the connection instead of just failing to start when it is not possible for one side to determine rack/dc association. Fixes #9653 /cc @elcallio @asias Closes #9822 * github.com:scylladb/scylla: messaging_service: Add reverse mapping of private ip -> public endpoint production_snitch_base: Do reverse lookup of endpoint for info messaging_service: Make dc/rack encryption check for connection more strict	2022-01-09 16:40:49 +02:00
Nadav Har'El	7b5a8d3bcc	init.hh: add missing include of boost/program_options.hpp init.hh relies on boost::program_options but forgot to include the header file <boost/program_options.hpp> for it. Today, this doesn't matter, because Seastar unnecessarily includes <boost/program_options.hpp> from unrelated header files (such as smp.hh) - so it ends up not being missing. But we plan to clean up Seastar from those unnecessary includes, and then including what we need in init.hh will become important. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220109123152.492466-1-nyh@scylladb.com>	2022-01-09 15:56:58 +02:00
Avi Kivity	7f285965d8	auth: make sure keyspace and table names are available to static constructors Static constructors (specifically for the `system_keyspaces` global variable) need their dependencies to be already constructed when their own construction begins. Enforce that for auth keyspace and table names using the constinit keyword.	2022-01-09 12:51:22 +02:00
Avi Kivity	6c53717a39	replica, atomic_cell: move atomic_cell merge code from replica module to atomic_cell.cc compare_atomic_cell_for_merge() was placed in database.cc, before atomic_cell.cc existed. Move it to its correct place. Closes #9889	2022-01-09 11:08:10 +02:00
Botond Dénes	0f60cc84f4	Merge 'replica: create a replica module' from Avi Kivity Move the ::database, ::keyspace, and ::table classes to a new replica namespace and replica/ directory. This designates objects that only have meaning on a replica and should not be used on a coordinator (but note that not all replica-only classes should be in this module, for example compaction and sstables are lower-level objects that deserve their own modules). The module is imperfect - some additional classes like distributed_loader should also be moved, but there is only one way to untie Gordian knots. Closes #9872 * github.com:scylladb/scylla: replica: move ::database, ::keyspace, and ::table to replica namespace database: Move database, keyspace, table classes to replica/ directory	2022-01-07 13:37:40 +02:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Raphael S. Carvalho	07fba4ab5d	compaction_manager: Abort reshape for tables waiting for a chance to run Tables waiting for a chance to run reshape wouldn't trigger stop exception, as the exception was only being triggered for ongoing compactions. Given that stop reshape API must abort all ongoing tasks and all pending ones, let's change run_custom_job() to trigger the exception if it found that the pending task was asked to stop. Tests: dtest: compaction_additional_test.py::TestCompactionAdditional::test_stop_reshape_with_multiple_keyspaces unit: dev Fixes #9836. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211223002157.215571-1-raphaelsc@scylladb.com>	2022-01-06 18:04:16 +02:00
Avi Kivity	ae3a360725	database: Move database, keyspace, table classes to replica/ directory The database, keyspace, and table classes represent the replica-only part of the objects after which they are named. Reading from a table doesn't give you the full data, just the replica's view, and it is not consistent since reconciliation is applied on the coordinator. As a first step in acknowledging this, move the related files to a replica/ subdirectory.	2022-01-06 17:07:30 +02:00
Raphael S. Carvalho	4c28c49bc7	compaction_manager: make return of maybe_stop_on_error less confusing maybe_stop_on_error() is confusing because it returns true if the task can be retried which goes in opposite direction of its semantics. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220106143233.459903-1-raphaelsc@scylladb.com>	2022-01-06 16:39:15 +02:00
Avi Kivity	b850b34bcc	build: reduce inline threshold on aarch64 to 300 We see coroutine miscompiles with 600. Fixes #9881. Closes #9883	2022-01-06 15:13:27 +02:00
Nadav Har'El	6e2d29300c	test/scylla-gdb: a rewrite, using pytest This patch is an almost complete rewrite of the test/scylla-gdb framework for testing Scylla's gdb commands. The goals of this rewrite are described in issue #9864. In short, the goals are: 1. Use pytest to define individual test cases instead of one long Python script. This will make it easier to add more tests, to run only individual tests (e.g., test/scylla-gdb/run somefile.py::sometest), to understand which test failed when it fails - and a lot of other pytest conveniences. 2. Instead of an ad-hoc shell script to run Scylla, gdb, and the test, use the same Python code which is used in other test suites (alternator, cql-pytest, redis, and more). The resulting handling of the temporary resources (processes, directories, IP address) is more robust, and interrupting test/scylla-gdb/run will correctly kill its child processes (both Scylla and gdb). All existing gdb tests (except one - more on this below...) were easily rewritten in the new framework. The biggest change in this patch is who starts what. Before this patch, "run" starts gdb, which in turn starts Scylla, stops it on a breakpoint, and then runs various tests. After this patch, "run" starts Scylla on its own (like it does in test/cql-pytest/run, et al.), and then gdb runs pytest - and in a pytest fixture attaches to the running Scylla process. The biggest benefit of this approach is that "run" is aware of both gdb and Scylla, and can kill both abruptly with SIGKILL to end the test. But there's also a downside to this change: One of the tests (of "scylla fiber") needs access to some task object. Before this patch, Scylla was stopped on a breakpoint, and a task was available at that point. After this patch, we attach gdb to an idle Scylla, and the test cannot find any task to use. So the test_fiber() test fails for now. One way we could perhaps fix it is to add a breakpoint and "continue" Scylla a bit more after attaching to it. However, I couldn't find the right breakpoint - and we may also need to send a request to Scylla to get it to reach that breakpoint. I'm still looking for a better way to have access to some "task" object we can test on. Fixes #9864. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20220102221534.1096659-1-nyh@scylladb.com>	2022-01-06 11:29:55 +02:00
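
Commit c5f29fe3ea in the list above replaces tempfile.mktemp() with tempfile.mkstemp() in configure.py. The difference can be sketched as follows. This is a minimal illustration, not the code from configure.py; the helper name make_test_output_path and the ".o" suffix are assumptions made for the example.

```python
import os
import tempfile


def make_test_output_path(suffix=".o"):
    """Create an empty temporary file and return its path (hypothetical helper).

    Unlike tempfile.mktemp(), which only returns a name that someone else
    could claim before we open it, tempfile.mkstemp() creates the file
    itself and hands back an already-open file descriptor.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    # Only the path is needed here (e.g. as a compiler output file),
    # so the descriptor is closed right away.
    os.close(fd)
    return path


if __name__ == "__main__":
    path = make_test_output_path()
    try:
        print("temporary output file:", path)
    finally:
        # mkstemp() never deletes the file; the caller has to.
        os.remove(path)
```

The caller owns the cleanup, which is the main behavioural difference to keep in mind when swapping one call for the other.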

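
Commit 49eeacff37 in the list above describes run_with_compaction_disabled() stopping all compaction types but, before the fix, only preventing new regular compactions from starting. The intended behaviour can be modelled roughly as below. This is a toy sketch and not Scylla's C++ implementation: the class TableCompactionGate, the exception type, and the choice to raise an error (rather than, say, postponing the job) are all assumptions made for illustration.

```python
from contextlib import contextmanager


class CompactionDisabledError(RuntimeError):
    """Raised when a compaction is requested while compaction is disabled."""


class TableCompactionGate:
    """Toy model of per-table compaction state (hypothetical, not Scylla's class)."""

    def __init__(self):
        self._disable_depth = 0  # nesting depth of active "disabled" sections
        self.running = []        # kinds of compaction currently running

    @contextmanager
    def compaction_disabled(self):
        """Stop every running compaction and bar new ones of any type."""
        self._disable_depth += 1
        self.running.clear()
        try:
            yield
        finally:
            self._disable_depth -= 1

    def start(self, kind):
        """Start a compaction of the given kind unless compaction is disabled."""
        if self._disable_depth > 0:
            raise CompactionDisabledError(
                f"{kind} compaction refused: compaction is disabled")
        self.running.append(kind)
```

The point of the model is that start() checks the disabled state without looking at the compaction kind, so major or scrub compactions are barred in the same way as regular ones.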

29662 Commits
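
Commit 63bd0807b4 in the list above makes the "gdb" fixture in test/scylla-gdb/conftest.py skip each test when not running on x86_64. The decision logic might look something like the sketch below. The function name gdb_skip_reason and the SUPPORTED_MACHINES set are hypothetical; the actual fixture code is not shown in the log.

```python
import platform

# Assumption for this sketch: only x86_64 is treated as supported,
# matching the commit message's description of the workaround.
SUPPORTED_MACHINES = {"x86_64"}


def gdb_skip_reason(machine=None):
    """Return a skip message for unsupported machines, or None to run the test."""
    if machine is None:
        machine = platform.machine()
    if machine in SUPPORTED_MACHINES:
        return None
    return f"scylla-gdb tests are skipped on {machine}"


if __name__ == "__main__":
    print(gdb_skip_reason() or "tests would run on this machine")
```

Inside a pytest fixture, a non-None result would be passed to pytest.skip(). Keeping the check in one function also leaves room for the refinement the commit message mentions, where only specific tests known to fail on aarch64 are skipped.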