scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-08 07:53:20 +00:00

Author	SHA1	Message	Date
Asias He	70147dcb5a	storage_service: Add removenode_add_ranges helper Share the code between restore_replica_count and removenode_with_stream to reduce duplication. Refs #8700	2021-05-25 10:44:31 +08:00
Asias He	a285bd28e2	storage_service: Respect --enable-repair-based-node-ops flag during removenode In commit `829b4c1` (repair: Make removenode safe by default), removenode was changed to use repair based node operations unconditionally. Since repair based node operations is not enabled by default, we should respect the flag to use stream to sync data if the flag is false. Fixes #8700	2021-05-25 10:42:58 +08:00
Avi Kivity	4383674760	cql3: result_set: switch rows to chunked_vector _rows uses a deque, but doesn't need any special functionality. Switch to chunked_vector, which uses one less allocation in the common case (std::deque has an extra allocation for managing its chunks). Closes #8679	2021-05-20 20:14:15 +03:00
Avi Kivity	eac6fb8d79	gdb: bypass unit test on non-x86 The gdb self-tests fail on aarch64 due to a failure to use thread-local variables. I filed [1] so it can get fixed. Meanwhile, disable the test so the build passes. It is sad, but the aarch64 build is not impacted by these failures. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=27886 Closes #8672	2021-05-20 20:14:15 +03:00
Asias He	2ec1f719de	repair: Always use run_replace_ops Currently, the new NODE_OPS_CMD for replace operation is used only when repair based node operation is enabled. However, we can also use the NODE_OPS_CMD to run the replace operation and use streaming instead of repair to sync data. After this patch, we will use streaming inside run_replace_ops if repair based node ops is not enabled, so that we get the benefits that NODE_OPS_CMD brings in commit `323f72e48a` (repair: Switch to use NODE_OPS_CMD for replace operation). Fixes #8013	2021-05-20 20:14:15 +03:00
Avi Kivity	bb51f7d928	Update seastar submodule * seastar 847fccaf5e...28dddd2683 (13): > reactor: disable xfs extent size hints if using the kernel page cache > smp: replace _reactors global with a local > Merge "Add test for IO-scheduler (fails now)" from Pavel E > weak_ptr: lift restriction on copying > core: expose hidden method from parent class > perftune.py: __get_feature_file(): verify that parameters are not None > gate: assert no outstanding requests when destroyed > httpd: add status_types > cmake: use -O2 for CMAKE_CXX_FLAGS_DEV with clang > compat: source_location: use std::source_location only if available > iotune: disambiguate "this" lambda capture in C++20 mode > Merge "Consider disk saturation request lengths" from Pavel E > Merge 'seastar-addr2line: support oneline backtrace in resolve call' from Benny Halevy	2021-05-20 20:14:15 +03:00
Benny Halevy	5724233609	scylla-gdb: scylla_io_queues: support io_group._max_bytes_count _maximum_request_size is renamed to _max_bytes_count in `40a29d5590` This patch adds support for ioq io_group._max_bytes_count if io_group._maximum_request_size isn't found. Test: scylla-gdb(release) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210520151537.710123-1-bhalevy@scylladb.com>	2021-05-20 20:14:15 +03:00
Avi Kivity	30034371e7	Merge "Remove most of global pointers from repair" from Pavel " There is a lot of global stuff in repair -- a bunch of pointers to sharded services, tracker, map of metas (maybe more). This set removes the first group, all those services had become main-local recently. Along the way a call to global storage proxy is dropped. To get there the repair_service is turned into a "classical" sharded<> service, gets all the needed dependencies by references from main and spreads them internally where needed. Tracker and other stuff is left global, but tracker is now the candidate for merging with the now sharded repair_service, since it emulates the sharded concept internally. Overall the change is - make repair_service sharded and put all dependencies on it at start - have sharded<repair_service> in API and storage service - carry the service reference down to repair_info and repair_meta constructions to give them the dependencies - use needed services in _info and _meta methods tests: unit(dev), dtest.repair(dev) " * 'br-repair-service' of https://github.com/xemul/scylla: (29 commits) repair: Drop most of globals from repair repair: Use local references in messaging handler checks repair: Use local references in create_writer() repair: Construct repair_meta with local references repair: Keep more stuff on repair_info repair: Kill bunch of global usages from insert_repair_meta repair: Pass repair service down to meta insertion repair: Keep local migration manager on repair_info repair: Move unused db captures repair: Remove unused ms captures repair: Construct repair_info with service repair: Loop over repair sharded container repair: Make sync_data_using_repair a method repair: Use repair from storage service repair: Keep repair on storage service repair: Make do_repair_start a method repair: Pass repair_service through the API until do_repair_start repair: Fix indentation after previous patch repair: Split sync_data_using_repair repair: Turn repair_range a repair_info method ...	2021-05-20 10:57:48 +03:00
Piotr Sarna	223a59c09c	test: make rjson allocator test working in sanitize mode Following Nadav's advice, instead of ignoring the test in sanitize/debug modes, the allocator simply has a special path of failing sufficiently large allocation requests. With that, a problem with the address sanitizer is bypassed and other debug mode sanitizers can inspect and check if there are no more problems related to wrapping the original rapidjson allocator. Closes #8539	2021-05-20 00:42:47 +03:00
Avi Kivity	c71d007797	consistency_level: deinline assure_sufficient_live_nodes() assure_sufficient_live_nodes() is a huge template calling other huge templates, and requires "network_topology_strategy.hh". It is inlined in consistency_level.hh. This increases compile time and recompiles. Move the template out-of-line and use "extern template" to instantiate it. This is not ideal as new callers would require updates to the instantiated signatures, but I think our goal should be to de-template it completely instead. Meanwhile, this reduces some pain. Ref #1. Closes #8637	2021-05-19 15:03:51 +03:00
Avi Kivity	d8121961fa	Merge 'cql-pytest: add nodetool flush feature and use it in a test' from Nadav Har'El The first patch adds a nodetool-like capability to the cql-pytest framework. It is not meant to be used to test nodetool itself, but rather to give CQL tests the ability to use nodetool operations - currently only one operation - "nodetool flush". We try to use Scylla's REST API, if possible, and only fall back to using an external "nodetool" command when the REST API is not available - i.e., when testing Cassandra. The benefit of using the REST API is that we don't need to run the jmx server to test Scylla. The second patch is an example of using the new nodetool flush feature in a test that needs to flush data to reproduce a bug (which has already been fixed). Closes #8622 * github.com:scylladb/scylla: cql-pytest: reproducer for issue #8138 cql-pytest: add nodetool flush feature	2021-05-19 14:40:18 +03:00
Nadav Har'El	fd8d15a1a6	cql-pytest: reproducer for issue #8138 We add a reproducing test for issue #8138, where if we write to a TWCS table, scanning it would yield no rows - and worse - crash the debug build. This test requires "nodetool flush" to force the read to happen from sstables, hence the nodetool feature was implemented in the previous patch (on Scylla, it uses the REST API - not actually running nodetool or requiring JMX). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-05-19 13:58:14 +03:00
Nadav Har'El	49580a4701	cql-pytest: add nodetool flush feature This patch adds a nodetool-compatible capability to the cql-pytest framework. It is not meant to be used to test nodetool itself, but rather to give CQL tests the ability to use nodetool operations - currently one operation - "nodetool flush". Use it in a test as: import nodetool nodetool.flush(cql, table) I chose a functional API with parameters ("cql") instead of a fixture with an implied connection so that in the future we may allow multiple nodes and this API will allow sending nodetool requests to different nodes. However, multi-node support is not implemented yet, nor used in any of the existing tests. The implementation uses Scylla's REST API if available, or if not, falls back to using an external "nodetool" command (which can be overridden using the NODETOOL environment variable). This way, both cql-pytest/run (Scylla) and cql-pytest/run-cassandra (Cassandra) now correctly support these nodetool operations, and we still don't need to run JMX to test Scylla. The reason we want to support nodetool.flush() is to reproduce bugs that depend on data reaching disk. We already had such a reproducer in test_large_cells_rows.py - it too did something similar - but it was Scylla-only (using only the REST API). Instead of copying such code to multiple places, it is better to have a common nodetool.flush() function, as done in this patch. The test in test_large_cells_rows.py can later be changed to use the new function. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-05-19 13:55:25 +03:00
Avi Kivity	794d272e35	Merge "Refine allocation strategy" from Pavel E " This set does two things: - hides migrate-fn machinery in allocation_strategy header - conceptualizes dynamic objects The former is possible after IMR rework -- nowadays users of LSA don't need to do anything special with "migrators" so they can be turned to be internal allocation-strategy helpers. The latter is to make sure dynamic objects do not forget to overload the size_for_allocation_strategy(). If this happens the whole thing compiles fine and sometimes works, but generates memory corruptions, so it's worth adding more confidence here. tests: unit(dev) " * 'br-lsa-hide-migrators' of https://github.com/xemul/scylla: bptree: Require dynamic object for nodes reconstruct allocation_strategy, code: Conceptualize dynamic objects allocation_strategy: Hide migrators allocation_strategy, code: Simplify alloc() allocation_strategy: Mark size_for_allocation_strategy noexcept	2021-05-19 10:14:51 +03:00
Pavel Emelyanov	0c4ba56594	bptree: Require dynamic object for nodes reconstruct The B+ tree is not intrusive and supports both kinds of objects -- dynamic (in sense of previous patch) and fixed-size. Respectively, the nodes provide .storage_size() method and get the embedded object storage size themselves. Thus, if a dynamic object is used with the tree but it misses the .storage_size() itself this would go unnoticed. Fortunately, dynamic objects use the .reconstruct() method, so the method should be equipped with the DynamicObject concept. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-19 09:23:49 +03:00
Pavel Emelyanov	9216a5bc08	allocation_strategy, code: Conceptualize dynamic objects Usually lsa allocation is performed with the construct() helper that allocates a sizeof(T) slot and constructs it in place. Some rare objects have dynamic size, so they are created by alloc()ating a slot of some specific size and (!) must provide the correct overload of size_for_allocation_strategy that reports back the relevant storage size. This "must provide" is not enforced: if it is missed, a default sizer would be instantiated, but won't work properly. This patch makes all users of alloc() conform to DynamicObject concept which requires the presence of .storage_size() method to tell that size. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-19 09:23:49 +03:00
Pavel Emelyanov	b8a4f32b48	allocation_strategy: Hide migrators After IMR rework the only lsa-migrating functionality is standard one that calls move constructors on lsa slots. Hide the whole thing inside allocation-strategy. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-19 09:23:49 +03:00
Pavel Emelyanov	28f01aadc9	allocation_strategy, code: Simplify alloc() Today's alloc() accepts migrate-fn, size and alignment. None of the callers really need to provide anything special for the migrate-fn and all are happy with the default alignof() for alignment. The simplification is in providing alloc() that only accepts size arg and does the rest itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-19 09:23:49 +03:00
Pavel Emelyanov	fdfcda97d7	allocation_strategy: Mark size_for_allocation_strategy noexcept Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-19 09:23:49 +03:00
Botond Dénes	dbb6851d4d	test/manual/sstable_scan_footprint: don't double close the semaphore The semaphore `stats_collector` references is the one obtained from the database object, which is already stopped by `database::stop()`, making the stop in `~stats_collector()` redundant, and even worse, as it triggers an assert failure. Remove it. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210518140913.276368-1-bdenes@scylladb.com>	2021-05-18 17:55:52 +03:00
Avi Kivity	16ff92745f	Merge 'perf: add alternator frontend to perf_simple_query' from Piotr Sarna The perf_simple_query tool is extended with another protocol aside from CQL - alternator. The alternative (pun intended) benchmark can be executed by using the `--alternator X` parameter, where X specifies one of the alternator's mandatory write isolation options: - "forbid_rmw" - forbids RMW (read-modify-write) requests - "unsafe" - never uses LWT (lightweight transactions), even for RMW - "always_use_lwt" - uses LWT even for non-RMW requests - "only_rmw_uses_lwt" - that one's rather self-explanatory Alternator cooperates with existing `--write` and `--delete` parameters. Aside from being able to check for improvements/regressions in the alternator module, it's also possible to check how different isolation levels influence the number of allocations and overall performance, or to compare alternator against CQL. Example output showing the difference in isolation levels: ```bash $ ./build/release/test/perf/perf_simple_query_g --smp 1 \ --write --alternator only_rmw_uses_lwt --default-log-level error random-seed=1235000092 Started alternator executor 10873.76 tps (202.9 allocs/op, 12.4 tasks/op, 369921 insns/op) 11096.09 tps (202.7 allocs/op, 12.1 tasks/op, 374792 insns/op) 11100.09 tps (203.0 allocs/op, 12.1 tasks/op, 376469 insns/op) 11068.98 tps (203.1 allocs/op, 12.1 tasks/op, 377132 insns/op) 11081.24 tps (203.2 allocs/op, 12.1 tasks/op, 377290 insns/op) median 11081.24 tps (203.2 allocs/op, 12.1 tasks/op, 377290 insns/op) median absolute deviation: 14.85 maximum: 11100.09 minimum: 10873.76 $ ./build/release/test/perf/perf_simple_query_g --smp 1 \ --random-seed 1235000092 --write --alternator always_use_lwt \ --default-log-level error random-seed=1235000092 Started alternator executor 3605.35 tps (877.4 allocs/op, 174.6 tasks/op, 986666 insns/op) 3555.71 tps (890.0 allocs/op, 174.4 tasks/op, 1006945 insns/op) 3530.20 tps (899.7 allocs/op, 174.1 tasks/op, 1021908 
insns/op) 3437.65 tps (908.2 allocs/op, 174.6 tasks/op, 1033992 insns/op) 3409.88 tps (913.2 allocs/op, 174.4 tasks/op, 1041240 insns/op) median 3530.20 tps (899.7 allocs/op, 174.1 tasks/op, 1021908 insns/op) median absolute deviation: 75.15 maximum: 3605.35 minimum: 3409.88 ``` Closes #8656 * github.com:scylladb/scylla: perf: add alternator frontend to perf_simple_query cdc: make metadata.hh self-sufficient test: add minimal alternator_test_env	2021-05-18 16:17:54 +03:00
Piotr Sarna	6c6ccda8a0	perf: add alternator frontend to perf_simple_query The perf_simple_query tool is extended with another protocol aside from CQL - alternator. The alternative (pun intended) benchmark can be executed by using the `--alternator X` parameter, where X specifies one of the alternator's mandatory write isolation options: - "forbid_rmw" - forbids RMW (read-modify-write) requests - "unsafe" - never uses LWT (lightweight transactions), even for RMW - "always_use_lwt" - uses LWT even for non-RMW requests - "only_rmw_uses_lwt" - that one's rather self-explanatory Alternator cooperates with existing --write and --delete parameters. Aside from being able to check for improvements/regressions in the alternator module, it's also possible to check how different isolation levels influence the number of allocations and overall performance, or to compare alternator against CQL. $ ./build/release/test/perf/perf_simple_query_g --smp 1 \ --write --alternator only_rmw_uses_lwt --default-log-level error random-seed=1235000092 Started alternator executor 10873.76 tps (202.9 allocs/op, 12.4 tasks/op, 369921 insns/op) 11096.09 tps (202.7 allocs/op, 12.1 tasks/op, 374792 insns/op) 11100.09 tps (203.0 allocs/op, 12.1 tasks/op, 376469 insns/op) 11068.98 tps (203.1 allocs/op, 12.1 tasks/op, 377132 insns/op) 11081.24 tps (203.2 allocs/op, 12.1 tasks/op, 377290 insns/op) median 11081.24 tps (203.2 allocs/op, 12.1 tasks/op, 377290 insns/op) median absolute deviation: 14.85 maximum: 11100.09 minimum: 10873.76 $ ./build/release/test/perf/perf_simple_query_g --smp 1 \ --random-seed 1235000092 --write --alternator always_use_lwt \ --default-log-level error random-seed=1235000092 Started alternator executor 3605.35 tps (877.4 allocs/op, 174.6 tasks/op, 986666 insns/op) 3555.71 tps (890.0 allocs/op, 174.4 tasks/op, 1006945 insns/op) 3530.20 tps (899.7 allocs/op, 174.1 tasks/op, 1021908 insns/op) 3437.65 tps (908.2 allocs/op, 174.6 tasks/op, 1033992 insns/op) 3409.88 tps (913.2 
allocs/op, 174.4 tasks/op, 1041240 insns/op) median 3530.20 tps (899.7 allocs/op, 174.1 tasks/op, 1021908 insns/op) median absolute deviation: 75.15 maximum: 3605.35 minimum: 3409.88	2021-05-18 15:10:31 +02:00
Piotr Sarna	6e28c01c53	cdc: make metadata.hh self-sufficient The header relies on topology_description class definition, which is part of cdc/generation.hh.	2021-05-18 15:10:31 +02:00
Piotr Sarna	b6d6247a74	test: add minimal alternator_test_env A minimal implementation of alternator test env, a younger cousin of cql_test_env, is implemented. Note that using this environment for unit tests is strongly discouraged in favor of the official test/alternator pytest suite. Still, alternator_test_env has its uses for microbenchmarks.	2021-05-18 15:10:31 +02:00
Takuya ASADA	a3b25e3d29	unified/uninstall.sh: simplify uninstall.sh, delete all files correctly The current uninstall.sh tries to mirror the logic of install.sh, but that makes the script needlessly larger, and it also fails to remove a few files under /opt/scylladb. Let's just do rm -rf /opt/scylladb, and drop the few other files located outside of /opt/scylladb. Closes #8662	2021-05-18 14:55:18 +02:00
Asias He	0858619cba	storage_service: Abort restore_replica_count when node is removed from the cluster Consider the following procedure: - n1, n2, n3 - n3 is down - n1 runs nodetool removenode uuid_of_n3 to remove n3 from the cluster - n1 is down in the middle of removenode operation Node n1 will set n3 to removing gossip status during removenode operation. Whenever existing nodes learn a node is in removing gossip status, they will call restore_replica_count to stream data from other nodes for the ranges n3 loses if n3 was removed from the cluster. If the streaming fails, the streaming will sleep and retry. The current max number of retry attempts is 5. The sleep interval starts at 60 seconds and increases 1.5 times per sleep. This can leave the cluster in a bad state. For example, nodes can run out of disk space if the streaming continues. We need a way to abort such streaming attempts. To abort the removenode operation and forcibly remove the node, users can run `nodetool removenode force` on any existing node to move the node from removing gossip status to removed gossip status. However, the restore_replica_count will not be aborted. In this patch, a status checker is added in restore_replica_count, so that once a node is in removed gossip status, restore_replica_count will be aborted. This patch is for older releases without the new NODE_OPS_CMD infrastructure where such abort will happen automatically in case of error. Fixes #8651 Closes #8655	2021-05-18 14:55:18 +02:00
Botond Dénes	82bff1bcc6	test: cql_test_env: use proper scheduling groups Currently `cql_test_env` runs its `func` in the default (main) group and also leaves all scheduling groups in `dbcfg` default initialized to the same scheduling group. This results in every part of the system, normally isolated from each other, running in the same (default) scheduling group. Not a big problem on its own, as we are talking about tests, but this creates an artificial difference between the test and the real environment, which is ever more pronounced since certain query parameters are selected based on the current scheduling group. To bring cql test env just that little bit closer to the real thing, this patch creates all the scheduling groups main does (well almost) and configures `dbcfg` with them. Creating and destroying the scheduling group on each setup-teardown of cql test env breaks some internal seastar components which don't like seeing the same scheduling group with the same name but different id. So create the scheduling groups once on first access and keep them around until the test executable is running. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514141614.128213-2-bdenes@scylladb.com>	2021-05-18 13:44:54 +03:00
Botond Dénes	300ee974f7	test: use with_cql_test_env_thread where needed Currently `with_cql_test_env()` is equivalent to `with_cql_test_env_thread()`, which resulted in many tests using the former while really needing the latter and getting away with it. This equivalence is incidental and will go away soon, so make sure all cql test env using tests that expect to be run in a thread use the appropriate variant. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514141614.128213-1-bdenes@scylladb.com>	2021-05-18 13:44:52 +03:00
Avi Kivity	6db826475d	Merge "Introduce segregate scrub mode" from Botond " The current scrub compaction has a serious drawback: while it is very effective at removing any corruptions it recognizes, it is very heavy-handed in its way of repairing such corruptions: it simply drops all data that is suspected to be corrupt. While this is the safest way to cleanse data, it might not be the best way from the point of view of a user who doesn't want to lose data, even at the risk of retaining some business-logic level corruption. Mind you, no database-level scrub can ever fully repair data from the business-logic point of view, they can only do so on the database-level. So in certain cases it might be desirable to have a less heavy-handed approach of cleansing the data, that tries as hard as it can to not lose any data. This series introduces a new scrub mode, with the goal of addressing this use-case: when the user doesn't want to lose any data. The new mode is called "segregate" and it works by segregating its input into multiple outputs such that each output contains a valid stream. This approach can fix any out-of-order data, be that on the partition or fragment level. Out-of-order partitions are simply written into a separate output. Out-of-order fragments are handled by injecting a partition-end/partition-start pair right before them, so that they are now in a separate (duplicate) partition, that will just be written into a separate output, just like a regular out-of-order partition. The reason this series is posted as an RFC is that although I consider the code stable and tested, there are some questions related to the UX. 
* First and foremost every scrub that does more than just discard data that is suspected to be corrupt (but even these to a certain degree) has to consider the possibility that it is rehabilitating corruptions, leaving them in the system without a warning, in the sense that the user won't see any more problems due to low-level corruptions and hence might think everything is alright, while data is still corrupt from the business logic point of view. It is very hard to draw a line between what scrub should and shouldn't do, yet there is a demand from users for scrub that can restore data without losing any of it. Note that anybody executing such a scrub is already in a bad shape, even if they can read their data (they often can't) it is already corrupt, scrub is not making anything worse here. * This series converts the previous `skip_corrupted` boolean into an enum, which now selects the scrub mode. This means that `skip_corrupted` cannot be combined with segregate to throw out what the former can't fix. This was chosen for simplicity: a bunch of flags, all interacting with each other, is very hard to see through in my opinion, while a linear mode selector is much simpler. * The new segregate mode goes all-in, by trying to fix even fragment-level disorder. Maybe it should only do it on the partition level, or maybe this should be made configurable, allowing the user to select what happens with data that cannot be fixed. 
Tests: unit(dev), unit(sstable_datafile_test:debug) " * 'sstable-scrub-segregate-by-partition/v1' of https://github.com/denesb/scylla: test: boost/sstable_datafile_test: add tests for segregate mode scrub api: storage_service/keyspace_scrub: expose new segregate mode sstables: compaction/scrub: add segregate mode mutation_fragment_stream_validator: add reset methods mutation_writer: add segregate_by_partition api: /storage_service/keyspace_scrub: add scrub mode param sstables: compaction/scrub: replace skip_corrupted with mode enum sstables: compaction/scrub: prevent infinite loop when last partition end is missing tests: boost/sstable_datafile_test: use the same permit for all fragments in scrub tests	2021-05-18 13:43:01 +03:00
Botond Dénes	5eb4517f56	read_context: move_to_next_partition(): make reader creation atomic Otherwise an interleaving cache update can clear the `_prev_snapshot` before the reader is created, leading to the reader being created via a null mutation source. Tests: unit(dev, release, debug:row_cache_test) Fixes #8671. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210518092317.227433-1-bdenes@scylladb.com>	2021-05-18 13:41:48 +03:00
Piotr Sarna	c8653d1321	cql3: enhance the fix for index paging type check The original fix stripped the reversed type only from the base table column, but it's better to be safe than sorry, so the reverse is also stripped from the view column. Refs #8667 Message-Id: <cb5dedb0b8b6b5eea3a69863ae50a0e906482665.1621330463.git.sarna@scylladb.com>	2021-05-18 12:47:35 +03:00
Takuya ASADA	60c0b37a4c	install.sh: apply correct file security context when copying files Currently, unified installer does not apply correct file security context while copying files, it causes permission error on scylla-server.service. We should apply default file security context while copying files, using '-Z' option on /usr/bin/install. Also, because install -Z requires normalized path to apply correct security context, use 'realpath -m <PATH>' on path variables on the script. Fixes #8589 Closes #8602	2021-05-18 12:09:51 +03:00
Takuya ASADA	6faa8b97ec	install.sh: fix "no such file or directory" on nonroot Since we have added scylla-node-exporter, we need to do 'install -d' for the systemd directory and sysconfig directory before copying files. Fixes #8663 Closes #8664	2021-05-18 12:03:45 +03:00
Avi Kivity	593ad4de1e	Merge 'Fix type checking in index paging' from Piotr Sarna When recreating the paging state from an indexed query, a bunch of panic checks were introduced to make sure that the code is correct. However, one of the checks is too eager - namely, it throws an error if the base column type is not equal to the view column type. It usually works correctly, unless the base column type is a clustering key with DESC clustering order, in which case the type is actually "reversed". From the point of view of the paging state generation it's not important, because both types deserialize in the same way, so the check should be less strict and allow the base type to be reversed. Tests: unit(release), along with the additional test case introduced in this series; the test also passes on Cassandra Fixes #8666 Closes #8667 * github.com:scylladb/scylla: test: add a test case for paging with desc clustering order cql3: relax a type check for index paging	2021-05-18 11:34:59 +03:00
Kamil Braun	03ad111beb	tree-wide: comments on deprecated functions to access global variables Closes #8665	2021-05-18 11:31:10 +03:00
Botond Dénes	ae366868fb	multishard_mutation_query: save_reader(): avoid round-trip for destroying rparts Force its destruction when saving the reader. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514140844.119362-1-bdenes@scylladb.com>	2021-05-18 10:07:13 +03:00
Botond Dénes	c98b0d0de8	test: cql_test_env: add trace logs to execute_cql() In tests executing tons of these, it is useful to be able to enable a trace logging of each one, to see which is the last successful one. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514140531.118390-1-bdenes@scylladb.com>	2021-05-18 10:06:22 +03:00
Piotr Sarna	c36f432423	test: add a test case for paging with desc clustering order Issue #8666 revealed an issue with validating types for paged indexed queries - namely, the type checking mechanism is too strict in comparing types and fails on mismatched clustering order - e.g. an `int` column type is different from `int` with DESC clustering order. As a result, users see a very confusing message (because reversed types are printed as their underlying type): > Mismatched types for base and view columns c: int and int This test case fails before the fix for #8666 and thus acts as a regression test.	2021-05-17 17:06:50 +02:00
Piotr Sarna	544ef2caf3	cql3: relax a type check for index paging When recreating the paging state from an indexed query, a bunch of panic checks were introduced to make sure that the code is correct. However, one of the checks is too eager - namely, it throws an error if the base column type is not equal to the view column type. It usually works correctly, unless the base column type is a clustering key with DESC clustering order, in which case the type is actually "reversed". From the point of view of the paging state generation it's not important, because both types deserialize in the same way, so the check should be less strict and allow the base type to be reversed. Tests: unit(release), along with the additional test case introduced in this series; the test also passes on Cassandra Fixes #8666	2021-05-17 17:06:50 +02:00
Botond Dénes	dca808dd51	perf/perf_simple_query: add --enable-cache option Allowing for testing performance with/out cache. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210517045402.16153-1-bdenes@scylladb.com>	2021-05-17 14:06:18 +02:00
Raphael S. Carvalho	10ae77966c	compaction_manager: Don't swallow exception in procedure used by reshape and resharding run_custom_job() was swallowing all exceptions, which is definitely wrong because failure in a resharding or reshape would be incorrectly interpreted as success, which means upper layer will continue as if everything is ok. For example, ignoring a failure in resharding could result in a shared sstable being left unresharded, so when that sstable reaches a table, scylla would abort as shared ssts are no longer accepted in the main sstable set. Let's allow the exception to be propagated, so failure will be communicated, and resharding and reshape will be all or nothing, as originally intended. Fixes #8657. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210515015721.384667-1-raphaelsc@scylladb.com>	2021-05-17 13:57:05 +02:00
Avi Kivity	8d6e575f59	perf_fast_forward: report instructions per fragment Use a hardware counter to report instructions per fragment. Results vary from ~4k insns/f when reading sequentially to more than 1M insns/f. Instructions per fragment can be a more stable metric than frags/sec. It would probably be even more stable with a fake file implementation that works in-memory to eliminate seastar polling instruction variation. Closes #8660	2021-05-17 11:33:24 +02:00
Tomasz Grabiec	8dddfab5db	Merge 'db/virtual tables: Add infrastructure + system.status example table' from Piotr Wojtczak This is the 1st PR in series with the goal to finish the hackathon project authored by @tgrabiec, @kostja, @amnonh and @mmatczuk (improved virtual tables + function call syntax in CQL). Virtual tables created within this framework are "materialized" in memtables, so current solution is for small tables only. As an example system.status was added. It was checked that DISTINCT and reverse ORDER BY do work. This PR was created by @jul-stas and @StarostaGit Fixes #8343 This is the same as #8364, but with a compilation fix (newly added `close()` method was not implemented by the reader) Closes #8634 * github.com:scylladb/scylla: boost/tests: Add virtual_table_test for basic infrastructure boost/tests: Test memtable_filling_virtual_table as mutation_source db/system_keyspace: Add system.status virtual table db/virtual_table: Add a way to specify a range of partitions for virtual table queries. db/virtual_table: Introduce memtable_filling_virtual_table db: Add virtual tables interface db: Introduce chained_delegating_reader	2021-05-17 11:29:37 +02:00
Botond Dénes	5e39cedbe3	evictable_reader: remove _reader_created flag This flag is not really needed, because we can just attempt a resume on first use which will fail with the default constructed inactive read handle and the reader will be created via the recreate-after-evicted path. This allows the same path to be used for all reader creation cases, simplifying the logic and more importantly making further patching easier without the special case. To make the recreate path (almost) as cheap for the first reader creation as it was with the special path, `_trim_range_tombstones` and `_validate_partition_key` are only set when really needed. Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514141511.127735-1-bdenes@scylladb.com>	2021-05-16 14:45:46 +03:00
Botond Dénes	3b57106627	evictable_reader: remove destructor We now have close() which is expected to clean up, no need for cleanup in the destructor and consequently a destructor at all. Message-Id: <20210514112349.75867-1-bdenes@scylladb.com>	2021-05-16 12:19:41 +03:00
Benny Halevy	f4cfa530cc	perf: enable instructions_retired_counter only once per executor::run Enabling it for each run_worker call will invoke ioctl PERF_EVENT_IOC_ENABLE in parallel to other workers running and this may skew the results. Test: perf_simple_query Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210514130542.301168-1-bhalevy@scylladb.com>	2021-05-16 12:13:27 +03:00
Pavel Emelyanov	0068988e81	repair: Drop most of globals from repair No code is left that uses these globals, so rip them out altogether. Also drop the former messaging init/uninit methods that are now only setting up those globals. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-14 18:44:02 +03:00
Pavel Emelyanov	315698c683	repair: Use local references in messaging handler checks Some time ago checks for sys-dist-ks and view-update-generator to be locally initialized were moved inside the repair service message handlers. Now everything is ready to use service's reference instead of global pointers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-14 18:44:02 +03:00
Pavel Emelyanov	e748e16352	repair: Use local references in create_writer() The repair_writer::create_writer() method needs sys-dist-ks and view-update-generator. It's only called from repair_meta which already has both. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-14 18:44:02 +03:00
Pavel Emelyanov	394acdc139	repair: Construct repair_meta with local references The repair_meta needs sys-dist-ks and view-update-generator. Now when it's created both are available. Once from the repair-service and another time from the repair_info. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-05-14 18:44:02 +03:00


26562 Commits