scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 21:47:10 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	59aec1f300	database: Don't break namespace with external alias The namespace replica is broken in the middle with sstable_list alias, while the latter can be declared earlier Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18664	2024-05-14 16:45:20 +03:00
Piotr Dulikowski	9ab57b12bb	Merge 'cql/describe: hide cdc log tables' from Michał Jadwiszczak Currently all tables are printed in statements like `DESC TABLES`, `DESC KEYSPACE ks` or `DESC SCHEMA`. But when we create a table with cdc enabled, additional table with `_scylla_cdc_log` suffix is created. Those tables shouldn't be recreated manually but created automatically when the base table is created. This patch hides tables with `_scylla_cdc_log` suffix in all describe statements. To preserve properties values of those tables, `ALTER TABLE` statement with all properties and their current values for log cdc table is added to description of the base table. Fixes #18459 Closes scylladb/scylladb#18467 * github.com:scylladb/scylladb: test/cql-pytest/test_describe: add test for hiding cdc tables cql3/statements/describe_statement: hide cdc tables schema: add a method to generate ALTER statement with all properties schema: extract schema's properties generation	2024-05-14 15:02:29 +02:00
Botond Dénes	a15a9c3e8d	Merge 'utils: chunked_vector: fill ctor: make exception safe' from Benny Halevy Currently, if the fill ctor throws an exception, the destructor won't be called, as the object is not fully constructed yet. Call the default ctor first (which doesn't throw) to make sure the destructor will be called on exception. Fixes scylladb/scylladb#18635 - [x] Although the fix is for a rare bug, it has very low risk and so it's worth backporting to all live versions Closes scylladb/scylladb#18636 * github.com:scylladb/scylladb: chunked_vector_test: add more exception safety tests chunked_vector_test: exception_safe_class: count also moved objects utils: chunked_vector: fill ctor: make exception safe	2024-05-14 13:35:02 +03:00
Piotr Dulikowski	448f651049	Merge 'hinted handoff: Prevent segmentation fault when initializing endpoint managers ' from Dawid Mędrek We don't attempt to create an endpoint manager for a hint directory if there is no mapping host ID–IP corresponding to the directory's name, an IP address. That prevents a segmentation fault. Fixes scylladb/scylladb#18649 Closes scylladb/scylladb#18650 * github.com:scylladb/scylladb: db/hints: Remove an unused header db/hints: Remove migrating flag before initializing endpoint managers db/hints: Prevent segmentation fault when initializing endpoint managers	2024-05-14 07:34:16 +02:00
Amnon Heiman	0c84692c97	replica/table.cc: Add metrics per-table-per-node This patch adds metrics that will be reported per-table per-node. The added metrics (that are part of the per-table per-shard metrics) are: scylla_column_family_cache_hit_rate scylla_column_family_read_latency scylla_column_family_write_latency scylla_column_family_live_disk_space Fixes #18642 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Closes scylladb/scylladb#18645	2024-05-14 07:54:34 +03:00
Raphael S. Carvalho	0b2ec3063c	sstables: Fix incremental_reader_selector (for range reads) with tablets incremental_reader_selector is the mechanism for incremental consumption of disjoint sstables on range reads. tablet_sstable_set was implemented, such that selector is efficient with tablets. The problem is selector is vnode addicted and will only consider a given set exhausted when maximum token is reached. With tablets, that means a range read on first tablet of a given shard will also consume other tablets living in the same shard. That results in combined reader having to work with empty sstable readers of tablets that don't intersect with the range of the read. It won't cause extra I/O because the underlying sstables don't intersect with the range of the read. It's only unnecessary CPU work, as it involves creating readers (= allocation), feeding them into combined reader, which will in turn invoke the sstable readers only to realize they don't have any data for that range. With 100k tablets (ranges), and 100 tablets per shard, and ~5 sstables per tablet, there will be this amount of readers (empty or not): (100k * ((100^2 + 100) / 2) * avg_sstable_per_tablet=5) = ~2.5 billion. That is ~5000 times more readers, which can be quite significant additional CPU work, even though I/O dominates the most in scans. It's an inefficiency that we would rather get rid of. The behavior can be observed from logs (there's 1 sstable for each of 4 tablets, but note how readers are created for every single one of them when reading only 1 tablet range): ``` table - make_reader_v2 - range=(-inf, {-4611686018427387905, end}] incremental_reader_selector - create_new_readers(null): selecting on pos {minimum token, w=-1} sstable - make_reader - reader on (-inf, {-4611686018427387905, end}] for sst 3gfx_..._34qn42... 
that has range [{-9151620220812943033, start},{-4813568684827439727, end}] incremental_reader_selector - create_new_readers(null): selecting on pos {-4611686018427387904, w=-1} sstable - make_reader - reader on (-inf, {-4611686018427387905, end}] for sst 3gfx_..._368nk2... that has range [{-4599560452460784857, start},{-78043747517466964, end}] incremental_reader_selector - create_new_readers(null): selecting on pos {0, w=-1} sstable - make_reader - reader on (-inf, {-4611686018427387905, end}] for sst 3gfx_..._38lj42... that has range [{851021166589397842, start},{3516631334339266977, end}] incremental_reader_selector - create_new_readers(null): selecting on pos {4611686018427387904, w=-1} sstable - make_reader - reader on (-inf, {-4611686018427387905, end}] for sst 3gfx_..._3dba82... that has range [{5065088566032249228, start},{9215673076482556375, end}] ``` Fix is about making sure the tablet set won't select past the supplied range of the read. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#18556	2024-05-14 07:43:22 +03:00
Pavel Emelyanov	bb1696910c	Merge 'scylla-nodetool: make documentation links product and version dependant' from Botond Dénes Currently, all documentation links that feature anywhere in the help output of scylla-nodetool, are hard-coded to point to the documentation of the latest stable release. As our documentation is version and product (open-source or enterprise) specific, this is not correct. This PR addresses this, by generating documentation links such that they point to the documentation appropriate for the product and version of the scylladb release. Fixes: https://github.com/scylladb/scylladb/issues/18276 - [x] the native nodetool is a new feature, no backport needed Closes scylladb/scylladb#18476 * github.com:scylladb/scylladb: tools/scylla-nodetool: make doc link version-specific release: introduce doc_link() build: pass scylla product to release.cc	2024-05-13 18:03:45 +03:00
Botond Dénes	d82a31f15f	service/storage_proxy: add useful version of base write throttle metrics There are two metrics to help observe base-write throttling: * current_throttled_base_writes * last_mv_flow_control_delay Both show a snapshot of what is happening right at the time of querying these metrics. This doesn't work well when one wants to investigate the role throttling is playing in occasional write timeouts. Prometheus scrapes metrics in multi-second intervals, and the probability of that instant catching the throttling at play is very small (almost zero). Add two new metrics: * throttled_base_writes_total * mv_flow_control_delay_total These accumulate all values, allowing Grafana to derive the values and extract information about throttle events that happened in the past (but not necessarily at the instant of the scrape). Note that dividing the two values will yield the average delay for a throttle, which is also useful. Closes scylladb/scylladb#18435	2024-05-13 18:02:06 +03:00
Dawid Medrek	ef8f14d44b	db/hints: Remove an unused header	2024-05-13 16:40:47 +02:00
Dawid Medrek	c9bbb92b1a	db/hints: Remove migrating flag before initializing endpoint managers Before these changes, if initializing endpoint managers after the migration of hinted handoff to host ID is done throws an exception, we don't remove the flag indicating the migration is still in progress. However, the migration has, in practice, finished -- all of the hint directories have been mapped to host IDs and all of the nodes in the cluster are host-ID-based. Because of that, it makes sense to remove the flag early on.	2024-05-13 16:40:47 +02:00
Dawid Medrek	bdcde0c210	db/hints: Prevent segmentation fault when initializing endpoint managers If hinted handoff is still IP-based and there is a hint directory representing an IP without a corresponding mapping to a host ID in `locator::token_metadata`, an attempt to initialize its endpoint manager will result in a segmentation fault. This commit prevents that.	2024-05-13 16:40:47 +02:00
Benny Halevy	4bbb66f805	chunked_vector_test: add more exception safety tests For insertion, with and without reservation, and for fill and copy constructors. Reproduces https://github.com/scylladb/scylladb/issues/18635 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-05-13 17:18:38 +03:00
Benny Halevy	88b3173d03	chunked_vector_test: exception_safe_class: count also moved objects We have to account for moved objects as well as copied objects so they will be balanced with the respective `del_live_object` calls called by the destructor. However, since chunked_vector requires the value_type to be nothrow_move_constructible, just count the additional live object, but do not modify _countdown or, respectively, throw an exception, as this should be considered only for the default and copy constructors. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-05-13 17:18:38 +03:00
Benny Halevy	64c51cf32c	utils: chunked_vector: fill ctor: make exception safe Currently, if the fill ctor throws an exception, the destructor won't be called, as the object is not fully constructed yet. Call the default ctor first (which doesn't throw) to make sure the destructor will be called on exception. Fixes scylladb/scylladb#18635 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-05-13 17:18:38 +03:00
Michał Jadwiszczak	3e5c34831c	test/cql-pytest/test_describe: add test for hiding cdc tables	2024-05-13 16:14:11 +02:00
Michał Jadwiszczak	f12edbdd95	cql3/statements/describe_statement: hide cdc tables Tables with `_scylla_cdc_log` suffix are internal tables used by cdc. We want to hide those tables in all describe statements, as they shouldn't be created by user but created by Scylla when user creates a table with cdc enabled. Instead, we include `ALTER TABLE <cdc log table> WITH <all table properties>` to the description of cdc base table, so all changes to cdc log table's properties are preserved in backup.	2024-05-13 16:11:13 +02:00
Michał Jadwiszczak	05a51c9286	schema: add a method to generate ALTER statement with all properties In the describe statement, we need to generate `ALTER TABLE` statement with all schema's properties for some tables (cdc log tables). The method prints valid CQL statement with current values of the properties.	2024-05-13 16:11:06 +02:00
Michał Jadwiszczak	b62f7a1dd3	schema: extract schema's properties generation In a later commit, we want to add a method to create `ALTER TABLE ... WITH` statement including all schema's properties with current values.	2024-05-13 14:52:32 +02:00
Asias He	952dfc6157	repair: Introduce repair_partition_count_estimation_ratio config option In commit `642f9a1966` (repair: Improve estimated_partitions to reduce memory usage), a 10% hard coded estimation ratio is used. This patch introduces a new config option to specify the estimation ratio of partitions written by repair out of the total partitions. It is set to 0.1 by default. Fixes #18615 Closes scylladb/scylladb#18634	2024-05-13 15:16:55 +03:00
Botond Dénes	afa870a387	Merge 'Some sstable set related improvements' from Raphael "Raph" Carvalho Closes scylladb/scylladb#18616 * github.com:scylladb/scylladb: replica: Make it explicit table's sstable set is immutable replica: avoid reallocations in tablet_sstable_set replica: Avoid compound set if only one sstable set is filled	2024-05-13 14:17:24 +03:00
Pavel Emelyanov	2ce643d06b	table: Directly compare std::optional<shard_id> with shard_id There's a loop that calculates the number of shard matches over a tablet map. The check of the given shard against optional<shard> can be made shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18592	2024-05-13 13:25:05 +03:00
Andrei Chekun	76a766cab0	Migrate alternator tests to PythonTestSuite As part of the unification process, alternator tests are migrated to the PythonTestSuite instead of using the RunTestSuite. The main idea is to have one suite, so it will be easier to maintain and introduce new features. Introduce the prepare_sql option for suite.yaml to add the possibility to run cql statements as a precondition for the test suite. Related: https://github.com/scylladb/scylladb/issues/18188 Closes scylladb/scylladb#18442	2024-05-13 13:23:29 +03:00
Avi Kivity	51d09e6a2a	cql3: castas_fcts: do not rely on boost casting large multiprecision integers to floats behavior In [1] a bug casting large multiprecision integers to floats is documented (note that it received two fixes, the most recent and relevant is [2]). Even with the fix, boost now returns NaN instead of ±∞ as it did before [3]. Since we cannot rely on boost, detect the conditions that trigger the bug and return the expected result. The unit test is extended to cover large negative numbers. Boost version behavior: - 1.78 - returns ±∞ - 1.79 - terminates - 1.79 + fix - returns NaN Fixes https://github.com/scylladb/scylladb/issues/18508 [1] https://github.com/boostorg/multiprecision/issues/553 [2] `ea786494db` [3] https://github.com/boostorg/math/issues/1132 Closes scylladb/scylladb#18532	2024-05-13 13:18:28 +03:00
Yaniv Michael Kaul	4639ca1bf5	compaction_strategy.cc: typo -> "performanceimproves" -> "performance improves" Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#18629	2024-05-13 08:43:38 +03:00
Patryk Wrobel	ec820e214c	scylla_io_setup: ensure correct RLIMIT_NOFILE for iotune The default limit of open file descriptors per process may be too small for iotune on certain machines with a large number of cores. In such a case iotune reports failure due to an inability to create files or to set up the seastar framework. This change configures the limit of open file descriptors before running iotune to ensure that the failure does not occur. The limit is set via 'resource.setrlimit()' in the parent process. The limit is then inherited by the child process. Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#18546	2024-05-13 08:35:52 +03:00
Avi Kivity	cc8b4e0630	batchlog_manager, test: initialize delay configuration In `b4e66ddf1d` (4.0) we added a new batchlog_manager configuration named delay, but forgot to initialize it in cql_test_env. This somehow worked, but doesn't with clang 18. Fix it by initializing to 0 (there isn't a good reason to delay it). Also provide a default to make it safer. Closes scylladb/scylladb#18572	2024-05-13 07:57:35 +03:00
Israel Fruchter	a1a6bd6798	Update tools/cqlsh submodule to v6.0.18 * tools/cqlsh e5f5eafd...c8158555 (11): > cqlshlib/sslhandling: fix logic of `ssl_check_hostname` > cqlshlib/sslhandling.py: don't use empty userkey/usercert > Dockerfile: noninteractive isn't enough for answering yet on apt-get > fix cqlsh version print > cqlshlib/sslhandling: change `check_hostname` deafult to False > Introduce new ssl configuration for disableing check_hostname > set the hostname in ssl_options.server_hostname when SSL is used > issue-73 Fixed a bug where username and password from the credentials file were ignored. > issue-73 Fixed a bug where username and password from the credentials file were ignored. > issue-73 > github actions: update `cibuildwheel==v2.16.5` Fixes: scylladb/scylladb#18590 Closes scylladb/scylladb#18591	2024-05-13 07:25:10 +03:00
Yaron Kaikov	3eb81915c1	docker: drop jmx and tools-java from installation Following the work done in `dd0779675f`, removing the scylla-jmx and scylla-tools-java from our docker image Closes scylladb/scylladb#18566	2024-05-13 07:24:23 +03:00
Takuya ASADA	9538af0d95	scylla_kernel_check: fix block device size error on latest mkfs.xfs The latest mkfs.xfs does not allow formatting a block device which is smaller than 300MB. There are options to ignore this validation, but that is an unsupported feature, so it is better to increase the loopback image size to "supported size" == 300MB. reference: https://lore.kernel.org/all/164738662491.3191861.15611882856331908607.stgit@magnolia/ Fixes #18568 Closes scylladb/scylladb#18620	2024-05-13 07:23:29 +03:00
Avi Kivity	c8cc47df2d	Merge 'replica: allocate storage groups dynamically' from Aleksandra Martyniuk Allocate storage groups dynamically, i.e.: - on table creation allocate only storage groups that are on this shard; - allocate a storage group for tablet that is moved to this shard; - deallocate storage group for tablet that is moved out of this shard. Output of `./build/release/scylla perf-simple-query -c 1 --random-seed=2248493992` before change: ``` random-seed=2248493992 enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 64933.90 tps ( 63.2 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42163 insns/op, 0 errors) 65865.36 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42155 insns/op, 0 errors) 66649.36 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42176 insns/op, 0 errors) 67029.60 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42176 insns/op, 0 errors) 68361.21 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42166 insns/op, 0 errors) median 66649.36 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42176 insns/op, 0 errors) median absolute deviation: 784.00 maximum: 68361.21 minimum: 64933.90 ``` Output of `./build/release/scylla perf-simple-query -c 1 --random-seed=2248493992` after change: ``` random-seed=2248493992 enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 
63744.12 tps ( 63.2 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42153 insns/op, 0 errors) 66613.16 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42153 insns/op, 0 errors) 69667.39 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42184 insns/op, 0 errors) 67824.78 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42180 insns/op, 0 errors) 67244.21 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42174 insns/op, 0 errors) median 67244.21 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.2 tasks/op, 42174 insns/op, 0 errors) median absolute deviation: 631.05 maximum: 69667.39 minimum: 63744.12 ``` Fixes: #16877. Closes scylladb/scylladb#17664 * github.com:scylladb/scylladb: test: add test for back and forth tablets migration replica: allocate storage groups dynamically replica: refresh snapshot in compaction_group::cleanup replica: add rwlock to storage_group_manager replica: handle reads of non-existing tablets gracefully service: move to cleanup stage if allow_write_both_read_old fails replica: replace table::as_table_state compaction: pass compaction group id to reshape_compaction_group replica: open code get_compaction_group in perform_cleanup_compaction replica: drop single_compaction_group_if_available	2024-05-12 21:22:02 +03:00
Nadav Har'El	9813ec9446	Merge 'test: perf: add end-to-end benchmark for alternator' from Marcin Maliszkiewicz The code is based on similar idea as perf_simple_query. The main differences are: - it starts full scylla process - communicates with alternator via http (localhost) - uses richer table schema with all dynamoDB types instead of only strings Testing code runs in the same process as scylla so we can easily get various perf counters (tps, instr, allocation, etc). Results on my machine (with 1 vCPU): > ./build/release/scylla perf-alternator-workloads --workdir ~/tmp --smp 1 --developer-mode 1 --alternator-port 8000 --alternator-write-isolation forbid --workload read --duration 10 2> /dev/null ... median 23402.59616090321 median absolute deviation: 598.77 maximum: 24014.41 minimum: 19990.34 > ./build/release/scylla perf-alternator-workloads --workdir ~/tmp --smp 1 --developer-mode 1 --alternator-port 8000 --alternator-write-isolation forbid --workload write --duration 10 2> /dev/null ... median 16089.34211320635 median absolute deviation: 552.65 maximum: 16915.95 minimum: 14781.97 The above seem more realistic than results from perf_simple_query which are 96k and 49k tps (per core). Related: https://github.com/scylladb/scylladb/issues/12518 Closes scylladb/scylladb#13121 * github.com:scylladb/scylladb: test: perf: alternator: add option to skip data pre-population perf-alternator-workloads: add operations-per-shard option test: perf: add global secondary indexes write workload for alternator test: perf: add option to continue after failed request test: perf: add read modify write workload for alternator (lwt) test: perf: add scan workload for alternator test: perf: add end-to-end benchmark for alternator test: perf: extract result aggregation logic to a separate struct	2024-05-12 18:15:29 +03:00
Kefu Chai	fd14b6f26b	test/nodetool: do not accept 1 return code when passing --help to nodetool in `906700d5`, we accepted 0 as well as the return code of "nodetool <command> --help", because we needed to be prepared for the newer seastar submodule while being compatible with the older seastar versions. now that in `305f1bd3`, we bumped up the seastar module, and this commit picked up the change to return 0 when handling "--help" command line option in seastar, we are able to drop the workaround. so, in this change, we only use "0" as the expected return code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18627	2024-05-12 14:30:31 +03:00
Avi Kivity	be76527781	Merge 'build: cmake build dist-unified by default and put tarballs under per-config paths' from Kefu Chai in the same spirit of `d57a82c156`, this change adds `dist-unified` as one of the default targets. so that it is built by default. the unified package is required when redistributing the precompiled packages -- we publish the rpm, deb and tar balls to S3. - [x] cmake related change, no need to backport Closes scylladb/scylladb#18621 * github.com:scylladb/scylladb: build: cmake: use paths to be compatible with CI build: cmake build dist-unified by default	2024-05-12 11:16:03 +03:00
Benny Halevy	796ca367d1	gossiper: rename topo_sm member to _topo_sm Follow scylla convention for class member naming. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#18528	2024-05-12 11:02:35 +03:00
Avi Kivity	2ad13e5d76	auth: complete coroutinization of password_authenticator::create_default_if_missing password_authenticator::create_default_if_missing() is a confusing mix of coroutines and continuations, simplify it to a normal coroutine. Closes scylladb/scylladb#18571	2024-05-11 17:04:20 +03:00
Kefu Chai	1186ddef16	build: cmake: use paths to be compatible with CI our CI workflow for publishing the packages expects the tar balls to be located under `build/$buildMode/dist/tar`, where `$buildMode` is "release" or "debug". before this change, the CMake building system puts the tar balls under "build/dist" when the multi-config generator is used. and `configure.py` uses multi-config generator. in this change, we put the tar balls for redistribution under `build/$<CONFIG>/dist/tar`, where `$<CONFIG>` is "RelWithDebInfo" or "Debug", this works better with the CI workflow -- we just need to map "release" and "debug" to "RelWithDebInfo" and "Debug" respectively. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-11 21:56:50 +08:00
Kefu Chai	0f85255c74	build: cmake build dist-unified by default in the same spirit of `d57a82c156`, this change adds `dist-unified` as one of the default targets. so that it is built by default. the unified package is required when redistributing the precompiled packages -- we publish the rpm, deb and tar balls to S3. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-11 18:44:11 +08:00
Raphael S. Carvalho	7faba69f28	replica: Make it explicit table's sstable set is immutable Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-05-10 11:58:08 -03:00
Raphael S. Carvalho	55c0272b68	replica: avoid reallocations in tablet_sstable_set reserve upfront wherever possible to avoid reallocations. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-05-10 10:44:39 -03:00
Raphael S. Carvalho	35a0d47408	replica: Avoid compound set if only one sstable set is filled Most of the time only main set is filled, so we can avoid one layer of indirection (= compound set) when maintenance set is empty. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-05-10 10:44:34 -03:00
Aleksandra Martyniuk	51fdda4199	test: add test for back and forth tablets migration	2024-05-10 15:08:56 +02:00
Aleksandra Martyniuk	b4371a0ea0	replica: allocate storage groups dynamically Currently empty storage_groups are allocated for tablets that are not on this shard. Allocate storage groups dynamically, i.e.: - on table creation allocate only storage groups that are on this shard; - allocate a storage group for tablet that is moved to this shard; - deallocate storage group for tablet that is cleaned up. Stop compaction group before it's deallocated. Add a flag to table::cleanup_tablet deciding whether to deallocate sgs and use it in commitlog tests.	2024-05-10 15:08:21 +02:00
Aleksandra Martyniuk	6e1e082e8c	replica: refresh snapshot in compaction_group::cleanup During compaction_group::cleanup sstables set is updated, but row_cache::_underlying still keeps a shared ptr to the old set. Due to that descriptors to deleted sstables aren't closed. Refresh snapshot in order to store new sstables set in _underlying mutation source.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	c283746b32	replica: add rwlock to storage_group_manager Add rwlock which prevents storage groups from being added/deleted while some other layers iterate over them (or their compaction groups). Add methods to iterate over storage groups with the lock held.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	54fcb7be53	replica: handle reads of non-existing tablets gracefully In the following patches, storage groups (and so also sstables sets) will be allocated only for tablets that are located on this shard. Some layers may try to read non-existing sstable sets. Handle this case as if the sstables set was empty instead of calling on_internal_error.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	561fb1dd09	service: move to cleanup stage if allow_write_both_read_old fails If allow_write_both_read_old tablet transition stage fails, move to cleanup_target stage before reverting migration. It's a preparation for further patches which deallocate storage group of a tablet during cleanup.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	532653f118	replica: replace table::as_table_state Replace table::as_table_state with table::try_get_table_state_with_static_sharding which throws if a table does not use static sharding.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	cf9913b0b7	compaction: pass compaction group id to reshape_compaction_group Pass compaction group id to shard_reshaping_compaction_task_impl::reshape_compaction_group. Modify table::as_table_state to return table_state of the given compaction group.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	90d618d8c9	replica: open code get_compaction_group in perform_cleanup_compaction Open code get_compaction_group in table::perform_cleanup_compaction as its definition won't be relevant once storage groups are allocated dynamically.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	8505389963	replica: drop single_compaction_group_if_available Drop single_compaction_group_if_available as it's unused.	2024-05-10 14:56:38 +02:00
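
The base write throttle metrics commit above (d82a31f15f) notes that dividing the two accumulating metrics yields the average delay per throttle. A minimal sketch of that arithmetic over two consecutive scrapes follows. The `Scrape` class and `average_throttle_delay` function are hypothetical names introduced for illustration only; the field names mirror the metric names from the commit message, and the unit of the delay is whatever the metric reports, which the commit message does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scrape:
    """One sample of the two accumulating counters (hypothetical container)."""
    throttled_base_writes_total: float
    mv_flow_control_delay_total: float


def average_throttle_delay(prev: Scrape, curr: Scrape) -> Optional[float]:
    """Average delay per throttled write between two scrapes.

    Returns None when no throttling happened in the interval, or when the
    counter went backwards (for example after a restart).
    """
    throttled = curr.throttled_base_writes_total - prev.throttled_base_writes_total
    delay = curr.mv_flow_control_delay_total - prev.mv_flow_control_delay_total
    if throttled <= 0:
        return None
    return delay / throttled


if __name__ == "__main__":
    before = Scrape(throttled_base_writes_total=100.0, mv_flow_control_delay_total=2500.0)
    after = Scrape(throttled_base_writes_total=104.0, mv_flow_control_delay_total=2700.0)
    # 4 throttled writes accumulated 200 units of delay in the interval.
    print(average_throttle_delay(before, after))
```

Because both counters only accumulate, the difference between any two scrapes captures throttle events that happened between them, even if none was in progress at either scrape instant.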


42613 Commits