scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-22 07:42:16 +00:00

Author	SHA1	Message	Date
lauranovich	e78746e94d	docs: fix removal of master from website drop-down Closes #9251	2021-08-26 14:51:37 +03:00
Avi Kivity	9bf3b9f964	Merge 'Some IDL compiler cleanups' from Pavel Solodovnikov This series incorporates various refactorings aimed mostly at eliminating extra parameters to `serializer__impl` functions for `EnumDef` and `ClassDef` AST classes. Instead of carrying these parameters here and there over many places, they are calculated on a preliminary run to collect additional metadata, such as: namespaces and template parameters from parent scopes. This metadata is used later to extend AST classes. The patchset does not introduce any changes in the generation procedures, exclusively dealing with internal code structuring. NOTE: although metadata collection involves an extra run through the parse tree, the proper way should be to populate it instantly while parsing the input. This is left to be adjusted later in a follow-up series. Closes #8148 github.com:scylladb/scylla: idl: add descriptions for the top-level generation routines idl: make ns_qualified name a class method idl: cache template declarations inside enums and classes idl: cache parent template params for enums and classes idl: rename misleading `local_types` to `local_writable_types` idl: remove remaining uses of `namespaces` argument idl: remove `is_final` function and use `.final` AST class property idl: remove `parent_template_param` from `local_types` set idl: cache namespaces in AST nodes idl: remove unused variables	2021-08-26 13:18:54 +03:00
Benny Halevy	4ffdafe6dc	token_metadata: delete old java code We no longer need to keep it for reference. It's just causing confusion at this point. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210826095457.994834-1-bhalevy@scylladb.com>	2021-08-26 13:03:59 +03:00
Pekka Enberg	a53c1949cd	Update tools/jmx submodule * tools/jmx 5311e9b...70b19e6 (1): > scrub: support scrubMode and deprecate skipCorrupted	2021-08-26 12:27:13 +03:00
Pavel Solodovnikov	c0854a0f62	raft: create system tables only when `raft` experimental feature is set Also introduce a tiny function to return raft-enabled db config for cql testing. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210826091432.279532-1-pa.solodovnikov@scylladb.com>	2021-08-26 12:21:12 +03:00
Pekka Enberg	bd8fa47d84	Update tools/java submodule * tools/java 4ef8049e07...0b6ecbeb90 (1): > nodetool scrub: support --mode and deprecate --skip-corrupted	2021-08-26 11:07:14 +03:00
Avi Kivity	acf8da2bce	Merge "flat_mutation_reader: keep timeout in permit" from Benny " This series moves the timeout parameter, that is passed to most f_m_r methods, into the reader_permit. This eliminates the need to pass the timeout around, as it's taken from the permit when needed. The permit timeout is updated in certain cases when the permit/reader is paused and retrieved later on for reuse. Following are perf_simple_query results showing ~1% reduction in insns/op and corresponding increase in tps. $ build/release/test/perf/perf_simple_query -c 1 --operations-per-shard 1000000 --task-quota-ms 10 Before: 102500.38 tps ( 75.1 allocs/op, 12.1 tasks/op, 45620 insns/op) After: 103957.53 tps ( 75.1 allocs/op, 12.1 tasks/op, 45372 insns/op) Test: unit(dev) DTest: repair_additional_test.py:RepairAdditionalTest.repair_abort_test (release) materialized_views_test.py:TestMaterializedViews.remove_node_during_mv_insert_3_nodes_test (release) materialized_views_test.py:InterruptBuildProcess.interrupt_build_process_with_resharding_half_to_max_test (release) migration_test.py:TTLWithMigrate.big_table_with_ttls_test (release) " * tag 'reader_permit-timeout-v6' of github.com:bhalevy/scylla: flat_mutation_reader: get rid of timeout parameter reader_concurrency_semaphore: use permit timeout for admission reader_concurrency_semaphore: adjust reactivated reader timeout multishard_mutation_query: create_reader: validate saved reader permit repair: row_level: read_mutation_fragment: set reader timeout flat_mutation_reader: maybe_timed_out: use permit timeout test: sstable_datafile_test: add sstable_reader_with_timeout reader_permit: add timeout member	2021-08-25 17:51:10 +03:00
Raphael S. Carvalho	a4053dbb72	repair: Postpone data segregation to off-strategy compaction With data segregation on repair, thousands of sstables are potentially added to maintenance set which causes high latency due to stalls. That's because N*M sstables are created by a repair, where N = # of ranges and M = # of segregations For TWCS, M = # of windows. Assuming N = 768 and M = 20, ~15k sstables end up in sstable set To fix this problem, let's avoid performing data segregation in repair, as offstrategy will already perform the segregation anyway. So from now on, only N non-overlapping sstables will be added to set. Read amplification isn't affected because a query will only touch one sstable in maintenance set. When offstrategy starts, it will pick all sstables from set and compact them in a single step while performing data segregation, so data is properly laid out before integrated into the main set. tests: - sstable_compaction_test.twcs_reshape_with_disjoint_set_test - mode(dev) - manual test using repair-based bootstrap Fixes #9199. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210824185043.76475-1-raphaelsc@scylladb.com>	2021-08-25 15:31:38 +03:00
Pavel Emelyanov	b012040a76	mutation: Keep range tombstone in tree when consuming Current code std::move()-s the range tombstone into consumer thus moving the tombstone's linkage to the containing list as well. As a result the original range tombstone itself leaks as it leaves the tree and cannot be reached on .clear(). Another danger is that the iterator pointing to the tombstone becomes invalid while it's then ++-ed to advance to the next entry. The immediate fix is to keep the tombstone linked to the list while moving. fixes: #9207 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210825100834.3216-1-xemul@scylladb.com>	2021-08-25 13:25:18 +03:00
Botond Dénes	6df77e350a	mutation_fragment{_v2}: MutationFragmentConsumer: allow for abstract consumer Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210825083244.436274-1-bdenes@scylladb.com>	2021-08-25 13:12:41 +03:00
Avi Kivity	993f824cfd	Merge "raft: implement linearisable reads on a follower" from Gleb and Kostja " This series implements section 6.4 of the Raft PhD. It allows linearisable reads on a follower, bypassing the raft log entirely. After this series server::read_barrier can be executed on a follower as well as on a leader, and after it completes the local user's state machine state can be accessed directly. " * 'raft-read-v9' of github.com:scylladb/scylla-dev: raft: test: add read_barrier test to replication_test raft: test: add read_barrier tests to fsm_test raft: make read_barrier work on a follower as well as on a leader raft: add a function to wait for an index to be applied raft: (server) add a helper to wait through uncertainty period raft: make fsm::current_leader() public raft: add hasher for raft::internal::tagged_uint64 serialize: add serialized for std::monostate raft: fix indentation in applier_fiber	2021-08-25 13:11:35 +03:00
Gleb Natapov	3ff6f76cef	raft: test: add read_barrier test to replication_test	2021-08-25 08:57:13 +03:00
Gleb Natapov	ad2c2abcb8	raft: test: add read_barrier tests to fsm_test	2021-08-25 08:57:13 +03:00
Gleb Natapov	03a266d73b	raft: make read_barrier work on a follower as well as on a leader This patch implements a Raft extension that allows performing linearisable reads by accessing the local state machine. The extension is described in section 6.4 of the PhD. To sum it up: to perform a read barrier, a follower needs to ask the leader for the last committed index that it knows about. The leader must make sure that it is still a leader before answering by communicating with a quorum. When the follower gets the index back, it waits for it to be applied and by that completes the read_barrier invocation. The patch adds three new RPCs: read_barrier, read_barrier_reply and execute_read_barrier_on_leader. The last one is the one a follower uses to ask a leader about the safe index it can read. The first two are used by a leader to communicate with a quorum.	2021-08-25 08:57:13 +03:00
Gleb Natapov	73af7edc78	raft: add a function to wait for an index to be applied	2021-08-25 08:19:25 +03:00
Konstantin Osipov	0429196e06	raft: (server) add a helper to wait through uncertainty period Add a helper to be able to wait until a Raft cluster leader is elected. It can be used to avoid sleeps when it's necessary to forward a request to the leader, but the leader is yet unknown.	2021-08-25 08:19:25 +03:00
Gleb Natapov	376785042f	raft: make fsm::current_leader() public Later patch will call it from server class.	2021-08-25 08:19:25 +03:00
Gleb Natapov	273f753815	raft: add hasher for raft::internal::tagged_uint64 Need it to be able to use tagged_uint64 as a key in an unordered map.	2021-08-25 08:19:25 +03:00
Gleb Natapov	4851d64c68	serialize: add serialized for std::monostate	2021-08-25 08:19:25 +03:00
Gleb Natapov	bd0fd579cf	raft: fix indentation in applier_fiber	2021-08-25 08:19:25 +03:00
Nadav Har'El	cf06b7cd40	test/alternator: correct some typos in comments Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210729125317.1610573-1-nyh@scylladb.com>	2021-08-24 19:43:29 +03:00
Avi Kivity	4a42b69ba8	Merge "raft: testing: many nodes test" from Alejo " Factor out replication test, make it work with different clocks, add some features, and add a many nodes test with steady_clock. Also refactor common test helper. Many nodes test passes for release and dev and normal tick of 100ms for up to 1000 servers. For debug mode it's much fewer due to lack of optimizations so it's only tested for smaller numbers. Tests: unit ({dev}), unit ({debug}), unit ({release}) " * 'raft-many-22-v12' of https://github.com/alecco/scylla: (21 commits) raft: candidate timeout proportional to cluster size raft: testing: many nodes test raft: replication test: remove unused tick_all raft: replication test: delays raft: replication test: packet drop rpc helper raft: replication test: connectivity configuration raft: replication test: rpc network map in raft_cluster raft: replication test: use minimum granularity raft: replication test: minor: rename local to int ids raft: replication test: fix restart_tickers when partitioning raft: replication test: partition ranges raft: replication test: isolate one server raft: replication test: move objects out of header raft: replication test: make dummy command const raft: replication test: template clock type raft: replication test: tick delta inside raft_cluster raft: replication test: style - member initializer raft: replication test: move common code out raft: testing: refactor helper raft: log election stages ...	2021-08-24 17:05:05 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	4e3dcfd7d6	reader_concurrency_semaphore: use permit timeout for admission Now that the timeout is stored in the reader permit use it for admission rather than a timeout parameter. Note that evictable_reader::next_partition currently passes db::no_timeout to resume_or_create_reader, which propagated to maybe_wait_readmission, but it seems to be an oversight of the f_m_r api that doesn't pass a timeout to next_partition(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	9b0b13c450	reader_concurrency_semaphore: adjust reactivated reader timeout Update the reader's timeout where needed after unregistering inactive_read. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	605a1e6943	multishard_mutation_query: create_reader: validate saved reader permit Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	eeab5f77d9	repair: row_level: read_mutation_fragment: set reader timeout The timeout needs to be propagated to the reader's permit. Reset it to db::no_timeout in repair_reader::pause(). Warn if set_timeout asks to change the timeout too far into the past (100ms). It is possible that it will be passed a past timeout from the rpc path, where the message timeout is applied (as duration) over the local lowres_clock time and parallel read_data messages that share the query may end up having close, but different timeout values. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:40 +03:00
Benny Halevy	f25aabf1b2	flat_mutation_reader: maybe_timed_out: use permit timeout Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Benny Halevy	46fb7fe68e	test: sstable_datafile_test: add sstable_reader_with_timeout Verify that the sstable reader (for the highest supported version) times out properly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Benny Halevy	fe479aca1d	reader_permit: add timeout member To replace the timeout parameter passed to flat_mutation_reader methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Alejo Sanchez	a5c74a6442	raft: candidate timeout proportional to cluster size To avoid dueling candidates with large clusters, make the timeout proportional to the cluster size. Debug mode is too slow for a test of 1000 nodes so it's disabled, but the test passes for release and dev modes. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-24 13:09:01 +02:00
Alejo Sanchez	7206eae16e	raft: testing: many nodes test Tests with many nodes and realistic timers and ticks. Network delays are kept as a fraction of ticks. (e.g. 20/100) Tests with 600 or more nodes hang in debug mode. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-24 13:09:01 +02:00
Alejo Sanchez	87a03a3485	raft: replication test: remove unused tick_all Tests now wait for normal ticks for election, remove deprecated tick_all helper. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-24 13:09:01 +02:00
Alejo Sanchez	14c214d73e	raft: replication test: delays Allow test supplied delays for rpc communication. Allow supplying network delay, local delay (nodes within the same server), how many nodes are local, and an extra small delay simulating local load. Modify rpc class to support delays. If delays are enabled, it no longer directly calls the other node's server code but it schedules it to be called later. This makes the test more realistic as in the previous version the first candidate was always going to get to all followers first, preventing a dueling candidates scenario. Previously, tickers were all scheduled at the same time, so there was no spread of them across the tick time. Now these tickers are scheduled with a uniform spread across this time (tick delta). Also previously, custom free elections used tick_all(), which traversed _in_configuration sequentially and ticked each. This, combined with rpc outbound directly calling methods in the other server without yielding, caused free elections to be unrealistic with same order determined and first candidate always winning. This patch changes this behavior. The free election uses normal tickers (now uniformly distributed in tick delay time) and its loop waits for tick delay time (yielding) and checks if there's a new leader. Also note the order might not be the same in debug mode if more than one tick is scheduled. As rpc messages are sent delayed, network connectivity needs to be checked again before calling the function on the remote side. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-24 13:05:53 +02:00
Alejo Sanchez	db23823c77	raft: replication test: packet drop rpc helper Add a helper to check if a packet should be dropped. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	497af3167f	raft: replication test: connectivity configuration Pass packet drops within connectivity configuration struct. Default to no packet drops. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	e4d5428e8a	raft: replication test: rpc network map in raft_cluster Move rpc network map to raft cluster, no longer as static in rpc class.	2021-08-23 17:50:16 +02:00
Alejo Sanchez	192ac5be4c	raft: replication test: use minimum granularity seastar lowres_clock minimum granularity is 10ms, not 1ms. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	5cfe6c1ca2	raft: replication test: minor: rename local to int ids For clarity, name 0-based integer ids as int ids not local. This is in contrast with 1-based UUID ids.	2021-08-23 17:50:16 +02:00
Alejo Sanchez	27d90f0165	raft: replication test: fix restart_tickers when partitioning When partitioning, elect_new_leader restarts tickers, so don't re-restart them in this case. When leader is dropped and no new leader is specified, restart tickers before free election. If no change of leader, restart tickers. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	e4262291f2	raft: replication test: partition ranges Allow specifying ranges within partition to handle large number of nodes. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	56a110d42f	raft: replication test: isolate one server Support disconnection of one server with the rest. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	6b3327c753	raft: replication test: move objects out of header Use a separate cc file for definitions and objects. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	cea18e6830	raft: replication test: make dummy command const Make dummy command const in header. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	2db3192ac3	raft: replication test: template clock type Templatize clock type. Use a struct for run_test to work around https://bugs.llvm.org/show_bug.cgi?id=50345 With help from @kbr- Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	cb35588fb1	raft: replication test: tick delta inside raft_cluster Store tick delta inside raft_cluster. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	49cb040037	raft: replication test: style - member initializer Fix raft_cluster constructor member initializer list.	2021-08-23 17:50:16 +02:00
Alejo Sanchez	6e2ab657b3	raft: replication test: move common code out Common replication test code moved to header. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	a6cd35c512	raft: testing: refactor helper Move definitions to helper object file. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00
Alejo Sanchez	466972afb0	raft: log election stages Add logging for election tracing. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2021-08-23 17:50:16 +02:00

27990 Commits