scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 01:50:35 +00:00

Author	SHA1	Message	Date
Avi Kivity	b07a0867b3	cql3: expression: introduce nested_expression class The expression type cannot be a member of a struct that is an element of the expression variant. This is because it would then be required to contain itself. So introduce a nested_expression type to indirectly hold an expression, but keep the value semantics we expect from expressions: it is copyable and a copy has separate identity and storage. In fact binary_operator had to resort to this trick, so it's converted to nested_expression in the next patch.	2021-07-27 20:08:21 +03:00
Avi Kivity	8a518e9c78	Convert column_identifier_raw's use as selectable to expressions Introduce unresolved_identifier as an unprepared counterpart to column_value. column_identifier_raw no longer inherits from selectable::raw, but methods for now to reduce churn.	2021-07-27 20:08:15 +03:00
Avi Kivity	d3e8c05bed	make column_identifier::raw forward declarable Otherwise we run into a #include loop when we try to have an expression with column_identifier::raw: expression.hh -> column_identifier.hh -> selectable.hh -> expression.hh.	2021-07-27 20:00:48 +03:00
Avi Kivity	0e30a78573	cql3: introduce selectable::with_expression::raw Prepare to migrate selectable::raw sub-classes to expressions by creating a bridge between the two types. with_expression::raw is a selectable::raw and implements all its methods (right now, trivially), and its contents is an expression. The methods are implemented using the usual visitor pattern.	2021-07-27 20:00:48 +03:00
Avi Kivity	df4d77e857	table: simplify generate_and_propagate_view_updates exception handling We have both try/catch and handle_exception() to ignore exceptions. Try/catch is enough, so remove handle_exception(). Closes #9011	2021-07-27 14:08:30 +02:00
Avi Kivity	f86e65b4e7	Merge "Fix quadratic behavior in memtable/row_cache with lots of range tombstones" from Tomasz " This series fixes two issues which cause very poor efficiency of reads when there is a lot of range tombstones per live row in a partition. The first issue is in the row_cache reader. Before the patch, all range tombstones up to the next row were copied into a vector, and then put into the buffer until it's full. This would get quadratic if there is much more range tombstones than fit in a buffer. The fix is to avoid the accumulation of all tombstones in the vector and invoke the callback instead, which stops the iteration as soon as the buffer is full. Fixes #2581. The second, similar issue was in the memtable reader. Tests: - unit (dev) - perf_row_cache_update (release) " * tag 'no-quadratic-rt-in-reads-v1' of github.com:tgrabiec/scylla: test: perf_row_cache_update: Uncomment test case for lots of range tombstones row_cache: Consume range tombstones incrementally partition_snapshot_reader: Avoid quadratic behavior with lots of range tombstones tests: mvcc: Relax monotonicity check range_tombstone_stream: Introduce peek_next()	2021-07-27 14:39:13 +03:00
Avi Kivity	05d22d27a8	Merge "Cut repair->storage-service link" from Pavel E " It exists in the node-ops handler which is registered by repair code, but is handled by storage service. Probably, the whole node-ops handler should instead be moved into repair, but this looks like rather huge rework. So instead -- put the node-ops verb registration inside the storage-service. This removes some more calls for global storage service instance and allows slight optimization of node-ops cross-shards calls. tests: unit(dev), start-stop " * 'br-remove-storage-service-from-nodeops' of https://github.com/xemul/scylla: storage_service: Replace globals with locals storage_service: Remove one extra hop of node-ops handler storage_service: Fix indentation after previous patch storage_service: Move cross-shard hop up the stack repair: Drop empty verbs reg/unreg methods repair, storage_service: Move nodeops reg/unreg to storage service repair: Coroutinize row-level start/stop	2021-07-27 13:27:27 +03:00
Takuya ASADA	fdc786b451	install.sh: add supervisor support Bring supervisor support from dist/docker to install.sh, make it installable from relocatable package. This enables to use supervisor with nonroot / offline environment, and also make relocatable package able to run in Docker environment. Related #8849 Closes #8918	2021-07-27 12:51:29 +03:00
Takuya ASADA	42fd73d033	scylla_setup: add RAID5 support This supports optional RAID5 support on scylla_setup. Fixes #9076 Closes #9093	2021-07-27 12:49:29 +03:00
Avi Kivity	2cca461652	Merge 'sstables: merge row consumer interfaces with implementations' from Wojciech Mitros This patch follows #9002, further reducing the complexity of the sstable readers. The split between row consumer interfaces and implementations has been first added in 2015, and there is no reason to create new implementations anymore. By merging those classes, we achieve a sizeable reduction in sstable reader length and complexity. Refs #7952 Tests: unit(dev) Closes #9073 * github.com:scylladb/scylla: sstables: merge row_consumer into mp_row_consumer_k_l sstables: move kl row_consumer sstables: merge consumer_m into mp_row_consumer_m sstables: move mp_row_consumer_m	2021-07-27 12:23:29 +03:00
Benny Halevy	424c53d5b1	mutation_fragment_stream_validator: disambiguate schema member definition gcc 10.3.1 complains that: ``` ./mutation_fragment_stream_validator.hh:39:21: error: declaration of ‘const schema& mutation_fragment_stream_validator::schema() const’ changes meaning of ‘schema’ [-fpermissive] 39 \| const ::schema& schema() const { return _schema; } \| ^~~~~~ ``` Defining the _schema member as `::schema` rather than just `schema` calms the compiler down. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210727073941.1999909-1-bhalevy@scylladb.com>	2021-07-27 11:55:42 +03:00
Nadav Har'El	8030461a2c	cql-pytest: translate Cassandra's misc. type tests This is a translation of Cassandra's CQL unit test source file validation/entities/TypeTest.java into our cql-pytest framework. This is a tiny test file, with only four tests which apparently didn't find their place in other source files. All four tests pass on Cassandra, and all but one pass on Scylla - the test marked xfail discovered one previously-unknown incompatibility with Cassandra: Refs #9082: DROP TYPE IF EXISTS shouldn't fail on non-existent keyspace Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210726140934.1479443-1-nyh@scylladb.com>	2021-07-27 08:28:16 +03:00
Tomasz Grabiec	7578cef0a4	test: perf_row_cache_update: Uncomment test case for lots of range tombstones	2021-07-26 21:38:00 +02:00
Gleb Natapov	56d0f711e8	serialize: allow use non copyable types with std::variant Message-Id: <20210720120935.710549-3-gleb@scylladb.com>	2021-07-26 19:09:19 +03:00
Gleb Natapov	63025a75b2	serialize: allow use non copyable types with std::optional Message-Id: <20210720120935.710549-2-gleb@scylladb.com>	2021-07-26 19:09:19 +03:00
Avi Kivity	8a80e455fb	sstables: keys: convert trichotomic comparisons to std::strong_ordering Prevent accidental conversions to bool from yielding the wrong results. Unprepared users (that converted to bool, or assigned to int) are adjusted. Ref #1449 Test: unit (dev) Closes #9088	2021-07-26 19:09:19 +03:00
Nadav Har'El	d3a715e0ff	Update seastar submodule * seastar 93d053cd...ce3cc268 (4): > doc: update coroutine exception paragraph with make_exception > coroutine: add make_exception helper > coroutine: use std::move for forwarding exception_ptr > doc: tutorial: document direct exception propagation With the new throw-less coroutine exception support, we can modify some of Scylla's new coroutine code to generate exceptions a bit more efficiently, without actually throwing an exception.	2021-07-26 19:09:19 +03:00
Tomasz Grabiec	2d18360157	row_cache: Consume range tombstones incrementally Before the patch, all range tombstones up to the next row were copied into a vector, and then put into the buffer until it's full. This would get quadratic if there are many more range tombstones than fit in a buffer. The fix is to avoid the accumulation of all tombstones in the vector and invoke the callback instead, which stops the iteration as soon as the buffer is full. Fixes #2581.	2021-07-26 17:48:05 +02:00
Tomasz Grabiec	e74c3c885e	partition_snapshot_reader: Avoid quadratic behavior with lots of range tombstones next_range_tombstone() was populating _rt_stream on each invocation from the current iterator ranges in _range_tombstones. If there is a lot of range tombstones, all would be put into _rt_stream. One problem is that this can cause a reactor stall. Fix by more incremental approach where we populate _rt_stream with minimal amount on each invocation of next_range_tombstone(). Another problem is that this can get quadratic. The iterators in _range_tombstones are advanced, but if lsa invalidates them across calls they can revert back to the front since they go back to _last_rt, which is the last consumed range tombstone, and if the buffer fills up, not all tombstones from _rt_stream could be consumed. The new code doesn't have this problem because everything which is produced out of the iterators in _range_tombstones is produced only once. What we put into _rt_stream is consumed first before we try to feed the _rt_stream with more data.	2021-07-26 17:48:05 +02:00
Tomasz Grabiec	0d7b3f9463	tests: mvcc: Relax monotonicity check Consecutive range tombstones can have the same position. They will, in one of the test cases, after the range tombstone merger in partition_snapshot_flat_reader no longer uses range_tombstone_list to merge data from multiple versions, which deoverlaps, but rather merges the streams corresponding to each version, which interleaves range tombstones from different versions.	2021-07-26 17:27:03 +02:00
Tomasz Grabiec	91868cf0cd	range_tombstone_stream: Introduce peek_next()	2021-07-26 13:33:34 +02:00
Pavel Emelyanov	11a2709f10	storage_service: Replace globals with locals The node-ops verb handler is the lambda of storage-service and it can stop using global storage service instance for no extra charge. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:30 +03:00
Pavel Emelyanov	6e56671d9e	storage_service: Remove one extra hop of node-ops handler It's now clear that the verb handler goes to some "random" shard, then immediately switches to shard-0 and then does the handling. Avoid the extra hop and go to shard-0 right at once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:30 +03:00
Pavel Emelyanov	b6315d3af7	storage_service: Fix indentation after previous patch And, while at it, s/ss/this/g and drop the ss variable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:30 +03:00
Pavel Emelyanov	f5fad311cf	storage_service: Move cross-shard hop up the stack The storage_service::node_ops_cmd_handler runs inside a huge invoke_on(0, ...) lambda. Make it be called on shard-0. This is the preparation for next two patches. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:30 +03:00
Pavel Emelyanov	eb55c252c9	repair: Drop empty verbs reg/unreg methods Those in repair.cc's are now noops, so remove them. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:30 +03:00
Pavel Emelyanov	a09586a237	repair, storage_service: Move nodeops reg/unreg to storage service The storage service is the verb sender, so it must be the verb registrator. Another goal of this patch is to allow removal of repair -> storage_service dependency. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:21 +03:00
Pavel Emelyanov	18397a5e0a	repair: Coroutinize row-level start/stop Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-26 14:21:21 +03:00
Piotr Jastrzebski	90a607e844	api: use proper type to reduce partition count Partition count is of a type size_t but we use std::plus<int> to reduce values of partition count in various column families. This patch changes the argument of std::plus to the right type. Using std::plus<int> for size_t compiles but does not work as expected. For example plus<int>(2147483648LL, 1LL) = -2147483647 while the code would probably want 2147483649. Fixes #9090 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #9074	2021-07-26 11:53:06 +03:00
Nadav Har'El	b503ec36c2	cql-pytest: translate Cassandra's tests for tuples This is a translation of Cassandra's CQL unit test source file validation/entities/TupleTypeTest.java into our cql-pytest framework. This test file has a few tests on various features of tuples. Unfortunately, some of the tests could not be easily translated into Python so were left commented out: Some tests try to send invalid input to the server which the Python driver "helpfully" forbids; Two tests used an external testing library "QuickTheories" and are the only two tests in the Cassandra test suite to use this library - so it's not worthwhile to translate it to Python. 11 tests remain, all of them pass on Cassandra, and just one fails on Scylla (so marked xfail for now), reproducing one known issue: Refs #7735: CQL parser missing support for Cassandra 3.10's new "+=" syntax Actually, += is not supposed to be supported on tuple columns anyway, but should print the appropriate error - not the syntax error we get now as the "+=" feature is not supported at all. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210722201900.1442391-1-nyh@scylladb.com>	2021-07-26 08:20:12 +03:00
Benny Halevy	8674746fdd	flat_mutation_reader: detach_buffer: mark as noexcept Since detach_buffer is used before closing and destroying the reader, we want to mark it as noexcept to simplify the caller error handling. Currently, although it does construct a new circular_buffer, none of the constructors used may throw. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210617114240.1294501-2-bhalevy@scylladb.com>	2021-07-25 12:02:27 +03:00
Benny Halevy	0e31cdf367	flat_mutation_reader: detach_buffer: clarify buffer constructor detach_buffer exchanges the current _buffer with a new buffer constructed using the circular_buffer(Alloc) constructor. The compiler implicitly constructs a tracking_allocator(reader_permit) and passes it to the circular_buffer constructor. This patch just makes that explicit so it would be clearer to the reader what's going on here. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210617114240.1294501-1-bhalevy@scylladb.com>	2021-07-25 11:59:37 +03:00
Pavel Solodovnikov	bcbcc18aa1	raft: raft_sys_table_storage: fix broken `load_snapshot` and `load_term_and_vote` Loading snapshot id and term + vote involve selecting static fields from the "system.raft" table, constrained by a given group id. The code incorrectly assumes that, for example, `SELECT snapshot_id FROM raft WHERE group_id=?` in `load_snapshot` always returns only one row. This is not true, since this will return a row for each (pk, ck) combination, which is (group_id, index) for "system.raft" table. The same applies for the `load_term_and_vote`, which selects static `vote_term` and `vote` from "system.raft". This results in a crash at node startup when there is a non-empty raft log containing more than one entry for a given `group_id`. Restrict the selection to always return one row by applying `LIMIT 1` clause. Tests: unit(dev) Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com> Message-Id: <20210723183232.742083-1-pa.solodovnikov@scylladb.com>	2021-07-25 02:01:34 +02:00
Nadav Har'El	ec5e4c338b	cql: fix undefined behavior in timestamp verification Commit `2150c0f7a2` proposed by issue #5619 added a limitation that USING TIMESTAMP cannot be more than 3 days into the future. But the actual code used to check it, timestamp - now > MAX_DIFFERENCE only makes sense for positive timestamps. For negative timestamps, which are allowed in Cassandra, the difference "timestamp - now" might overflow the signed integer and the result is undefined - leading to the undefined-behavior sanitizer to complain as reported in issue #8895. Beyond the sanitizer, in practice, on my test setup, the timestamp -2^63+1 causes such overflow, which causes the above if() to make the nonsensical statement that the timestamp is more than 3 days into the future. This patch assumes that negative timestamps of any magnitude are still allowed (as they are in Cassandra), and fixes the above if() to only check timestamps which are in the future (timestamp > now). We also add a cql-pytest test for negative timestamps, passing on both Cassandra and Scylla (after this patch - it failed before, and also reported sanitizer errors in the debug build). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210621141255.309485-1-nyh@scylladb.com>	2021-07-24 11:01:08 +03:00
Tomasz Grabiec	b044db863f	Merge 'db/virtual_table: Streaming tables for large data + describe_ring example table' from Juliusz Stasiewicz This is the 2nd PR in series with the goal to finish the hackathon project authored by @tgrabiec, @kostja, @amnonh and @mmatczuk (improved virtual tables + function call syntax in CQL). This one introduces a new implementation of the virtual tables, the streaming tables, which are suitable for large amounts of data. This PR was created by @jul-stas and @StarostaGit Closes #8961 * github.com:scylladb/scylla: test/boost: run_mutation_source_tests on streaming virtual table system_keyspace: Introduce describe_ring table as virtual_table storage_service: Pass the reference down to system_keyspace endpoint_details: store `_host` as `gms::inet_address` queue_reader: implement next_partition() virtual_tables: Introduce streaming_virtual_table flat_mutation_reader: Add a new filtering reader factory method	2021-07-23 18:05:51 +02:00
Gleb Natapov	f0047bd749	raft: apply snapshots in applier_fiber We want to serialize snapshot application with command application otherwise a command may be applied after a snapshot that already contains the result of its application (it is not necessarily a problem since the raft by itself does not guarantee apply-once semantics, but better to prevent it when possible). This also moves all interactions with user's state machine into one place. Message-Id: <YPltCmBAGUQnpW7r@scylladb.com>	2021-07-23 18:05:38 +02:00
Avi Kivity	aaf35b5ac2	Merge "Remove storage-service from transport (and a bit more)" from Pavel E " The cql-server -> storage-service dependency comes from the server's event_notifier which (un)subscribes on the lifecycle events that come from the storage service. To break this link the same trick as with migration manager notifications is used -- the notification engine is split out of the storage service and then is pushed directly into both -- the listeners (to (un)subscribe) and the storage service (to notify). tests: unit(dev), dtest(simple_boot_shutdown, dev) manual({ start/stop, with/without started transport, nodetool enable-/disablebinary } in various combinations, dev) " * 'br-remove-storage-service-from-transport' of https://github.com/xemul/scylla: transport.controller: Brushup cql_server declarations code: Remove storage-service header from irrelevant places storage_service: Remove (unlifecycle) subscribe methods transport: Use local notifier to (un)subscribe server transport: Keep lifecycle notifier sharded reference main: Use local lifecycle notifier to (un)subscribe listeners main, tests: Push notifier through storage service storage_service: Move notification core into dedicated class storage_service: Split lifecycle notification code transport, generic_server: Remove no longer used functionality transport: (Un)Subscribe cql_server::event_notifier from controller tests: Remove storage service from manual gossiper test	2021-07-22 19:27:45 +03:00
Pavel Emelyanov	b1bb00a95c	transport.controller: Brushup cql_server declarations The controller code sits in the cql_transport namespace and can omit its mentionings. Also the seastar::distributed<> is replaced with modern seastar::sharded<> while at it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:50:57 +03:00
Pavel Emelyanov	c39f04fa6f	code: Remove storage-service header from irrelevant places Some .cc files over the code include the storage service for no real need. Drop the header and include (in some) what's really needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:50:19 +03:00
Pavel Emelyanov	e711bfbb7e	storage_service: Remove (unlifecycle) subscribe methods All the listeners now use main-local notifier instance directly and these methods become unused. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:49:35 +03:00
Pavel Emelyanov	65b1bb8302	transport: Use local notifier to (un)subscribe server Now the controller has the lifecycle notifier reference and can stop using storage service to manage the subscription. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:48:58 +03:00
Pavel Emelyanov	5f99eeb35e	transport: Keep lifecycle notifier sharded reference It's needed to (un)subscribe server on it (next patch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:48:20 +03:00
Pavel Emelyanov	2a30cb1664	main: Use local lifecycle notifier to (un)subscribe listeners The storage proxy and sl-manager get subscribed on lifecycle events with the help of storage service. Now when the notifier lives in main() they can use it directly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:47:15 +03:00
Pavel Emelyanov	8248bc9e33	main, tests: Push notifier through storage service Now it's time to move the lifecycle notifier from storage service to the main's scope. Next patches will remove the $lifecycle-subscriber -> storage_service dependency. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:45:51 +03:00
Pavel Emelyanov	6b3b01d9a6	storage_service: Move notification core into dedicated class Introduce the endpoint_lifecycle_notifier class that's in charge of keeping track of subscribers and notifying them. The subscribers will thus be able to set and unset their subscription without the need to mess with storage service at all. The storage_service for now keeps the notifier on board, but this is going to change in the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:44:02 +03:00
Pavel Emelyanov	7e8a032013	storage_service: Split lifecycle notification code This prepares the ground for moving the notification engine into own class like it was done for migration_notifier some time ago. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:43:14 +03:00
Pavel Emelyanov	c7b0b25494	transport, generic_server: Remove no longer used functionality After subscription management was moved onto controller level a bunch of code can be dropped: - passing migration notifier beyond controller - event_notifier's _stopped bit - event_notifier .stop() method - event_notifier empty constructor and destructor - generic_server's on_stop virtual method Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:41:32 +03:00
Pavel Emelyanov	1acef41626	transport: (Un)Subscribe cql_server::event_notifier from controller There's a migration notifier that's carried through cql_server _just_ to let event-notifier (un)subscribe on it. Also there's a call for global storage-service in there which will need to be replaced with yet another pass-through argument which is not great. It's easier to establish this subscription outside of cql_server like it's currently done for proxy and sl-manager. In case of cql_server the "outside" is the controller. This patch just moves the subscription management from cql_server to controller, next two patches will make more use of this change. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:37:23 +03:00
Pavel Emelyanov	b57fb0aa9a	tests: Remove storage service from manual gossiper test It's not needed there, gossiper starts and works without it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-07-22 18:36:28 +03:00
Yaron Kaikov	a004b1da30	scylla_util:add AWS arm based instance to supported list Today we have a Scylla AMI image based on x86 architecture only. Following the work we did in https://github.com/scylladb/scylla-machine-image/pull/153 we can build ARM based AMI image. Let's add ARM based instance to supported list Closes #9064	2021-07-22 15:48:29 +03:00
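
The `std::plus<int>` truncation described in commit 90a607e844 above can be reproduced in isolation. What follows is a minimal sketch, not the Scylla code: the helper names `sum_with_int_plus` and `sum_with_size_t_plus` are hypothetical, and it assumes a platform with 32-bit `int`, 64-bit `size_t`, and two's-complement narrowing (guaranteed since C++20, implementation-defined but usual before that).

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper: adds two partition counts using std::plus<int>,
// mirroring the mismatched functor type from the commit message.
// Both size_t arguments are narrowed to int before the addition.
inline long long sum_with_int_plus(std::size_t a, std::size_t b) {
    return std::plus<int>{}(a, b);
}

// Hypothetical helper: the same addition with the functor type
// matching the operand type, so no narrowing takes place.
inline std::size_t sum_with_size_t_plus(std::size_t a, std::size_t b) {
    return std::plus<std::size_t>{}(a, b);
}
```

Under those assumptions, `sum_with_int_plus(2147483648, 1)` gives -2147483647, while `sum_with_size_t_plus(2147483648, 1)` gives 2147483649, matching the numbers quoted in the commit message.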


27585 Commits