rpc::source cannot be abandoned until EOS is reached, but the current code
does not obey this: if an error code is received, it throws an exception that
aborts the reading loop. Fix this by moving the exception throwing out of the
loop.
Fixes: #4025
Message-Id: <20181227135051.GC29458@scylladb.com>
(cherry picked from commit 37b4043677)
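The pattern of the fix can be sketched in plain C++ (illustrative only — this is not the actual rpc::source API; all names here are hypothetical):

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified sketch: drain the source all the way to EOS even when an
// error code arrives, and throw only after the loop ends.
struct frame { bool eos = false; int error = 0; int payload = 0; };

int read_all(const std::vector<frame>& source) {
    std::optional<int> deferred_error;
    int consumed = 0;
    for (const auto& f : source) {
        if (f.error && !deferred_error) {
            deferred_error = f.error;  // remember, but keep reading to EOS
            continue;
        }
        if (f.eos) {
            break;
        }
        if (!deferred_error) {
            consumed += f.payload;
        }
    }
    // Throwing here, outside the loop, means the source was fully drained.
    if (deferred_error) {
        throw std::runtime_error("rpc error " + std::to_string(*deferred_error));
    }
    return consumed;
}
```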
This patch is for branch 3.0's build_ami.sh.
It checks out the latest master branch of scylla-jmx, which not only
sounds wrong, it also doesn't work: the latest master of scylla-jmx
can only build a "relocatable package" but branch 3.0 doesn't work with
those.
This patch needs to be applied only in branch 3.0.
It should probably be made more general, though... build_ami.sh should
be able to figure out what the *current* branch is, and if it is
branch-3.0 or next-3.0, check out branch-3.0 of the other repositories.
But I'm not sure how to do this correctly.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181217214610.4498-1-nyh@scylladb.com>
When the next pending fragments are after the start of the new range,
we know there is no need to skip.
Caught by perf_fast_forward --datasets large-part-ds3 \
--run-tests=large-partition-slicing
Refs #3984
Message-Id: <1545308006-16389-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 7afe2bad51)
"
Contains several improvements for fast-forwarding and slicing readers. Mainly
for the MC format, but not only:
- Exiting the parser early when going out of the fast-forwarding window [MC-format-only]
- Avoiding reading of the head of the partition when slicing
- Avoiding parsing rows which are going to be skipped [MC-format-only]
"
* 'sstable-mc-optimize-slicing-reads' of github.com:tgrabiec/scylla:
sstables: mc: reader: Skip ignored rows before parsing them
sstables: mc: reader: Call _cells.clear() when row ends rather than when it starts
sstables: mc: mutation_fragment_filter: Take position_in_partition rather than a clustering_row
sstables: mc: reader: Do not call consume_row_marker_and_tombstone() for static rows
sstables: mc: parser: Allow the consumer to skip the whole row
sstables: continuous_data_consumer: Introduce skip()
sstables: continuous_data_consumer: Make position() meaningful inside state_processor::process_state()
sstables: mc: parser: Allocate dynamic_bitset once per read instead of once per row
sstables: reader: Do not read the head of the partition when index can be used
sstables: mc: mutation_fragment_filter: Check the fast-forward window first
sstables: mc: writer: Avoid calling unsigned_vint::serialized_size()
(cherry picked from commit e6d26a528f)
"
The motivation is to keep code related to each format separate, to make it
easier to comprehend and reduce incremental compilation times.
Also reduces dependency on sstable writer code by removing writer bits from
sstables.hh.
The ka/la format writers are still left in sstables.cc; they could also be extracted.
"
* 'extract-sstable-writer-code' of github.com:tgrabiec/scylla:
sstables: Make variadic write() not picked on substitution error
sstables: Extract MC format writer to mc/writer.cc
sstables: Extract maybe_add_summary_entry() out of components_writer
sstables: Publish functions used by writers in writer.hh
sstables: Move common write functions to writer.hh
sstables: Extract sstable_writer_impl to a header
sstables: Do not include writer.hh from sstables.hh
sstables: mc: Extract bound_kind_m related stuff into mc/types.hh
sstables: types: Extract sstable_enabled_features::all()
sstables: Move components_writer to .cc
tests: sstable_datafile_test: Avoid dependency on components_writer
(cherry picked from commit b023e8b45d)
To be used by sstable_writer for stats collection.
Note that this patch is factored out so it can be verified with no
other change in functionality.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 6853c1677d)
Prepare for per-sstable sub directory.
Also, these functions get most of their parameters from the sst at hand so they might
as well be first class members.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit ad5f1e4fbb)
"
This series contains several optimizations of the MC format sstable writer, mainly:
- Avoiding output_stream when serializing into memory (e.g. a row)
- Faster serialization of primitive types when serializing into memory
I measured the improvement in throughput (frag/s) using perf_fast_forward for
datasets with a single large partition with many small rows:
- 10% for a row with a single cell of 8 bytes
- 10% for a row with a single cell of 100 bytes
- 9% for a row with a single cell of 1000 bytes
- 13% for a row with 6 cells of 100 bytes
"
* tag 'avoid-output-stream-in-sstable-writer-v2' of github.com:tgrabiec/scylla:
bytes_ostream: Optimize writing of fixed-size types
sstables: mc: Write temporary data to bytes_ostream rather than file_writer
sstables: mc: Avoid double-serialization of a range tombstone marker
sstables: file_writer: Generalize bytes& writer to accept bytes_view
sstables: Templetize write() functions on the writer
sstables: Turn m_format_write_helpers.cc into an impl header
sstables: De-futurize file_writer
bytes_ostream: Implement clear()
bytes_ostream: Make initial chunk size configurable
(cherry picked from commit e3f53542c9)
Currently, if something throws while streaming in the mutation-sending loop,
the sink is not closed. Also, while close() is running, the code does not hold
onto the sink object. close() is async, so the sink should be kept alive until
it completes. The patch uses do_with() to hold onto the sink while close is
running, and runs close() on the error path too.
Fixes #4004.
Message-Id: <20181220155931.GL3075@scylladb.com>
(cherry picked from commit 393269d34b)
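The lifetime issue can be illustrated with a plain-C++ analogue of what do_with() provides (the real code uses Seastar futures and rpc sinks; all names here are hypothetical):

```cpp
#include <cassert>
#include <functional>
#include <memory>

// An async close() must keep its object alive until the completion
// callback runs. Here close() is "async" in that it returns a callback
// to be invoked later.
struct sink {
    bool closed = false;
    std::function<void()> close() {
        return [this] { closed = true; };
    }
};

// Holding the sink via shared_ptr (as do_with() does by extending the
// owner's lifetime) guarantees it still exists when the callback fires,
// even if the caller drops its own reference first.
std::function<void()> close_keeping_alive(std::shared_ptr<sink> s) {
    auto done = s->close();
    return [s, done] { done(); };  // the capture of `s` keeps the sink alive
}
```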
The newer version of node_exporter comes with important bug fixes. This is
especially important for I3.metal, which is not supported with the older
version of node_exporter.
The dashboards can now support both the new and the old version of
node_exporter.
Fixes #3927
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20181210085251.23312-1-amnon@scylladb.com>
(cherry picked from commit 09c2b8b48a)
"
As the amount of pending view updates increases we know that there’s a
mismatch between the rate at which the base receives writes and the
rate at which the view retires them. We react by applying backpressure
to decrease the rate of incoming base writes, allowing the slow view
replicas to catch up. We want to delay the client’s next writes to a
base replica and we use the base’s backlog of view updates to derive
this delay.
To validate this approach we tested a 3 node Scylla cluster on GCE,
using n1-standard-4 instances with NVMEs. A loader running on a
n1-standard-8 instance ran cassandra-stress with 100 threads. With the
delay function d(x) set to 1s, we see no base write timeouts. With the
delay function as defined in the series, we see that backlogs stabilize
at some (arbitrary) point, as predicted, but this stabilization
co-exists with base write timeouts. However, the system overall behaves
better than the current version, with the 100 view update limit, and
also better than the version without such limit or any backpressure.
More work is necessary to further stabilize the system. Namely, we want
to keep delaying until we see the backlog is decreasing. This will
require us to add more delay beyond the stabilization point, which in
turn should minimize the base write timeouts, and will also minimize the
amount of memory the backlog takes at each base replica.
Design document:
https://docs.google.com/document/d/1J6GeLBvN8_c3SbLVp8YsOXHcLc9nOLlRY7pC6MH3JWo
Fixes #2538
"
Reviewed-by: Nadav Har'El <nyh@scylladb.com>
* 'materialized-views/backpressure/v2' of https://github.com/duarten/scylla: (32 commits)
service/storage_proxy: Release mutation as early as possible
service/storage_proxy: Delay replica writes based on view update backlog
service/storage_proxy: Get the backlog of a particular base replica
service/storage_proxy: Add counters for delayed base writes
main: Start and stop the view_update_backlog_broker
service: Distribute a node's view update backlog
service: Advertise view update backlog over gossip
service/storage_proxy: Send view update backlog from replicas
service/storage_proxy: Prepare to receive replica view update backlog
service/storage_proxy: Expose local view update backlog
tests/view_schema_test: Add simple test for db::view::node_update_backlog
db/view: Introduce node_update_backlog class
db/hints: Initialize current backlog
database: Add counter for current view backlog
database: Expose current memory view update backlog
idl: Add db::view::update_backlog
db/view: Add view_update_backlog
database: Wait on view update semaphore for view building
service/storage_proxy: Use near-infinite timeouts for view updates
database: generate_and_propagate_view_updates no longer needs a timeout
...
(cherry picked from commit b66f59aa3d)
The "enable_sstables_mc_format" config item's help text says the item will be
removed before release. Since scylla-3.0 did not get enough mc format mileage, we
decided to leave it in, so the notice should be removed.
Fixes #4003.
Message-Id: <20181219082554.23923-1-avi@scylladb.com>
(cherry picked from commit dd51c659f7)
In some cases which I've yet to understand, build fails without libatomic.
We need to add it to the mock build machine.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181218154757.25236-1-nyh@scylladb.com>
build_rpm.sh uses "mock" to build an entire Scylla build environment,
which easily spans more than 15 gigabytes. mock, by default, puts this
build directory in a subdirectory of /var/lib/mock. There is no reason
why temporary build products need to be in the root directory: some machines
(like mine) don't have that much free space in the root directory, making it
impossible to use this script on such machines, and it's too easy
to leave temporary files there without noticing.
With this patch, the mock directories are put in build/mock/ instead of
/var/lib/mock.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181217195952.15154-1-nyh@scylladb.com>
"
Before, the reader was just ignoring such columns, but this creates a risk of data loss.
Refs #2598
"
* 'haaawk/2598/v3' of github.com:scylladb/seastar-dev:
sstables: Add test_sstable_reader_on_unknown_column
sstables: Exception on sstable's column not present in schema
sstables: store column name in column_translation::column_info
sstables: Make test_dropped_column_handling test dropped columns
(cherry picked from commit b0cb69ec25)
"
Previously we were checking for schema incompatibility between the current schema and the sstable
serialization header before reading any data. This isn't the best approach because
the data in the sstable may already be irrelevant, for example due to a column drop.
This patchset moves the check to after the actual data is read and verified to have
a timestamp new enough to classify it as non-obsolete.
Fixes #3924
"
* 'haaawk/3924/v3' of github.com:scylladb/seastar-dev:
sstables: Enable test_schema_change for MC format
sstables3: Throw error on schema mismatch only for live cells
sstables: Pass column_info to consume_*_column
sstables: Add schema_mismatch to column_info
sstables: Store column data type in column_info
sstables: Remove code duplication in column_translation
(cherry picked from commit 62ea153629)
This series adds a generic test for schema changes that generates
various schemas and data before and after an ALTER TABLE operation. It is
then used to check the correctness of mutation::upgrade() and sstable
readers, and led to the discovery of #3924 and #3925.
Fixes #3925.
* https://github.com/pdziepak/scylla.git schema-change-test/v3.1
schema_builder: make member function names less confusing
converting_mutation_partition_applier: fix collection type changes
converting_mutation_partition_applier: do not emit empty collections
sstable: use format() instead of sprint()
tests/random-utils: make functions and variables inline
tests: add models for schemas and data
tests: generate schema changes
tests/mutation: add test for schema changes
tests/sstable: add test for schema changes
(cherry picked from commit 564b328b2e)
"
Compression is not deterministic, so instead of binary-comparing the sstable files we just read the data back
and make sure everything that was written down is still present.
Tests: unit(release)
"
* 'haaawk/binary-compare-of-compressed-sstables/v3' of github.com:scylladb/seastar-dev:
sstables: Remove compressed parameter from get_write_test_path
sstables: Remove unused sstable test files
sstables: Ensure compare_sstables isn't used for compressed files
sstables: Don't binary compare compressed sstables
sstables: Remove debug printout from test_write_many_partitions
(cherry picked from commit 1ff6b8fb96)
"
Some additional issues were recently discovered, related to the recent
changes to the way inactive readers are evicted and to making shard readers
evictable.
One such issue is that the `querier_cache` is not prepared for the
querier to be immediately evicted by the reader concurrency semaphore,
when registered with it as an inactive read (#3987).
The other issue is that the multishard mutation query code was not
fully prepared for evicted shard readers being re-created, or failing
while being re-created (#3991).
This series fixes both of these issues and adds a unit test which covers
the second one. I am working on a unit test which would cover the first
issue, but it's proving to be a difficult one and I don't want to delay
the fixes for these issues any longer, as they also affect 3.0.
Fixes: #3987
Fixes: #3991
"
* 'evictable-reader-related-issues/branch-3.0/v1' of https://github.com/denesb/scylla:
multishard_mutation_query: reset failed readers to inexistent state
multishard_mutation_query: handle missing readers when dismantling
multishard_mutation_query: add support for keeping stats for discarded partitions
multishard_mutation_query: expect evicted reader state when creating reader
multishard_mutation_query: pretty-print the reader state in log messages
querier_cache: check that the query wasn't evicted during registering
reader_concurrency_semaphore: use the correct types in the constructor
reader_concurrency_semaphore: add consume_resources()
reader_concurrency_semaphore::inactive_read_handle: add operator bool()
When attempting to dismantle readers, some of the to-be-dismantled
readers might be in a failed state. The code waiting on the reader to
stop is expecting failures, however it didn't do anything besides
logging the failure and bumping a counter. Code in the lower layers did
not know how to deal with a failed reader and would trigger
`std::bad_variant_access` when trying to process (save or clean up) it.
To prevent this, reset the state of failed readers to `inexistent_state`
so code in the lower layers doesn't attempt to further process them.
(cherry picked from commit b4c3aab4a7)
When dismantling the combined buffer and the compaction state we are no
longer guaranteed to have the reader each partition originated from. The
reader might have been evicted and not resumed, or resuming it might
have failed. In any case we can no longer assume the originating reader
of each partition will be present. If a reader no longer exists,
discard the partitions that it emitted.
(cherry picked from commit 9cef043841)
In the next patches we will add code that will have to discard some of
the dismantled partitions/fragments/bytes. Prepare the
`dismantle_buffer_stats` struct for being able to track the discarded
partitions/fragments/bytes in addition to those that were successfully
dismantled.
(cherry picked from commit 438bef333b)
Previously readers were created only once, so `make_remote_reader()` had a
validation to ensure that creating a reader was not attempted more than
once. This validation was done by checking that the reader-state is
either `inexistent` or `successful_lookup`. However, with the
introduction of pausing shard readers, it is now possible that a reader
will have to be created and then re-created several times, and this
validation was not updated to expect that.
Update the validation so it also accepts the reader-state
`evicted`, the state the reader will be in if it was evicted while paused.
(cherry picked from commit ce52436af4)
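The relaxed validation can be sketched as follows (names are illustrative, not the actual multishard-query types):

```cpp
#include <cassert>

// With pausable shard readers, `evicted` is now a legal state when
// (re-)creating a reader; previously only the first two were accepted.
enum class reader_state { inexistent, successful_lookup, evicted, used };

bool may_create_reader(reader_state s) {
    return s == reader_state::inexistent
        || s == reader_state::successful_lookup
        || s == reader_state::evicted;   // newly accepted state
}
```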
The reader concurrency semaphore can evict the querier when it is
registered as an inactive read. Make the `querier_cache` aware of this
so that it doesn't continue to process the inserted querier when this
happens.
Also add a unit test for this.
(cherry picked from commit 5780f2ce7a)
Previously there was a type mismatch for `count` and `memory`, between
the actual type used to store them in the class (signed) and the type
of the parameters in the constructor (unsigned).
Although negative numbers are completely valid for these members,
initializing them to negative numbers doesn't make sense, which is why
the constructor used unsigned types. This restriction can backfire,
however, when someone intends to give these parameters the maximum
possible value, which, when interpreted as a signed value, will be `-1`.
What's worse, the caller might not even be aware of this unsigned->signed
conversion and be very surprised when they find out.
So to prevent surprises, expose the real type of these members, trusting
the clients to know what they are doing.
Also add a `no_limits` constructor, so clients don't have to make sure
they don't overflow internal types.
(cherry picked from commit e1d8237e6b)
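The pitfall and the fix can be sketched as follows (names are illustrative, not the actual semaphore interface):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Exposing the real (signed) member type in the constructor avoids the
// silent unsigned -> signed conversion that turned "maximum value" into -1.
struct resources {
    int64_t count;
    int64_t memory;
    resources(int64_t count, int64_t memory) : count(count), memory(memory) {}
    // A no_limits() constructor-equivalent, so callers don't have to
    // worry about overflowing the internal types themselves.
    static resources no_limits() {
        return resources(std::numeric_limits<int64_t>::max(),
                         std::numeric_limits<int64_t>::max());
    }
};
```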
The upgrade to node_exporter 0.17 commit
09c2b8b48a ("node_exporter_install: switch
to node_exporter 0.17") caused the service to no longer start. It turns out
node_exporter broke backwards compatibility of the command line between
0.15 and 0.16. Fix it up.
While fixing the command line, all the collectors that are enabled by
default were removed from the command line.
Fixes #3989
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
[ penberg@scylladb.com: edit commit message ]
Message-Id: <20181213114831.27216-1-amnon@scylladb.com>
(cherry picked from commit 571755e117)
* seastar 1651a2a...6700dc3 (3):
> build: link against libatomic
> core/semaphore: Allow combining semaphore_units()
> core/shared_ptr: Allow releasing a lw_shared_ptr to a non-const object
Fixes #3996.
This is a backport of CASSANDRA-11038.
Before this, a restarted node would be reported as a new node with a NEW_NODE
cql notification.
To fix, only send the NEW_NODE notification when the node was not part of
the cluster.
Fixes: #3979
Tests: pushed_notifications_test.py:TestPushedNotifications.restart_node_test
Message-Id: <453d750b98b5af510c4637db25b629f07dd90140.1544583244.git.asias@scylladb.com>
(cherry picked from commit 71c1681f6c)
Embedding the expire timer for a write response in the
abstract_write_response_handler simplifies the code as it allows
removing the rh_entry type.
It will also make the timeout easily accessible inside the handler,
for future patches.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20181213111818.39983-1-duarte@scylladb.com>
(cherry picked from commit f8878238ed)
"
This is a backport of CASSANDRA-8236.
Before this patch, scylla sends the node UP event to the cql client when it
sees a new node join the cluster, i.e., when a new node's status
becomes NORMAL. The problem is that, at this time, the cql server might not
be ready yet. Once the client receives the UP event, it tries to
connect to the new node's cql port and fails.
To fix, a new application_state::RPC_READY is introduced: a new node sets
RPC_READY to false when it starts gossip in the very beginning and sets
RPC_READY to true when the cql server is ready.
RPC_READY is a bad name but I think it is better to follow Cassandra.
Nodes with or without this patch are supposed to work together with no
problem.
Refs #3843
"
* 'asias/node_up_down.upstream.v4.1' of github.com:scylladb/seastar-dev:
storage_service: Use cql_ready facility
storage_service: Handle application_state::RPC_READY
storage_service: Add notify_cql_change
storage_service: Add debug log in notify_joined
storage_service: Add extra check in notify_joined
storage_service: Add notify_joined
storage_service: Add debug log in notify_up
storage_service: Add extra check in notify_up
storage_service: Add notify_up
storage_service: Make notify_left log debug level
storage_service: Introduce notify_left
storage_service: Add debug log in notify_down
storage_service: Introduce notify_down
storage_service: Add set_cql_ready
gossip: Add gossiper::is_cql_ready
gms: Add endpoint_state::is_cql_ready
gms: Add application_state::RPC_READY
gms: Introduce cql_ready in versioned_value
(cherry picked from commit a42b2895c2)
Introduced in 7e15e43.
Exposed by perf_fast_forward:
running: large-partition-skips on dataset large-part-ds1
Testing scanning large partition with skips.
Reads whole range interleaving reads with skips according to read-skip pattern:
read skip time (s) frags frag/s (...)
1 0 5.268780 8000000 1518378
1 1 31.695985 4000000 126199
Message-Id: <1544614272-21970-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 0a853b8866)
Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index
with a custom implementation. The only custom implementation that Cassandra
supports is SASI. But Scylla doesn't support SASI, or any other custom
index implementation. If a CREATE CUSTOM INDEX statement is used, we
shouldn't silently ignore the "CUSTOM" tag; we should generate an error.
This patch also includes a regression test that "CREATE CUSTOM INDEX"
statements with valid syntax fail (before this patch, they succeeded).
Fixes #3977
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181211224545.18349-2-nyh@scylladb.com>
(cherry picked from commit a0379209e6)
Different nodes can concurrently create the distributed system
keyspace on boot, before the "if not exists" clause can take effect.
However, the resulting schema mutations will be different since
different nodes use different timestamps. This patch forces the
timestamps to be the same across all nodes, so we avoid some schema
mismatches.
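The idea of forcing a fixed timestamp can be sketched as follows (illustrative only, not the real schema-mutation code):

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// When every node stamps the bootstrap schema mutation with the same
// fixed timestamp, concurrently created copies compare equal and no
// schema mismatch arises.
struct schema_mutation {
    std::string keyspace;
    int64_t timestamp;
    bool operator==(const schema_mutation& o) const {
        return std::tie(keyspace, timestamp) == std::tie(o.keyspace, o.timestamp);
    }
};

constexpr int64_t fixed_bootstrap_timestamp = 0;  // same constant on every node

schema_mutation make_system_keyspace_mutation() {
    // Previously the timestamp came from the local clock, differing per node.
    return {"system_distributed", fixed_bootstrap_timestamp};
}
```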
This fixes a bug exposed by ca5dfdf, whereby the initialization of the
distributed system keyspace is done before waiting for schema
agreement. While waiting for schema agreement in
storage_service::join_token_ring(), the node still hasn't joined the
ring and schemas can't be pulled from it, so nodes can deadlock. A
similar situation can happen between a seed node and a non-seed node,
where the seed node progresses to a different "wait for schema
agreement" barrier, but still can't make progress because it can't
pull the schema from the non-seed node still trying to join the ring.
Finally, it is assumed that changes to the schema of the current
distributed system keyspace tables will be protected by a cluster
feature and a subsequent schema synchronization, such that all nodes
will be at a point where schemas can be transferred around.
Fixes #3976
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20181211113407.20075-1-duarte@scylladb.com>
(cherry picked from commit 89ae3fbf11)