scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 20:46:56 +00:00

Author	SHA1	Message	Date
Glauber Costa	94e90d4a17	column_family: do not open code generation calculation We already have a function that wraps this, re-use it. This FIXME is still relevant, so just move it there. Let's not lose it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:05:47 -05:00
Glauber Costa	46fdeec60a	colum_family: remove mutation_count We use memory usage as a threshold these days, and nowhere is _mutation_count checked. Get rid of it. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-03-10 21:05:47 -05:00
Gleb Natapov	16135c2084	make initialization run in a thread While looking at initialization code I felt like my head is going to explode. Moving initialization into a thread makes things a little bit better. Only lightly tested. Message-Id: <20160310163142.GE28529@scylladb.com>	2016-03-10 17:42:05 +01:00
Gleb Natapov	176aa25d35	fix developer-mode parameter application on SMP I am almost sure we want to apply it once on each shard, and not multiple times on a single shard. Message-Id: <20160310155804.GB28529@scylladb.com>	2016-03-10 17:17:48 +01:00
Pekka Enberg	97bef4fb7c	build: Fix http/http_response_parser.hh dependency Make sure http_response.hh that is pulled by locator/ec2_snitch.hh is built. The commit is similar to what commit `6ccf8f8` ("build: make sure to ask seastar to build http/request_parser.hh, and depend on it") did for request_parser.hh. Fixes the following build error on CentOS: In file included from ./locator/ec2_multi_region_snitch.hh:41:0, from locator/ec2_multi_region_snitch.cc:39: ./locator/ec2_snitch.hh:24:40: fatal error: http/http_response_parser.hh: No such file or directory Spotted by Shlomi. Message-Id: <1457612266-315-1-git-send-email-penberg@scylladb.com>	2016-03-10 14:46:41 +01:00
Gleb Natapov	51ca3122cf	cleanup forward declaration for key types Message-Id: <20160310075138.GC6117@scylladb.com>	2016-03-10 10:52:19 +01:00
Pekka Enberg	5dd1fda6cf	main: Initialize system keyspace earlier We start services like gossiper before system keyspace is initialized which means we can start writing too early. Shuffle code so that system keyspace is initialized earlier. Refs #1014 Message-Id: <1457593758-9444-1-git-send-email-penberg@scylladb.com>	2016-03-10 10:39:27 +01:00
Pekka Enberg	f2f35a2f50	Merge "fix shutdown and improve logging" from Asias "Fixes #1005 and probably fixes #1013."	2016-03-10 08:21:48 +02:00
Asias He	a9ec752939	streaming: Reduce STREAM_MUTATION error logging There might be a large number of STREAM_MUTATION messages in flight. Log one error per column_family per range to avoid spamming the log.	2016-03-10 10:56:48 +08:00
Asias He	134b814cde	gossip: Log status info when stopping gossip	2016-03-10 10:56:48 +08:00
Asias He	7c4c99d7c7	streaming: Fix a log level in get_column_family_stores It is supposed to be debug level instead of info level.	2016-03-10 10:56:48 +08:00
Asias He	cb90ff2709	storage_service: Make decommission log info instead of debug level The log is just a few lines. It is very useful to tell which step fails in case of error when we do decommission.	2016-03-10 10:56:48 +08:00
Asias He	ed723665df	gossip: Do not stop gossip more than once If we do - Decommission a node - Stop a node we will shutdown gossip more than once in: - storage_service::decommission - storage_service::drain_on_shutdown Fix by checking if it is already stopped and back off if so.	2016-03-10 10:56:48 +08:00
Asias He	138c5f5834	storage_service: Do not stop messaging_service more than once If we do - Decommission a node - Stop a node we will shutdown messaging_service more than once in: - storage_service::decommission - storage_service::drain_on_shutdown Fixes #1005 Refs #1013 This fix a dtest failure in debug build. update_cluster_layout_tests.TestUpdateClusterLayout.simple_decommission_node_1_test/ /data/jenkins/workspace/urchin-dtest/label/monster/mode/debug/scylla/seastar/core/future.hh:802:35: runtime error: member call on null pointer of type 'struct future_state' core/future.hh:334:49: runtime error: member access within null pointer of type 'const struct future_state' ASAN:SIGSEGV ================================================================= ==4557==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x00000065923e bp 0x7fbf6ffac430 sp 0x7fbf6ffac420 T0) #0 0x65923d in future_state<>::available() const /data/jenkins/workspace/urchin-dtest/label/monster/mode/debug/scylla/seastar/core/future.hh:334 #1 0x41458f1 in future<>::available() /data/jenkins/workspace/urchin-dtest/label/monster/mode/debug/scylla/seastar/core/future.hh:802 #2 0x41458f1 in then_wrapped<parallel_for_each(Iterator, Iterator, Func&&)::<lambda(parallel_for_each_state&)> [with Iterator = std::__detail::_Node_iterator<std::pair<const net::msg_addr, net::messaging_service::shard_info>, false, true>; Func = net::messaging_service::stop()::<lambda(auto:39&)> [with auto:39 = std::unordered_map<net::msg_addr, net::messaging_service::shard_info, net::msg_addr::hash>]::<lambda(std::pair<const net::msg_addr, net::messaging_service::shard_info>&)>]::<lambda(future<>)>, future<> > /data/jenkins/workspace/urchin-dtest/label/monster/mode/debug/scylla/seastar/core/future.hh:878	2016-03-10 10:56:48 +08:00
Tomasz Grabiec	838a038cbd	log: Fix operator<<(std::ostream&, const std::exception_ptr&) An attempt to print std::nested_exception currently results in an exception leaking outside the printer. Fix by catching all exceptions in the final catch block. For nested exceptions, the logger will now print just "std::nested_exception". For nested exceptions specifically we should log more, but that is a separate problem to solve. Message-Id: <1457532215-7498-1-git-send-email-tgrabiec@scylladb.com>	2016-03-09 16:05:03 +02:00
Pekka Enberg	2566f8dc18	configure: Remove 'scylla_libs' variable It's not actually used by anyone so drop it. Message-Id: <1457531753-27891-2-git-send-email-penberg@scylladb.com>	2016-03-09 14:56:54 +01:00
Pekka Enberg	9bfb6a0c5b	configure: Add boost date_time library as a dependency It's needed to fix the debug build. Message-Id: <1457531753-27891-1-git-send-email-penberg@scylladb.com>	2016-03-09 14:56:51 +01:00
Takuya ASADA	0ab3d0fd52	dist: use SEASTAR_IO instead of SCYLLA_IO sync with iotune, fixes #1010 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1457530910-1273-1-git-send-email-syuu@scylladb.com>	2016-03-09 15:45:34 +02:00
Gleb Natapov	f242c6395c	storage_proxy: add counter for retries reads Message-Id: <20160309130453.GF2253@scylladb.com>	2016-03-09 14:09:42 +01:00
Pekka Enberg	ab502bcfa8	types: Implement to_string for timestamps and dates The to_string() function is used for logging purpose so use boost to_iso_extended_string() to format both timestamps and dates. Fixes #968 (showstopper) Message-Id: <1457528755-6164-1-git-send-email-penberg@scylladb.com>	2016-03-09 14:08:33 +01:00
Pekka Enberg	8eedaca948	Merge "streaming: handle cf is deleted" from Asias "Fixes #979 Fixes #976"	2016-03-09 14:52:27 +02:00
Asias He	3a4ea227d8	storage_service: Fix effective_ownership Now, get_ranges_for_endpoint will unwrap the first range. With t0 t1 t2 t3, the first range (t3,t0] will be split into (min,t0] and (t3,max]. Skipping the range (t3,max], we will get the correct ownership number as if the first range were not split. Fixes #928 Message-Id: <2e30ebd53f3dba3cc5e0cf36d5541c354b0e30ca.1457506704.git.asias@scylladb.com>	2016-03-09 13:26:01 +01:00
Asias He	d9ead889f3	streaming: Handle cf is deleted when sending STREAM_MUTATION_DONE In the preparation phase of streaming, we check that the remote node has all the cf_id which are needed for the entire streaming process, including the cf_id which the local node will send to the remote node and vice versa. So, at a later time, if the cf_id is missing, it must be that the cf_id is deleted. It is fine to ignore the no_such_column_family exception. In this patch, we change the code to ignore it at the server side to avoid sending the exception back, and so avoid handling the exception in an IDL-compatible way. One thing we can improve is that the sender might know the cf is deleted later than the receiver does. In this case, the sender will send some more mutations if we send the no_such_column_family back to the sender. However, since we do not throw exceptions in the receiver stream mutation handler, it will not cause a lot of overhead; the receiver will just ignore the mutation received. Fixes #979	2016-03-09 16:50:38 +08:00
Asias He	efa74dbae0	streaming: Do not send if the cf is deleted It is possible that a cf is deleted after we make the cf reader. Avoid sending them to avoid the unnecessary overhead to send them on the wire and the peer node to drop the received mutations.	2016-03-09 16:50:38 +08:00
Asias He	4abaacfc61	db: Introduce column_family_exists It is cheaper than throwing a no_such_column_family exception to test if a cf is gone, e.g., deleted.	2016-03-09 16:50:38 +08:00
Asias He	dca9e594cc	streaming: Remove the unused test code It is introduced in the early development of streaming. We have dtest for streaming now, drop it. Message-Id: <1457499303-21163-1-git-send-email-asias@scylladb.com>	2016-03-09 10:31:42 +02:00
Pekka Enberg	4f3d6977f1	Merge "Abort stream_session if peer is removed or restarted" from Asias "Hook streaming with gossip callback so we can abort the stream_session in such case: - a node is restarted - a node is removed from the cluster Fixes #1001."	2016-03-09 10:18:42 +02:00
Nadav Har'El	2f56577794	sstables: more efficient read of compressed data file Before this patch, reading large ranges from a compressed data file involved two inefficiencies: 1. The compressed data file was read one compressed chunk at a time. Such a chunk is around 30 KB in size, well below our desired sstable read-ahead size (sstable_buffer_size = 128 KB). 2. Because the compressed chunks have variable length (the uncompressed chunk has a fixed length) they are not aligned to disk blocks, so consecutive chunks have overlapping blocks which were unnecessarily read twice. The fix for both issues is to build the compressed_file_input_stream on an existing file_input_stream, instead of using direct file IO to read the individual chunks. file_input_stream takes care of doing the appropriate amount of read-ahead, and the compressed_file_input_stream layer does the decompression of the data read from the underlying layer. Fixes #992. Historical note: Implementing compressed_file_input_stream on top of file_input_stream was already tried in the past, and rejected. The problem at that time was that compressed_file_input_stream's constructor did not specify the end of the range to read, so that when we wanted to read only a small range we got too much read-ahead beyond the exactly one compressed chunk that we needed to read. Following the fix to issue #964, we now know on every streaming read also the intended end of the stream, so we can now use this to stop reading at the end of the last required chunk, even when we use a read-ahead buffer much larger than a chunk. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457304335-8507-1-git-send-email-nyh@scylladb.com>	2016-03-09 10:14:15 +02:00
Glauber Costa	8260b8fc6f	touch CF directories during startup We try to be robust against files disappearing (due to any kind of corruption) inside the data directory. But if the data directory itself goes missing, that's a situation that we don't handle correctly. We will keep accepting writes normally, but when we try to flush the memtable to disk, we'll fail with a system error. Having the CF directory disappearing is not a common thing. But it is also one that we can easily protect against, by touching all CF directories we know about on startup. Fixes #999 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <ed66373dccca11742150a6d08e21ece3980227d3.1457379853.git.glauber@scylladb.com>	2016-03-09 09:06:51 +02:00
Asias He	bf3507d093	messaging_service: Stop retrying if node is removed from gossip - Start a node - Inject data - Start another node to bootstrap - Before the second node finishes streaming, kill the second node - After a while the node will be removed from the cluster because it does not manage to join the cluster. - At this time, messaging_service might keep retrying the stream_mutations unnecessarily. To fix, check if the peer node is still a known node in the gossip.	2016-03-09 07:35:20 +08:00
Asias He	1f3928c321	streaming: Hook streaming with gossip callback If the peer node of a stream_session is restarted or removed we should abort the streaming. It is better to hook gossip callback in the stream manager than in each stream_session.	2016-03-09 07:35:20 +08:00
Glauber Costa	2cd756ae5e	repair: replace a magic number with another magic number In due time we will have to fix this, but as an interim step, let's use a "better" magic number. The problem with 100 is that as soon as the partitions start to get bigger, we're using too much memory. Since this is multiplied by the number of token ranges, and happens in every shard, the final number can become really big, and the amount of resources we use goes up proportionally. This means that even if we are mistaken about the new number (we probably are), in this case it is better to err on the side of a more conservative resource usage. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <97158f3db5734916cee4ccf12eaa66e7402570bb.1457448855.git.glauber@scylladb.com>	2016-03-08 17:29:00 +02:00
Nadav Har'El	b7e29691c2	sstables: avoid index and data file over-reads When we do a streaming read that knows the expected end position of the read, we can use a large read-ahead buffer, and at the same time, stop reading at exactly the intended end (or small rounding of it to the DMA block size) and not waste resources blindly reading a large amount of data after the end just to fill the read-ahead buffer. The sstable reading code, both for reading the data file and the index file, created a file input stream without specifying its end, thereby losing this optimization - so when a large buffer was used, we would get a large over-read. This patch fixes this, so sstable data file and index file are read using a file input stream which is aware of its end. Fixes #964. Note that this patch does not change the behavior when reading a compressed data file. For compressed read, we did not have the problem of over-read in the first place, because chunks are read one by one. But we do have other sources of inefficiencies there (stemming, again, from the fact that the compressed chunks are read one by one), and I opened a separate issue #992 for that. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1457219304-12680-1-git-send-email-nyh@scylladb.com>	2016-03-08 17:26:10 +02:00
Calle Wilund	8575f1391f	lists.cc: fix update insert of frozen list Fixes #967 Frozen lists are just atomic cells. However, old code inserted the frozen data directly as an atomic_cell_or_collection, which in turn meant it lacked the header data of a cell. When in turn it was handled by internal serialization (freeze), since the schema said it was not a (non-frozen) collection, we tried to look at frozen list data as cell header -> most likely considered dead. Message-Id: <1457432538-28836-1-git-send-email-calle@scylladb.com>	2016-03-08 13:48:45 +01:00
Pekka Enberg	81af486b69	Update scylla-ami submodule * dist/ami/files/scylla-ami d4a0e18...84bcd0d (1): > Add --ami parameter	2016-03-08 13:49:31 +02:00
Takuya ASADA	254b0fa676	dist: show message to use XFS for scylla data directory and also notify about developer mode, when iotune fails Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1457426286-15925-1-git-send-email-syuu@scylladb.com>	2016-03-08 12:20:33 +02:00
Pekka Enberg	83d82ea901	Merge "Fix Ubuntu package issues on AMI" from Takuya "This fixes bugs on Ubuntu package and AMI scripts, closes #991."	2016-03-08 11:51:30 +02:00
Takuya ASADA	18a27de3c8	dist: export all entries on /etc/default/scylla-server on Ubuntu Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-03-08 18:18:30 +09:00
Gleb Natapov	ce6d1a242a	storage_proxy: fix background_reads counter background_reads collectd counter was not always properly decremented. Fix it and streamline background read repair error handling. Message-Id: <20160307182255.GI4849@scylladb.com>	2016-03-07 19:41:09 +01:00
Yoav Kleinberger	1cd01cd2ab	tools/scyllatop: defend against curses "out of screen bounds" error Fixes issue #945 (hopefully) This issue was probably the result of trying to write outside the confines of the window. The views.Base class now defends against this. Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <9735806b211567f3239e187d87437c484f532291.1457265435.git.yoav@scylladb.com>	2016-03-07 18:02:26 +01:00
Raphael S. Carvalho	0f4239d63a	service: improve logging of storage_service::load_new_sstables Closes #952. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <2402f387c32d2d1221e740edb67e56c1593c1936.1457366098.git.raphaelsc@scylladb.com>	2016-03-07 18:01:52 +01:00
Raphael S. Carvalho	e850c1406e	sstables: update comment Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <8abc1c6c66ed8d3bb35ecfb6d8251de3f61a97ae.1457093016.git.raphaelsc@scylladb.com>	2016-03-07 17:36:34 +01:00
Raphael S. Carvalho	822759eee0	compaction_manager: update stat pending_tasks properly Size of both _cfs_to_cleanup and _cfs_to_compact must be added when calculating a new value to _stats.pending_tasks. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <b601e24d0631922798575f39d00fb54fe00d4971.1457093016.git.raphaelsc@scylladb.com>	2016-03-07 17:36:03 +01:00
Gleb Natapov	2d092bbd32	storage_proxy: send read requests with timeout No need to wait for replies long after request is timed out. Message-Id: <1457351304-28721-2-git-send-email-gleb@scylladb.com>	2016-03-07 14:00:11 +01:00
Gleb Natapov	4122422d19	storage_proxy: always wait for digest read resolver done future Currently it is waited upon only if a background read repair check is needed, and this causes an unhandled exception warning to be printed if it enters the failed state. Fix this by always waiting on it, but doing anything beyond ignoring an exception only if the check is needed. Message-Id: <1457351304-28721-1-git-send-email-gleb@scylladb.com>	2016-03-07 14:00:09 +01:00
Gleb Natapov	626c9d046b	fix EACH_QUORUM handling during bootstrapping Currently write acknowledgements handling does not take bootstrapping node into account for CL=EACH_QUORUM. The patch fixes it. Fixes #994 Message-Id: <20160307121620.GR2253@scylladb.com>	2016-03-07 13:56:34 +01:00
Raphael S. Carvalho	d65642cee8	fix storage_service::load_new_sstables() to not disable write permanently Avi says: "If an exception happens, then enable_sstable_writes won't be called." The problem is fixed by catching a possible exception and enabling sstable write for the relevant column family if it wasn't enabled already. Closes #953. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <32c1bcb2c60c7b9e5514eb0a95062f40ca92093a.1457119308.git.raphaelsc@scylladb.com>	2016-03-07 13:56:02 +01:00
Gleb Natapov	f59415b3c6	Take pending endpoints into account while checking for sufficient live nodes During bootstrapping additional copies of data have to be made to ensure that the CL level is met (see CASSANDRA-833 for details). Our code does that, but it does not take into account that the bootstrapping node can be dead, which may cause a request to proceed even though there are not enough live nodes for it to be completed. In such a case the request neither completes nor times out, so it appears to be stuck from the CQL layer POV. The patch fixes this by taking pending nodes into account while checking that there are enough live nodes for the operation to proceed. Fixes #965 Message-Id: <20160303165250.GG2253@scylladb.com>	2016-03-07 13:30:13 +01:00
Gleb Natapov	8dad399256	log: add space between log level and date in the output It was dropped by `6dc51027a3` Message-Id: <20160306125313.GI2253@scylladb.com>	2016-03-07 13:06:06 +01:00
Tomasz Grabiec	9deb036e4e	Merge branch 'dev/issue-845-set-incremental-backup-config-v1' from seastar-dev.git From Vlad: This series modifies the 'database' class to use the internal _enable_incremental_backups value (initialized with 'incremental_backups' configuration value) instead of using the 'incremental_backups' configuration value directly. Then we update this internal value in runtime from 'nodetool enable/disablebackup' API callback so that newly created keyspaces and column families use the newly configured incremental backup configuration.	2016-03-07 10:47:20 +01:00
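
The second inefficiency described in commit 2f56577794 (unaligned compressed chunks causing shared disk blocks to be read twice) can be illustrated with a little block arithmetic. The following is only a sketch under stated assumptions, not code from the repository: the 4096-byte block size, the chunk sizes, and the function names are all hypothetical, chosen to show why one contiguous read touches fewer blocks than a separate aligned read per chunk.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical disk block size; the commit message does not state one.
constexpr uint64_t block_size = 4096;

// Number of disk blocks touched by the byte range [begin, end).
uint64_t blocks_touched(uint64_t begin, uint64_t end) {
    assert(begin <= end);
    if (begin == end) {
        return 0;
    }
    uint64_t first = begin / block_size;
    uint64_t last = (end - 1) / block_size;
    return last - first + 1;
}

// Blocks read when every compressed chunk is fetched by its own
// block-aligned read: a block shared by two neighbouring chunks
// is counted (and read) once for each of them.
uint64_t blocks_read_per_chunk(const std::vector<uint64_t>& chunk_sizes) {
    uint64_t offset = 0;
    uint64_t total = 0;
    for (uint64_t size : chunk_sizes) {
        total += blocks_touched(offset, offset + size);
        offset += size;
    }
    return total;
}

// Blocks read when the same chunks are fetched as one contiguous
// range, as a read-ahead stream underneath the decompressor would do.
uint64_t blocks_read_contiguous(const std::vector<uint64_t>& chunk_sizes) {
    uint64_t total_bytes = 0;
    for (uint64_t size : chunk_sizes) {
        total_bytes += size;
    }
    return blocks_touched(0, total_bytes);
}
```

With four made-up chunks of about 30 KB each, for example {30000, 31000, 29500, 30500}, the per-chunk count exceeds the contiguous count by one block for every chunk boundary that falls inside a block, which is the duplicated I/O the commit message describes.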


11716 Commits