scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Author	SHA1	Message	Date
Lucas Meneghel Rodrigues	43d39d8b03	scylla_coredump_setup: Don't call yum on scylla server spec file The script scylla_coredump_setup was introduced in `9b4d0592`, and added to the scylla rpm spec file, as a post script. However, calling yum when there's one yum instance installing scylla server will cause a deadlock, since yum waits for the yum lock to be released, and the original yum process waits for the script to end. So let's remove this from the script. Debian shouldn't be affected, since it was never added to the debian build rules (to the best of my knowledge, after analyzing `9b4d0592`), hence I did not remove it. It should cause the same problem with apt-get in case it was used. CC: Takuya ASADA <syuu@scylladb.com> [ penberg: Rebase and drop statement about 'abrt' package not in Fedora. ] Signed-off-by: Lucas Meneghel Rodrigues <lmr@scylladb.com>	2015-12-30 09:38:36 +02:00
Nadav Har'El	ebebaa525d	repair: fix missing default values A default value was not set for the "incremental" and "parallelism" repair parameters, so Scylla can wrongly decide that they have an unsupported value. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2015-12-29 15:39:47 +02:00
Amnon Heiman	ec379649ea	API: repair to use documented params The repair API used to have an undocumented parameter list similar to origin. This patch changes the way repair gets its parameters. Instead of one undocumented string, it now lists all the different optional parameters in the swagger file and accepts them explicitly. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2015-12-29 15:38:44 +02:00
Amnon Heiman	f0d68e4161	main: start the http server in the first step This change sets the http server to start as the first step in the boot order. It is helpful if some other step takes a long time or gets stuck. Fixes #725 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2015-12-29 14:20:57 +02:00
Avi Kivity	c8b09a69a9	lsa: disable constant_time_size in binomial_heap implementation Corrupts heap on boost < 1.60, and not needed. Fixes #698.	2015-12-29 12:59:00 +01:00
Vlad Zolotarov	756de38a9d	database: actually check that a snapshot directory exists Actually check that a snapshot directory with a given tag exists instead of just checking that a 'snapshot' directory exists. Fixes issue #689 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-12-29 12:59:00 +01:00
Avi Kivity	41bd266ddd	db: provide more information on "Unrecognized error" while loading sstables This information can be used to understand the root cause of the failure. Refs #692.	2015-12-29 10:23:32 +02:00
Nadav Har'El	7247f055df	repair: partial support for some options Add partial support for the "incremental" option (only support the "false" setting, i.e., not incremental repair) and the "parallelism" option (the choice of sequential or parallel repair is ignored - we always use our own technique). This is needed because scylla-jmx passes these options by default (e.g., "incremental=false" is passed to say this is not incremental repair, and we just need to allow this and ignore it). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2015-12-29 09:38:09 +02:00
Nadav Har'El	3cfa39e1f0	repair: log repair options When throwing an "unsupported repair options" exception to the caller (such as "nodetool repair"), also list which options were not recognized. Additionally, list the options when logging the repair operation. This patch includes an operator<< implementation for pretty-printing an std::unordered_map. We may want to move it later to a more central location - even Seastar (like we have a pretty-printer for std::vector in core/sstring.hh). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2015-12-29 09:37:30 +02:00
Raphael S. Carvalho	b7d36af26f	compaction: fix max_purgeable calculation max_purgeable was being incorrectly calculated because the code that creates vector of uncompacted sstables was wrong. This value is used to determine whether or not a tombstone can be purged. Operand < is supposed to be used instead in the callback passed as third parameter to boost::set_difference. This fix is a step towards closing the issue #676. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2015-12-29 09:30:08 +02:00
Takuya ASADA	46767fcacf	dist: fix .rpm build error (File not found: scylla_extlinux_setup) Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2015-12-29 09:26:58 +02:00
Pekka Enberg	ca1f9f1c9a	main: Fix implicitly disabled client encryption options The start_native_transport() function in storage_service expects the 'enabled' option to be defined. If the option is not defined, it means that encryption is implicitly disabled. Fixes #718.	2015-12-28 16:24:49 +02:00
Pekka Enberg	a76b3a009b	Merge "use steady_clock where monotonic clock is required" from Vlad "The first patch in this series fixes the issue #638 in scylla. The second one fixes the tests to use the appropriate clock."	2015-12-28 13:35:50 +02:00
Avi Kivity	561bb79d22	Merge "CQL server SSL" from Calle "* Update scylla.conf section * Add SSL capability to cql server * Use conf and initiate optional SSL cql server in main/storage_service"	2015-12-28 12:55:25 +02:00
Avi Kivity	72cb8d4461	Merge "Messaging service TLS" from Calle "Adds support for TLS/SSL encrypted (and cert verified) connections for message service * Modify config option to match "native" style certificate management * Add SSL options to messaging service and generate SSL server/client endpoints when required * Add config option handling to init/main"	2015-12-28 12:54:28 +02:00
Calle Wilund	fae3bb7a24	storage_service: Set up CQL server as SSL if specified * Massage user options in main * Use them in storage_service, and if needed, load certificates etc and pass to transport/cql server. Conflicts: service/storage_service.cc	2015-12-28 10:13:48 +00:00
Calle Wilund	51d3990261	cql_server: Allow using SSL socket Optional credentials argument determines if SSL or normal server socket is created. Note: This does not follow the pattern of "socket as argument", simply because this is a distributed object, so only trivial or immutable objects should be passed to it.	2015-12-28 10:13:48 +00:00
Calle Wilund	d8b2581a07	scylla.conf: Update client_encryption_options with scylla syntax Using certificate+key directly	2015-12-28 10:13:48 +00:00
Calle Wilund	5f003f9284	scylla.conf: Modify server_encryption_options section Describe scylla version of option. Note, for test usage, the below should be workable: server_encryption_options: internode_encryption: all certificate: seastar/tests/test.crt truststore: seastar/tests/catest.pem keyfile: seastar/tests/test.key Since the seastar test suite contains a snakeoil cert + trust combo	2015-12-28 10:10:35 +00:00
Calle Wilund	70f293d82e	main/init: Use server_encryption_options * Reads server_encryption_options * Interpret the above, and load and initialize credentials and use with messaging service init if required	2015-12-28 10:10:35 +00:00
Calle Wilund	d1badfa108	messaging_service: Optionally create SSL endpoints * Accept port + credentials + option for what to encrypt * If set, enable a SSL listener at ssl_port * Check outgoing connections by IP to determine if they should go to SSL/normal endpoint Requires seastar RPC patch Note: currently, the connections created by messaging service do _not_ perform certificate name verification. While DNS lookup is probably not that expensive here, I am not 100% sure it is the desired behaviour. Normal trust is however verified.	2015-12-28 10:10:35 +00:00
Calle Wilund	1a9fb4ed7f	config: Modify/use server_encryption_options * Mark option used * Make sub-options adapted to seastar-tls useable values (i.e. x509) Syntax is now: server_encryption_options: internode_encryption: <none, all, dc, rack> certificate: <path-to-PEM-x509-cert> (default conf/scylla.crt) keyfile: <path-to-PEM-x509-key> (default conf/scylla.key) truststore: <path-to-PEM-trust-store-file> (default empty, use system trust)	2015-12-28 10:10:35 +00:00
Calle Wilund	b7baa4d1f5	config: clean up some style + move method to cc file	2015-12-28 10:10:35 +00:00
Takuya ASADA	fc29a341d2	dist: show usage and scylla-server status when login to AMI instance Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2015-12-28 11:40:34 +02:00
Avi Kivity	827a4d0010	Merge "streaming: Invalidate cache upon receiving of stream" from Asias "When a node gains or regains responsibility for certain token ranges, streaming will be performed. Upon receiving the stream data, the row cache is invalidated for that range. Refs #484."	2015-12-28 10:24:46 +02:00
Amnon Heiman	2c79fe1488	storage_service: describe_ring return full data The describe_ring method in storage_service did not report the start and end tokens. Also for rpc addresses that are not the local address, it returned the value representation (including the version) and not just the address. Fixes #695 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2015-12-28 09:56:12 +02:00
Takuya ASADA	0abcf5b3f3	dist: use readable time format on coredump file, instead of unix time Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2015-12-28 09:55:05 +02:00
Takuya ASADA	940c34b896	dist: don't abort scylla_coredump_setup when 'yum remove abrt' failed It always fails when abrt is not installed. This also fixes build_ami.sh failing because of this error. Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2015-12-28 09:40:57 +02:00
Vlad Zolotarov	0f8090d6c7	tests: use steady_clock where monotonic clock is required Use steady_clock instead of high_resolution_clock where monotonic clock is required. high_resolution_clock is essentially a system_clock (Wall Clock) and therefore may not be assumed monotonic, since Wall Clock may move backwards due to time/date adjustments. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-12-27 18:08:15 +02:00
Vlad Zolotarov	33552829b2	core: use steady_clock where monotonic clock is required Use steady_clock instead of high_resolution_clock where monotonic clock is required. high_resolution_clock is essentially a system_clock (Wall Clock) and therefore may not be assumed monotonic, since Wall Clock may move backwards due to time/date adjustments. Fixes issue #638 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>	2015-12-27 18:07:53 +02:00
Takuya ASADA	7f4a1567c6	dist: support non-ami boot parameter setup, add parameters for preallocate hugepages on boot-time Fixes #172 Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2015-12-27 17:56:49 +02:00
Takuya ASADA	6bf602e435	dist: setup ntpd on AMI Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2015-12-27 17:54:32 +02:00
Avi Kivity	2b22772e3c	Merge "Introduce keep alive timer for stream_session" from Asias "Fixes stream_session hangs: 1) if the sending node is gone, the receiving peer will wait forever 2) if the node which should send COMPLETE_MESSAGE to the peer node is gone, the peer node will wait forever"	2015-12-27 16:56:32 +02:00
Avi Kivity	f3980f1fad	Merge seastar upstream * seastar 51154f7...8b2171e (9): > memcached: avoid a collision of an expiration with time_point(-1). > tutorial: minor spelling corrections etc. > tutorial: expand semaphores section > Merge "Use steady_clock where monotonic clock is required" from Vlad > Merge "TLS fixes + RPC adaption" from Calle > do_with() optimization > tutorial: explain limiting parallelism using semaphores > submit_io: change pending flushes criteria > apps: remove defunct apps/seastar Adjust code to use steady_clock instead of high_resolution_clock.	2015-12-27 14:40:20 +02:00
Avi Kivity	0687d7401d	Merge "storage_service updates" from Asias " - Fix erase of new_replica_endpoints in get_changed_ranges_for_leaving - Introduce ring_delay_ms option "	2015-12-27 12:46:37 +02:00
Nadav Har'El	06f8dd4eb2	repair: job id must start at 1 This patch fixes a bug where the first run of "nodetool repair" always returned immediately, instead of waiting for the repair to complete. Repair operations are asynchronous: Starting a repair returns a numeric id, which can then be used to query for the repair's completion, and this is what "nodetool repair" does (through our JMX layer). We started with the repair ID "0", the next one is "1", and so on. The problem is that "nodetool repair", when it sees 0 being returned, treats it not as a regular repair ID, but rather as an answer that there is nothing to repair - printing a message to that effect and not waiting for the repair (which was correctly started) to complete. The trivial fix is to start our repair IDs at 1, instead of 0. We currently do not return 0 in any case (we don't know there is nothing to repair before we actually start the work, and parameter errors cause an exception, not a return of 0). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2015-12-27 12:42:26 +02:00
Avi Kivity	93aeedf403	Merge "Fixes for CentOS/RHEL support" from Takuya "Recent changes to the scripts cause errors on CentOS/RHEL; this patchset fixes them."	2015-12-27 12:21:29 +02:00
Glauber Costa	e299127e81	main: check if options file can be read. If we can't open the file, we will fail with a mysterious error. It is a customary scenario, though, since people who are unaware or have just forgotten about seastar's restriction of direct io access may put those files in tmpfs and other mount points. We have a direct_io check that is designed exactly for this purpose, so as to give the user a better error message. This patch makes use of it. Fixes #644 Signed-off-by: Glauber Costa <glauber@scylladb.com>	2015-12-27 12:20:40 +02:00
Asias He	f57ba6902b	storage_service: Introduce ring_delay_ms option It is hard-coded as 30 seconds at the moment. Usage: $ scylla --ring-delay-ms 5000 Time a node waits to hear from other nodes before joining the ring in milliseconds. Same as -Dcassandra.ring_delay_ms in cassandra.	2015-12-25 15:08:22 +08:00
Asias He	9c07ed8db6	storage_service: Fix erase new_replica_endpoints in get_changed_ranges_for_leaving We need to calculate begin() and end() in the loop since elements in new_replica_endpoints might be removed. Refs #700	2015-12-25 15:08:22 +08:00
Asias He	88846bc816	storage_service: Add more debug info in decommission It is useful to debug decommission issue.	2015-12-25 15:08:22 +08:00
Asias He	19f1875682	gossip: Print endpoint_state_map debug info in trace level This generates too many logs with debug level. Make it trace level.	2015-12-25 15:08:22 +08:00
Nadav Har'El	06ab43a7ee	murmur3 partitioner: fix midpoint() algorithm The midpoint() algorithm to find a token between two tokens doesn't work correctly in case of wraparound. The code tried to handle this case, but did it wrong. So this patch fixes the midpoint() algorithm, and adds clearer comments about why the fixed algorithm is correct. This patch also modifies two midpoint() tests in partitioner_test, which were incorrect - they verified that midpoint() returns some expected values, but expected values were wrong! We also add to the test a more fundamental test of midpoint() correctness, which doesn't check the midpoint against a known value (which is easy to get wrong, like indeed happened); rather we simply check that the midpoint is really inside the range (according to the token ordering operator). This simple test failed with the old implementation of midpoint() and passes with the new one. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2015-12-24 17:19:49 +02:00
Avi Kivity	3392f02b54	Merge "Make date parser more liberal" from Paweł "This series makes date and time parsing more liberal so that Scylla accepts the same date formats the origin does. Fixes #521."	2015-12-24 17:18:04 +02:00
Asias He	20c258f202	streaming: Fix session hang with maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE The problem is that we set the session state to WAIT_COMPLETE in send_complete_message's continuation, the peer node might send COMPLETE_MESSAGE before we run the continuation, thus we set the wrong status in COMPLETE_MESSAGE's handler and will not close the session. Before: GOT STREAM_MUTATION_DONE receive task_completed SEND COMPLETE_MESSAGE to 127.0.0.2:0 GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0 complete: PREPARING -> WAIT_COMPLETE GOT COMPLETE_MESSAGE Reply maybe_completed: WAIT_COMPLETE -> WAIT_COMPLETE After: GOT STREAM_MUTATION_DONE receive task_completed maybe_completed: PREPARING -> WAIT_COMPLETE SEND COMPLETE_MESSAGE to 127.0.0.2:0 GOT COMPLETE_MESSAGE, from=127.0.0.2, connecting=127.0.0.3, dst_cpu_id=0 complete: WAIT_COMPLETE -> COMPLETE Session with 127.0.0.2 is complete	2015-12-24 20:34:44 +08:00
Asias He	c971fad618	streaming: Introduce keep alive timer for each stream_session If the session is idle for 10 minutes, close the session. This can detect the following hangs: 1) if the sending node is gone, the receiving peer will wait forever 2) if the node which should send COMPLETE_MESSAGE to the peer node is gone, the peer node will wait forever Fixes simple_kill_streaming_node_while_bootstrapping_test.	2015-12-24 20:34:44 +08:00
Asias He	f527e07be6	streaming: Get stream_session in STREAM_MUTATION handler Get from address from cinfo. It is needed to figure out which stream session this mutation belongs to, since we need to update the keep alive timer for this stream session.	2015-12-24 20:34:44 +08:00
Asias He	d7a8c655a6	streaming: Print All sessions completed after state change message close_session will print "All sessions completed" message, print the state change message before that.	2015-12-24 20:34:44 +08:00
Asias He	bd276fd087	streaming: Increase retry timeout Currently, if the node is actually down, although the streaming_timeout is 10 seconds, the sending of the verb will return rpc_closed error immediately, so we give up in 20 * 5 = 100 seconds. After this change, we give up in 10 * 30 = 300 seconds at least, and 10 * (30 + 30) = 600 seconds at most.	2015-12-24 20:34:44 +08:00
Asias He	eaea09ee71	streaming: Retransmit COMPLETE_MESSAGE message It is a one-way message at the moment. If a COMPLETE_MESSAGE is lost, no one will close the session. The first step to fix the issue is to try to retransmit the message.	2015-12-24 20:34:44 +08:00
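
The wraparound case described in the midpoint() commit (06ab43a7ee) above can be illustrated with a small sketch. This is not Scylla's code: the names `midpoint`, `in_range`, `MIN_TOKEN` and `RING_SIZE` are hypothetical, and the sketch assumes tokens are signed 64-bit integers arranged on a ring. It also mirrors the kind of check the commit describes, testing that the midpoint lies inside the range rather than comparing it against a hand-computed value.

```python
# Illustrative sketch only; names and token layout are assumptions,
# not taken from the Scylla source.
MIN_TOKEN = -(2**63)   # assumed smallest signed 64-bit token
RING_SIZE = 2**64      # assumed number of positions on the ring


def midpoint(left: int, right: int) -> int:
    """Return a token halfway along the clockwise path from left to right."""
    # Clockwise distance; the modulo makes this correct when the range
    # wraps around the end of the ring (left > right).
    distance = (right - left) % RING_SIZE
    mid = left + distance // 2
    # Fold the result back into the signed 64-bit token range.
    return (mid - MIN_TOKEN) % RING_SIZE + MIN_TOKEN


def in_range(left: int, token: int, right: int) -> bool:
    """True if token lies on the clockwise path from left to right (inclusive)."""
    return (token - left) % RING_SIZE <= (right - left) % RING_SIZE


# A non-wrapping range and a range that wraps past the largest token.
plain = midpoint(0, 10)
wrapped = midpoint(2**63 - 5, MIN_TOKEN + 5)
print(plain, in_range(0, plain, 10))
print(wrapped, in_range(2**63 - 5, wrapped, MIN_TOKEN + 5))
```

Checking membership in the range, instead of a fixed expected value, keeps the test valid even if the exact rounding of the midpoint changes.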


7811 Commits