scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-27 11:55:15 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	ec1fd3945f	Revert "config: adjust boost::program_options validator to work with db::string_map" This reverts commit `653e250d04`. Compilation is broken with this patch: [155/264] CXX build/release/db/config.o FAILED: g++ -MMD -MT build/release/db/config.o -MF build/release/db/config.o.d -std=gnu++1y -g -Wall -Werror -fvisibility=hidden -pthread -I/home/shlomi/scylla/seastar -I/home/shlomi/scylla/seastar/build/release/gen -march=nehalem -Wno-overloaded-virtual -DHAVE_HWLOC -DHAVE_NUMA -O2 -I/usr/include/jsoncpp/ -Wno-maybe-uninitialized -DHAVE_LIBSYSTEMD=1 -I. -I build/release/gen -I seastar -I seastar/build/release/gen -c -o build/release/db/config.o db/config.cc db/config.cc:57:13: error: ‘void db::validate(boost::any&, const std::vector<std::__cxx11::basic_string<char> >&, db::string_map*, int)’ defined but not used [-Werror=unused-function] static void validate(boost::any& out, const std::vector<std::string>& in, ^ cc1plus: all warnings being treated as errors This branch doesn't have commits which introduce the problem which this patch fixes, so let's just revert it.	2016-06-08 11:05:47 +02:00
Gleb Natapov	653e250d04	config: adjust boost::program_options validator to work with db::string_map Fixes #1320 Message-Id: <20160607064511.GX9939@scylladb.com> (cherry picked from commit `9635e67a84`)	2016-06-07 10:43:30 +03:00
Amnon Heiman	6255076c20	rate_moving_average: mean_rate is not initialized The rate_moving_average is used by timed_rate_moving_average to return its internal values. If there are no timed events, the mean_rate is not properly initialized. To solve that the mean_rate is now initialized to 0 in the structure definition. Refs #1306 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1465231006-7081-1-git-send-email-amnon@scylladb.com> (cherry picked from commit `2cf882c365`)	2016-06-07 09:44:26 +03:00
Pekka Enberg	420ebe28fd	release: prepare for 1.2.rc2	2016-06-06 16:17:26 +03:00
Avi Kivity	a6179476c5	Be more conservative when deciding when to shut down due to disk errors Currently we only shut down on EIO. Expand this to shut down on any system_error. This may cause us to shut down prematurely due to a transient error, but this is better than not shutting down due to a permanent error (such as ENOSPC or EPERM). We may whitelist certain errors in the future to improve the behavior. Fixes #1311. Message-Id: <1465136956-1352-1-git-send-email-avi@scylladb.com> (cherry picked from commit `961e80ab74`)	2016-06-06 16:15:25 +03:00
Raphael S. Carvalho	342726a23c	compaction: leveled: improve log message for overlapping table Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <2dcbe3c8131f1d88a3536daa0b6cdd25c6e41d76.1464883077.git.raphaelsc@scylladb.com> (cherry picked from commit `17b56eb459`)	2016-06-06 16:13:40 +03:00
Raphael S. Carvalho	e9946032f4	compaction: disable parallel compaction for leveled strategy It was discussed that leveled strategy may not benefit from parallel compaction feature because almost all compaction jobs will have similar size. It was also found that leveled strategy wasn't working correctly with it because two overlapping sstables (targeting the same level) could be created in parallel by two ongoing compactions. Fixes #1293. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <60fe165d611c0283ca203c6d3aa2662ab091e363.1464883077.git.raphaelsc@scylladb.com> (cherry picked from commit `588ce915d6`)	2016-06-06 16:13:36 +03:00
Pekka Enberg	5e0b113732	Update scylla-ami submodule * dist/ami/files/scylla-ami 72ae258...863cc45 (3): > Move --cpuset/--smp parameter settings from scylla_sysconfig_setup to scylla_ami_setup > convert scylla_install_ami to bash script > 'sh -x -e' is not valid since all scripts converted to bash script, so remove them	2016-06-06 13:38:53 +03:00
Asias He	c70faa4f23	streaming: Reduce memory usage when sending mutations Limit disk bandwidth to 5MB/s to emulate a slow disk: echo "8:0 5000000" > /cgroup/blkio/limit/blkio.throttle.write_bps_device echo "8:0 5000000" > /cgroup/blkio/limit/blkio.throttle.read_bps_device Start scylla node 1 with low memory: scylla -c 1 -m 128M --auto-bootstrap false Run c-s: taskset -c 7 cassandra-stress write duration=5m cl=ONE -schema 'replication(factor=1)' -pop seq=1..100000 -rate threads=20 limit=2000/s -node 127.0.0.1 Start scylla node 2 with low memory: scylla -c 1 -m 128M --auto-bootstrap true Without this patch, I saw std::bad_alloc during streaming ERROR 2016-06-01 14:31:00,196 [shard 0] storage_proxy - exception during mutation write to 127.0.0.1: std::bad_alloc (std::bad_alloc) ... ERROR 2016-06-01 14:31:10,172 [shard 0] database - failed to move memtable to cache: std::bad_alloc (std::bad_alloc) ... To fix: 1. Apply the streaming mutation limiter before we read the mutation into memory to avoid wasting memory holding the mutation which we can not send. 2. Reduce the parallelism of sending streaming mutations. Before, we sent each range in parallel; after, we send each range one by one. before: nr_vnode * nr_shard * (send_info + cf.make_reader memory usage) after: nr_shard * (send_info + cf.make_reader memory usage) We can at least save memory usage by a factor of nr_vnode, 256 by default. In my setup, fix 1) alone is not enough, with both fix 1) and 2), I saw no std::bad_alloc. Also, I did not see streaming bandwidth drop due to 2). In addition, I tested grow_cluster_test.py:GrowClusterTest.test_grow_3_to_4, as described: https://github.com/scylladb/scylla/issues/1270#issuecomment-222585375 With this patch, I saw no std::bad_alloc any more. Fixes: #1270 Message-Id: <7703cf7a9db40e53a87f0f7b5acbb03fff2daf43.1464785542.git.asias@scylladb.com> (cherry picked from commit `206955e47c`)	2016-06-02 11:18:59 +03:00
Gleb Natapov	15ad4c9033	storage_proxy: drop debug output Message-Id: <20160601132641.GK2381@scylladb.com> (cherry picked from commit `26b50eb8f4`)	2016-06-01 17:14:32 +03:00
Pekka Enberg	d094329b6e	Revert "Revert "main: change order between storage service and drain execution during exit"" This reverts commit `b3ed55be1d`. The issue is in the failing dtest, not this commit. Gleb writes: "The bug is in the test, not the patch. Test waits for repair session to end one way or the other when node is killed, but for nodetool to know if repair is completed it needs to poll for it. If node dies before nodetool managed to see repair completion it will stuck forever since jmx is alive, but does not provide answers any more. The patch changes timing, repair is completed much close to exit now, so problem appears, but it may happen even without the patch. The fix is for dtest to kill jmx as part of killing a node operation." Now that Lucas fixed the problem in scylla-ccm, revert the revert. (cherry picked from commit `0255318bf3`) scylla-1.2-rc1	2016-06-01 08:51:51 +03:00
Pekka Enberg	dcab915f21	release: prepare for 1.2.rc1	2016-05-30 13:14:38 +03:00
Pekka Enberg	b3ed55be1d	Revert "main: change order between storage service and drain execution during exit" This reverts commit `0ebd8b18b7`. The change breaks repair_additional_test.py:RepairAdditionalTest.repair_kill_1_test	2016-05-30 12:48:09 +03:00
Avi Kivity	e515933c70	dist: tune scheduler for lower latency Scylla-jmx and collectd can preempt scylla and induce long latencies. Tune the scheduler to provide lower latencies. Since when the support processes are not running we normally do not context switch (one thread per core, remember?), there should be no effect on throughput. The tunings are provided in a separate package, which can be uninstalled if the server is shared with other applications which are negatively affected by the tuning. Fixes #1218. Message-Id: <1464529625-12825-1-git-send-email-avi@scylladb.com>	2016-05-30 08:42:19 +03:00
Avi Kivity	e8e00338d1	config: document defragment_memory_on_idle Message-Id: <1464261650-14136-2-git-send-email-avi@scylladb.com>	2016-05-30 08:39:26 +03:00
Avi Kivity	b50cb3eca8	config: rename compact_on_idle compact_on_idle will lead users to thinking we're talking about sstable compaction, not log-structured-allocator compaction. Rename the variable to reduce the probability of confusion. Message-Id: <1464261650-14136-1-git-send-email-avi@scylladb.com>	2016-05-30 08:39:13 +03:00
Yoav Kleinberger	e580ac5dae	docker: fix Ubuntu Dockerfile one needs to update the repository info before one can install packages. Fixes issue #1296. Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <a906e76d584baff5988cb31a4003de27455e0741.1464529740.git.yoav@scylladb.com>	2016-05-29 17:00:25 +03:00
Avi Kivity	3f6ecb9f28	Merge "cancel cross DC read repair if non matching data was recently modified" from Gleb	2016-05-29 15:58:55 +03:00
Gleb Natapov	2efbccc901	storage_proxy: do only local read repair if non matching data was recently modified When a read and a write to a partition happen in parallel, the reader may detect a digest mismatch that could trigger a cross DC read repair attempt, but the repair is not really needed, so the added latency is not justified. This patch tries to prevent such parallel access from causing a heavy cross DC repair operation by checking the timestamp of the most recent modification. If the modification happened less than "write timeout" seconds ago, the patch assumes that the read operation raced with the write and cancels the cross DC repair, but only if CL is LOCAL_*.	2016-05-29 15:26:51 +03:00
Amnon Heiman	d4123ba613	API: column_family count sstable space used correctly The space calculation counters in column family had two problems: 1. The total bytes is an ever growing counter, which is meaningless for the API. 2. Trying to simply sum the size on all shards, ignores the fact that the same sstable file can be referenced by multiple shards, this is especially noticeable during migration time. To solve this, the implementation was modified so instead of collecting the sizes, the API would collect a map of file name to size and then would do the summing. This removes the duplications and fixes the total bytes calculation. Calling cfstats before the change with load after a compaction happened: $ nodetool cfstats keyspace1 Keyspace: keyspace1 Verify write latency 1068253.0 76435 Read Count: 75915 Read Latency: 0.5953986037015082 ms. Write Count: 76435 Write Latency: 0.013975966507490025 ms. Pending Flushes: 0 Table: standard1 SSTable count: 5 Space used (live): 44261215 Space used (total): 219724478 After the fix: $ nodetool cfstats keyspace1 Keyspace: keyspace1 Verify write latency 1863206.0 124219 Read Count: 125401 Read Latency: 0.9381053978835895 ms. Write Count: 124219 Write Latency: 0.01499936402643718 ms. Pending Flushes: 0 Table: standard1 SSTable count: 6 Space used (live): 50402904 Space used (total): 50402904 Space used by snapshots (total): 0 Fixes: #1042 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1464518757-14666-2-git-send-email-amnon@scylladb.com>	2016-05-29 14:11:03 +03:00
Gleb Natapov	32c9a06faf	messaging_service: abort retrying send during exit Fixes #862 Message-Id: <1463579574-15789-3-git-send-email-gleb@scylladb.com>	2016-05-29 11:39:36 +03:00
Gleb Natapov	0ebd8b18b7	main: change order between storage service and drain execution during exit Even the comment says drain_on_shutdown should be called first, but for that it has to be registered last. Fixes #862 Message-Id: <1463579574-15789-2-git-send-email-gleb@scylladb.com>	2016-05-29 11:39:24 +03:00
Glauber Costa	30d54cef38	database: add a comment explaining the choice of function in CF stop We have recently committed a fix to a broken streaming bug that involved reverting column_family::stop() back to calling the custom seal functions explicitly for both memtables and streaming memtables. We here add a comment to explain why that had to be done. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <fe94b5883e9c29adc7fc9ee9f498894c057e7b64.1464293167.git.glauber@scylladb.com>	2016-05-29 11:28:15 +03:00
Avi Kivity	8e124b31aa	Merge "gossip: Refactor waiting for supported features" from Duarte "This patch changes the way we wait for supported features. We no longer sleep periodically, waking up to check if the wanted features are now available. Instead, we register waiters in a condition variable that is signaled whenever new endpoint information is received. We also add a new poll interface based on the feature class, which encapsulates the availability of a cluster feature."	2016-05-27 20:24:25 +03:00
Duarte Nunes	f613dabf53	gossip: Introduce the gms::feature class This class encapsulates the waiting for a cluster feature. A feature object is registered with the gossiper, which is responsible for later marking it as enabled. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-05-27 17:20:51 +00:00
Duarte Nunes	4684b8ecbb	gossip: Refactor waiting for features This patch changes the sleep-based mechanism of detecting new features by instead registering waiters with a condition variable that is signaled whenever a new endpoint information is received. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-05-27 17:20:51 +00:00
Duarte Nunes	422f244172	gossip: Don't timeout when waiting for features This patch removes the timeout when waiting for features, since future patches will make this argument unnecessary. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-05-27 17:20:51 +00:00
Avi Kivity	fab4cc8d6d	Merge seastar upstream * seastar 8bfbb1a...0bcdd28 (1): > Merge "introduce sleep_abortable() that throws exception on application exit" from Gleb	2016-05-27 20:14:49 +03:00
Duarte Nunes	b3011c9039	gossip: Rename set_heart_beat_state ...to set_heart_beat_state_and_update_timestamp in order to make it explicit for callers the update_timestamp is also changed. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1464309023-3254-3-git-send-email-duarte@scylladb.com>	2016-05-27 09:11:39 +03:00
Duarte Nunes	8c0e2e05b7	gossip: Fix modification to shadow endpoint state This patch fixes an inadvertent change to the shadow endpoint state map in gossiper::run, done by calling get_heart_beat_state() which also updates the endpoint state's timestamp. This did not happen for the normal map, but did happen for the shadow map. As a result, every time gossiper::run() was scheduled, endpoint_map_changed would always be true and all the shards would make superfluous copies of the endpoint state maps. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1464309023-3254-2-git-send-email-duarte@scylladb.com>	2016-05-27 09:10:38 +03:00
Pekka Enberg	b7e79b72d5	Merge "Introduce SET_NIC for non-AMI environment" from Takuya "This patchset provides a way to enable SET_NIC(posix_net_conf.sh) on non-AMI environment. Also support -mq option of the script. This also contains number of bug fixes of scripts. Fixes #1192"	2016-05-26 13:37:06 +03:00
Yoav Kleinberger	26c0d86401	tools/scyllatop: improved user interface: scrollable views NOTE: scyllatop now requires the urwid library. Previously, if there were more metrics than lines in the terminal window, the user could not see some of the metrics. Now the user can scroll. As an added bonus, the program will not crash when the window size changes. Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <1464098832-5755-1-git-send-email-yoav@scylladb.com>	2016-05-26 13:36:28 +03:00
Piotr Jastrzebski	136b8148d2	Use idle CPU to compact LSA memory Register an idle CPU handler that compacts a single segment every time there's nothing better to execute on CPU. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <c26aa608a1e0752fb9e6db1833ef3ba1de95f161.1464169748.git.piotr@scylladb.com>	2016-05-26 12:43:53 +03:00
Avi Kivity	d7f36a093f	Merge seastar upstream * seastar e5faea8...8bfbb1a (1): > reactor: advertise the logging_failures metric as a DERIVE counter Fixes #1292.	2016-05-26 11:46:08 +03:00
Tomasz Grabiec	f0c2b1d161	config: Fix typos Message-Id: <1464201938-4778-1-git-send-email-tgrabiec@scylladb.com>	2016-05-26 08:19:57 +03:00
Asias He	f1b3cb4a08	storage_service: Catch and fail an invalid configuration with --replace-address Vlad reported a strange user configuration: SCYLLA_ARGS="--log-to-syslog 1 --log-to-stdout 0 --default-log-level info --collectd-address=127.0.0.1:25826 --collectd=1 --collectd-poll-period 60000 --network-stack posix --num-io-queues 32 --max-io-requests 128 --replace-address 10.0.4.131" seed_provider: - class_name: org.apache.cassandra.locator.SimpleSeedProvider parameters: - seeds: "10.0.4.131" Meanwhile, 10.0.4.131 is the IP address of the node itself. When the node was started, the following messages were reported. Apr 13 06:31:12 n0 scylla[19681]: [shard 0] gossip - Connect seeds again ... (20 seconds passed) Apr 13 06:31:13 n0 scylla[19681]: [shard 0] gossip - Connect seeds again ... (21 seconds passed) Apr 13 06:31:14 n0 scylla[19681]: [shard 0] gossip - Connect seeds again ... (22 seconds passed) Apr 13 06:31:15 n0 scylla[19681]: [shard 0] gossip - Connect seeds again ... (23 seconds passed) The configuration is invalid, because for --replace-address to work, at least one working seed node should be alive. Catch the configuration error and fail it with an appropriate error message. Fixes #1183 Message-Id: <a94a082d896313e7a668915ae21fe2c03719da3a.1464164058.git.asias@scylladb.com>	2016-05-25 14:42:19 +03:00
Asias He	fed1e65e1e	gossip: Do not insert the same node into _live_endpoints_just_added _live_endpoints_just_added tracks the peer nodes which have just become live. When a down node gets back, the peer nodes can receive multiple messages which would mark the node up, e.g., the messages piled up in the sender's tcp stack, after a node was blocked with gdb and released. Each such message will trigger an echo message and when the reply of the echo message is received (real_mark_alive), the same node will be added to _live_endpoints_just_added more than once. Thus, we see the same node being favored more than once: INFO 2016-04-12 12:09:57,399 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 INFO 2016-04-12 12:09:58,412 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 INFO 2016-04-12 12:09:59,429 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 INFO 2016-04-12 12:10:00,429 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 INFO 2016-04-12 12:10:01,430 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 INFO 2016-04-12 12:10:02,442 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 INFO 2016-04-12 12:10:03,454 [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 To fix, do not insert the node if it is already in _live_endpoints_just_added. Fixes #1178 Message-Id: <6bcfad4430fbc63b4a8c40ec86a2744bdfafb40f.1464161975.git.asias@scylladb.com>	2016-05-25 14:19:40 +03:00
Glauber Costa	46f60f52d9	database: do not use implicitly stated seal function when closing the CF In commit `4981362f57`, I have introduced a regression that was thankfully caught by our dtest infrastructure. That patch is a preparation patch for the active reclaim patchset that is to come, and it consolidated all the flushes using the memtable_list's seal_fn function instead of calling the seal function explicitly. The problem here is that the streaming memtables have the delayed mechanism, about which the memtable_list is unaware. Calling memtable_list's seal_active_memtable() for the streaming memtables calls the delayed version, that does not guarantee flush. If we're lucky, we will indeed flush after the timer expires, but if we're not we'll just stop the CF with data not flushed. There are two options to fix this: the first is to teach the memtable_list about the delayed/forced mechanism, and the second is to just call the correct function explicitly during shutdown, and then when the time comes to add continuations to the result of the seal, add them here as well. Although the second option involves a bit more work and duplication, I think it is better in the sense that the delayed / forced mechanism really is something that belongs to streaming only. This being the only user, I don't think it justifies complicating the memtable_list with this concept. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <b26017c825ccf585f39f58c4ab3787d78e551f5f.1464126884.git.glauber@scylladb.com>	2016-05-25 08:21:24 +03:00
Avi Kivity	2d4d6c9c92	Merge seastar upstream * seastar aed893e...e5faea8 (5): > Catch exceptions thrown by idle cpu handler > core::gate: add a get_count() method > reactor: Introduce idle CPU handler > core: add missing header for g++-4.9 > Add lksctp-tools-devel do required packages	2016-05-24 20:42:41 +03:00
Pekka Enberg	ceb29f9d32	Merge "Introduce upload dir for sstable migration" from Raphael "This change is intended to make migration process safer and easier. All column families will now have a directory called upload. With this feature, users may choose to copy migrated sstables to upload directory of respective column families, and run 'nodetool refresh'. That's supposed to be the preferred option from now on."	2016-05-24 16:36:47 +03:00
Gleb Natapov	7f6b12c97a	query: add user provided timestamp to read_command If read query supplies timestamp move it to read_command to be used later otherwise get local timestamp.	2016-05-24 15:19:35 +03:00
Pekka Enberg	d7d8c76fe5	transport/server: Add CQL frame LZ4 compression support The default CQL frame compression algorithm in Cassandra is LZ4. Add support for decompressing incoming frames and compressing outgoing frames with LZ4 if the CQL driver asks for that. Fixes #416 Message-Id: <1464086807-11325-1-git-send-email-penberg@scylladb.com>	2016-05-24 15:03:33 +03:00
Takuya ASADA	53cebb4a5e	dist/ubuntu: don't rebuild dependency packages by default Same as CentOS, do not build dependencies by default, install binary packages from our repository. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1464023451-21436-1-git-send-email-syuu@scylladb.com>	2016-05-24 14:10:59 +03:00
Gleb Natapov	12cf60c302	messaging_service: add timestamp of last modification to READ_DIGEST verb return value	2016-05-24 13:27:34 +03:00
Gleb Natapov	1e6f64f4ab	query: add latest modification timestamp to result structure	2016-05-24 13:27:34 +03:00
Gleb Natapov	5fef0717cc	query: find latest modification timestamp while calculating result digest	2016-05-24 13:27:34 +03:00
Avi Kivity	9637c2232c	Merge "Move the JMX timer polling logic to Scylla" from Amnon	2016-05-24 13:07:52 +03:00
Raphael S. Carvalho	c2fa3b796d	db: fix read consistency after refresh If sstable loaded by refresh covers a row that is cached by the column family, read query may fail to return consistent data. What we should do is to clear cache for the column family being loaded with new sstables. Fixes #1212. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <a08c9885a5ceb0b2991e40337acf5b7679580a66.1464072720.git.raphaelsc@scylladb.com>	2016-05-24 12:11:41 +03:00
Takuya ASADA	5d5d525a14	dist/ubuntu: fix incorrect dependency package name PyYAML is CentOS/RHEL/Fedora package name, python-yaml is correct one. Fixes #1279 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1463987823-22837-1-git-send-email-syuu@scylladb.com>	2016-05-23 10:21:29 +03:00
Pekka Enberg	8a7197e390	dist/docker: Fetch RPM repository from Scylla web site Fix the hard-coded Scylla RPM repository by downloading it from Scylla web site. This makes it easier to switch between different versions. Message-Id: <1463981271-25231-1-git-send-email-penberg@scylladb.com>	2016-05-23 09:45:41 +03:00
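
The accounting change described in commit d4123ba613 (collect a map of file name to size across shards, then sum) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Scylla's implementation: `shard_report`, `total_space_used`, `example_space_used`, and the file names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-shard report: sstable file name -> size in bytes.
using shard_report = std::unordered_map<std::string, std::uint64_t>;

// Merge all per-shard reports into one map keyed by file name, so that a
// file referenced by several shards is counted once, then sum the sizes.
std::uint64_t total_space_used(const std::vector<shard_report>& shards) {
    std::unordered_map<std::string, std::uint64_t> by_file;
    for (const auto& shard : shards) {
        for (const auto& entry : shard) {
            // emplace() leaves the map unchanged if the name is already present.
            by_file.emplace(entry.first, entry.second);
        }
    }
    std::uint64_t total = 0;
    for (const auto& entry : by_file) {
        total += entry.second;
    }
    return total;
}

// Usage: both shards reference b-2-Data.db, as can happen during migration.
inline void example_space_used() {
    shard_report shard0{{"a-1-Data.db", 10}, {"b-2-Data.db", 20}};
    shard_report shard1{{"b-2-Data.db", 20}, {"c-3-Data.db", 5}};
    assert(total_space_used({shard0, shard1}) == 35);  // a naive per-shard sum gives 55
}
```

Keying on the file name is what removes the double counting; a plain sum over shards would report the shared file twice.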
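
The fix described in commit fed1e65e1e (do not insert a node that is already in `_live_endpoints_just_added`) amounts to a membership check before appending. The sketch below assumes a plain vector of string addresses; `just_added_list` and its `add` method are hypothetical names, and the real gossiper uses its own endpoint type.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the list of nodes that were just marked alive.
struct just_added_list {
    std::vector<std::string> endpoints;

    // Append the endpoint only if it is not already queued.
    // Returns true when the endpoint was inserted.
    bool add(const std::string& ep) {
        if (std::find(endpoints.begin(), endpoints.end(), ep) != endpoints.end()) {
            return false;  // duplicate echo reply: nothing to do
        }
        endpoints.push_back(ep);
        return true;
    }
};

// Usage: several echo replies for the same node queue it only once.
inline void example_just_added() {
    just_added_list list;
    assert(list.add("127.0.0.2"));
    assert(!list.add("127.0.0.2"));
    assert(list.endpoints.size() == 1);
}
```

A linear scan keeps insertion order, which matters if the list is consumed front to back; whether the real code relies on that ordering is an assumption here.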
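
The decision rule in commit 2efbccc901 (skip cross DC read repair when the mismatching data was modified within the write timeout and the CL is LOCAL_*) can be written as a small predicate. This is only a sketch of the rule as the commit message states it: the enum, the function names, and the millisecond units are assumptions, not the storage_proxy code.

```cpp
#include <chrono>

// Hypothetical subset of consistency levels, enough to illustrate the rule.
enum class consistency_level { one, quorum, local_one, local_quorum };

constexpr bool is_datacenter_local(consistency_level cl) {
    return cl == consistency_level::local_one
        || cl == consistency_level::local_quorum;
}

// Returns true when a digest mismatch may escalate to cross DC repair.
// A mismatch on data written less than write_timeout ago is assumed to be
// a race with an in-flight write, so a LOCAL_* read repairs locally only.
constexpr bool allow_cross_dc_repair(consistency_level cl,
                                     std::chrono::milliseconds since_last_write,
                                     std::chrono::milliseconds write_timeout) {
    return !(is_datacenter_local(cl) && since_last_write < write_timeout);
}

// Usage, with a 2 s write timeout chosen only for illustration.
static_assert(!allow_cross_dc_repair(consistency_level::local_quorum,
                                     std::chrono::milliseconds(100),
                                     std::chrono::milliseconds(2000)),
              "recent write + LOCAL_* CL: local repair only");
static_assert(allow_cross_dc_repair(consistency_level::quorum,
                                    std::chrono::milliseconds(100),
                                    std::chrono::milliseconds(2000)),
              "non-local CL is unaffected");
```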


9427 Commits