scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 12:36:56 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	f15c380a4f	database: Compact mutations when executing data queries Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165.	2016-04-07 19:56:58 +02:00
Tomasz Grabiec	e4e8acc946	mutation_query: Extract main part of mutation_query() into more generic querying_reader So that it can be reused in query()	2016-04-07 19:03:04 +02:00
Takuya ASADA	ed7a3beed2	dist/ubuntu: drop unused scripts These were used when we didn't share scripts between CentOS/Fedora and Ubuntu, but are not used anymore, so drop them. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459408583-13497-1-git-send-email-syuu@scylladb.com>	2016-04-06 08:21:09 +03:00
Asias He	d5dce8016b	storage_service: Advertise supported_features into cluster Advertise features supported by this node, so that other nodes can know this info. For example, on a 3 node cluster with supported_features == FEATURE1 and FEATURE2, it looks like: cqlsh> SELECT supported_features from system.peers; supported_features -------------------- FEATURE1,FEATURE2 FEATURE1,FEATURE2 (2 rows) cqlsh> SELECT supported_features from system.local; supported_features -------------------- FEATURE1,FEATURE2 (1 rows)	2016-04-06 07:12:34 +08:00
Asias He	0e1738943d	storage_service: Add supported_features into system.peers table	2016-04-06 07:12:34 +08:00
Asias He	50bcfe569a	system_keyspace: Add supported_features into system.local table	2016-04-06 07:12:34 +08:00
Asias He	b710a5f9ee	storage_service: Introduce get_config_supported_features It tells features supported by this local node. When new feature is introduced in scylla, update features returned by get_config_supported_features, e.g., return sstring("FEATURE1,FEATURE2")	2016-04-06 07:12:34 +08:00
Asias He	e0a82a1107	gossip: Add supported_features helper in versioned_value Given a supported features sstring, return a versioned_value for it.	2016-04-06 07:12:34 +08:00
Asias He	214c0f72b2	db: Add supported_features column in system.local and system.peers table	2016-04-06 07:12:34 +08:00
Asias He	04e8727793	gossip: Introduce wait_for_feature_on_{all}_node API to wait until features are available on a node or on all the nodes in the cluster. $timeout specifies how long we want to wait. If the features are not available yet, sleep 2 seconds and retry.	2016-04-06 07:12:34 +08:00
Asias He	1e437e925c	gossip: Introduce get_supported_features - Get features supported by this particular node std::set<sstring> get_supported_features(inet_address endpoint) const; - Get features supported by all the nodes this node knows about std::set<sstring> get_supported_features() const;	2016-04-06 07:12:34 +08:00
Asias He	a6080773b3	gossip: Add SUPPORTED_FEATURES application_state It is used to negotiate cluster wide features.	2016-04-06 07:12:34 +08:00
Piotr Jastrzebski	d3f91eec61	Implement tuple_type_impl::from_string This is a fix for: https://github.com/scylladb/scylla/issues/574 It mirrors the behavior of: org.apache.cassandra.db.marshal.TupleType.java#fromString Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <24a7d6253727d0faebb1df117c2f52410523d42f.1459843091.git.piotr@scylladb.com>	2016-04-05 16:00:18 +03:00
Vlad Zolotarov	2daaa00c4f	conf: resurrect the important text related to endpoint_snitch configuration commit `d1b44cef1b` removed an important part of a comment related to an 'endpoint_snitch' configuration. This patch puts it back. Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1459858934-12005-1-git-send-email-vladz@cloudius-systems.com>	2016-04-05 15:23:13 +03:00
Raphael S. Carvalho	e15ce5eb4d	api: Add support to get column family compression ratio After this change, user can query compression ratio on a per column family basis with 'nodetool cfstats'. look at 'nodetool cfstats' output: ./bin/nodetool cfstats ks.test5 Keyspace: ks Read Count: 0 Read Latency: NaN ms. Write Count: 0 Write Latency: NaN ms. Pending Flushes: 0 Table: test5 SSTable count: 1 Space used (live): 4774 Space used (total): 4774 Space used by snapshots (total): 0 Off heap memory used (total): 131384 SSTable Compression Ratio: 0.833333 ... Fixes #636. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <a1bee5a23fe63787df3e387a88f2d216ba4a4134.1459802771.git.raphaelsc@scylladb.com>	2016-04-05 12:46:40 +03:00
Asias He	d1b44cef1b	conf: Drop duplicated section for endpoint_snitch endpoint_snitch is supported and it is listed under "Supported Parameters". Remove the duplicated section in "Unsupported parameters". Message-Id: <f8260b72558305f9186c011b8f8f452b3b91339b.1459325982.git.asias@scylladb.com>	2016-04-05 08:48:48 +03:00
Pekka Enberg	32471fcb96	Merge "Do batch log replay in decommission" from Asias "batchlog_manager is modified to allow the storage_service to initiate a batchlog replay operation. Refs #1085. Tested with tests/batchlog_manager_test and batch_test.py"	2016-04-05 08:42:47 +03:00
Gleb Natapov	70575699e4	commitlog, sstables: enlarge XFS extent allocation for large files With big rows I see contention in XFS allocations which cause reactor thread to sleep. Commitlog is a main offender, so enlarge extent to commitlog segment size for big files (commitlog and sstable Data files). Message-Id: <20160404110952.GP20957@scylladb.com>	2016-04-04 14:15:00 +03:00
Amnon Heiman	725231a7a0	api: set the api_doc before registering any api This is a leftover from the reordering of the API init. The api_doc should be set first, so that later API registrations enable their relevant swagger doc. Currently, the swagger documentation of the system API is not available. Fixes #1160 Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1459750490-15996-1-git-send-email-amnon@scylladb.com>	2016-04-04 11:37:59 +03:00
Avi Kivity	6a3cf4ac41	cql: unlock ALTER TABLE syntax It was marked experimental for 1.0, but will be fully supported in the next release. Message-Id: <1459707946-5860-1-git-send-email-avi@scylladb.com>	2016-04-04 11:36:11 +03:00
Piotr Jastrzebski	613e7d8618	Add more info to wrong RPC address error If listening on RPC address failed then report IP address and port in the error message. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <c4db3527df2ce6dccb3b619584ee3fcb1e70ffd1.1459512258.git.piotr@scylladb.com>	2016-04-03 12:57:19 +03:00
Takuya ASADA	cad5edc53b	dist: fix build error at copy symlinks Both build_rpm.sh and build_deb.sh will fail with "cannot stat 'xxx': No such file or directory" when scylla-server package is not installed, need to prevent it by --no-dereference option of cp. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459523585-9108-1-git-send-email-syuu@scylladb.com>	2016-04-03 12:49:55 +03:00
Tomasz Grabiec	0fc4c36952	tests: sstable_mutation_test: Compare keys not representations Representation is opaque at this level of abstraction. Reviewed-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1459508193-7086-1-git-send-email-tgrabiec@scylladb.com>	2016-04-03 11:39:03 +03:00
Nadav Har'El	6c4ee49bd3	sstables: another test for range tombstone merging This is another unit test for range tombstone merging, introduced in commit `0fc9a5ee4d` and rewritten in commit `99ecda3c96`. In this test, a single large deletion was broken up into several smaller ranges, all with the same timestamps, so we should recombine them into one row tombstone, instead of failing the read. The sstable in this test case was artificially created using json2sstable. We don't yet know how to produce such a case using Cassandra 2, but we have seen a similar occurrence in the wild, in a real SSTable. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1459429243-15821-1-git-send-email-nyh@scylladb.com>	2016-04-01 11:55:14 +02:00
Takuya ASADA	d59c1c7648	dist/redhat: drop very old %pre script These lines are needed for very old version of scylla, not for 1.0. Can be removed now. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459177601-20269-1-git-send-email-syuu@scylladb.com>	2016-04-01 09:41:18 +03:00
Pekka Enberg	b9a1aef670	Merge "Random exception safety fixes" from Paweł "These patches fix some of the problems found by randomly injecting memory allocation failures."	2016-04-01 08:58:00 +03:00
Paweł Dziepak	8f78b8e190	log: ignore logging exceptions Logging is used in many places including those that shouldn't really throw any exceptions (destructors, noexcept functions). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-31 16:43:32 +01:00
Paweł Dziepak	c8159eca52	commitlog: make sure that segment destructor doesn't throw Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-31 16:42:56 +01:00
Paweł Dziepak	3e0555809e	storage_proxy: catch all exceptions in read executor abstract_read_executor::reconcile() is supposed to make sure that _result_promise is eventually set to either a result or an exception. That may not happen, however, if reconciliation throws any exception, since only read timeouts are being caught. When that happens the continuation chain becomes stuck. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-31 16:38:41 +01:00
Paweł Dziepak	3c107c4b05	sstables: remove HyperLogLog throw() specifier HyperLogLog constructor promises that it only throws instances of std::invalid_argument. That's a lie since it also adds elements to a vector (and doesn't catch potential bad_allocs). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-31 16:36:53 +01:00
Avi Kivity	417bcb122d	commitlog: ignore commitlog segments generated by Cassandra-derived tools Cassandra-derived tools (such as sstable2json) may write commitlog segments, that Scylla cannot recognize. Since we now write them with a distinct name, we can recognize the name and ignore these segments, as we know the data they contain is not interesting. Fixes #1112. Message-Id: <1459356904-20699-1-git-send-email-avi@scylladb.com>	2016-03-31 16:01:08 +03:00
Nadav Har'El	78c9f49585	sstables: Move check_marker() to source file The check_marker() function is used as a sanity-check of data we read from sstable, so instead of the header file key.hh, let's move it to the sstable-parsing source file partition.cc. In addition to having less code in header files, another benefit is that the function can now throw a more specific exception (malformed sstable exception). Also fixed the exception's message (which had a second "%d" but only one parameter). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1459420430-5968-1-git-send-email-nyh@scylladb.com>	2016-03-31 14:22:51 +03:00
Nadav Har'El	99ecda3c96	sstables: overhaul range tombstone reading Until recently, we believed that range tombstones we read from sstables will always be for entire rows (or more generalized clustering-key prefixes), not for arbitrary ranges. But as we found out, because Cassandra insists that range tombstones do not overlap, it may take two overlapping row tombstones and convert them into three range tombstones which look like general ranges (see the patch for a more detailed example). Not only do we need to accept such "split" range tombstones, we also need to convert them back to our internal representation which, in the above example, involves two overlapping tombstones. This is what this patch does. This patch also contains a test for this case: We created in Cassandra an sstable with two overlapping deletions, and verify that when we read it to Scylla, we get these two overlapping deletions - despite the sstable file actually having contained three non-overlapping tombstones. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <b7c07466074bf0db6457323af8622bb5210bb86a.1459399004.git.glauber@scylladb.com>	2016-03-31 12:49:50 +03:00
Pekka Enberg	2629389d5d	dist/docker/ubuntu: Use bash in start-scylla script The default shell in Ubuntu is "dash" which causes the following error when the "start-scylla" script is executed: /start-scylla: 8: /start-scylla: source: not found Message-Id: <1459406561-20141-1-git-send-email-penberg@scylladb.com>	2016-03-31 11:21:36 +03:00
Duarte Nunes	26a3461908	cql: Fix antlr3 missing token leak This patch overrides the antlr3 function that allocates the missing tokens that would eventually leak. The override stores these tokens in a vector, ensuring memory is freed whenever the parser is destroyed. Fixes #1147 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1459355146-17402-1-git-send-email-duarte@scylladb.com>	2016-03-31 08:44:45 +03:00
yan cui	6fc29843cd	dist/docker: refine docker file for ubuntu	2016-03-30 18:54:14 +03:00
Duarte Nunes	f7a12adb6f	cql3: Disable pg-style string format test antlr3 leaks the token it creates itself when recovering from a mismatch, in the case where the missing token can be determined. Until this bug is fixed or circumvented, the test should remain disabled. Ref #1147 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1459345403-8243-1-git-send-email-duarte@scylladb.com>	2016-03-30 16:44:47 +03:00
Asias He	bc1889b7ab	storage_service: Shutdown batchlog_manager after decommission On the node which was decommissioned, I saw 2016-03-29 09:35:52,097 [shard 0] storage_service - DECOMMISSIONED: 2016-03-29 09:35:52,097 [shard 0] storage_service - DECOMMISSIONING: done 2016-03-29 09:36:28,814 [shard 0] batchlog_manager - Batchlog replay on shard 0: starts 2016-03-29 09:36:28,814 [shard 0] batchlog_manager - Batchlog replay on shard 0: done 2016-03-29 09:37:28,819 [shard 0] batchlog_manager - Batchlog replay on shard 1: starts 2016-03-29 09:37:28,820 [shard 0] batchlog_manager - Batchlog replay on shard 1: done 2016-03-29 09:38:28,830 [shard 0] batchlog_manager - Batchlog replay on shard 0: starts 2016-03-29 09:38:28,830 [shard 0] batchlog_manager - Batchlog replay on shard 0: done 2016-03-29 09:39:28,844 [shard 0] batchlog_manager - Batchlog replay on shard 1: starts 2016-03-29 09:39:28,844 [shard 0] batchlog_manager - Batchlog replay on shard 1: done We should stop the batchlog_manager to avoid initiating any further batchlog replay operations.	2016-03-30 20:54:30 +08:00
Asias He	5d1140b1eb	storage_service: Do batch log replay in decommission Replay the batch log during decommission. Kill one FIXME. Refs #1085	2016-03-30 20:54:30 +08:00
Asias He	5550aeba1d	batchlog_manager: Avoid stopping batchlog_manager more than once We can stop batchlog_manager in decommission and drain. Avoid stopping it more than once. Fix the following error: $ nodetool decommission $ nodetool drain storage_service - DECOMMISSIONING: stop_gossiping done storage_service - messaging_service stopped storage_service - DECOMMISSIONING: stop messaging_service done storage_service - DECOMMISSIONING: set_bootstrap_state done storage_service - DECOMMISSIONED: storage_service - DECOMMISSIONING: done storage_service - DRAINING: starting drain process gossip - gossip is already stopped scylla: ./seastar/core/gate.hh:93: future<> seastar::gate::close(): Assertion `!_stopped && "seastar::gate::close() cannot be called more than once"' failed.	2016-03-30 20:54:30 +08:00
Asias He	cdb43c5586	batchlog_manager: Allow user initiated batchlog replay operation During decommission, the storage_service::unbootstrap() needs to initiate a batchlog replay operation. To sync the replay operation initiated by the timer in batchlog_manager and storage_service, a semaphore is introduced. To simplify the semaphore locking, the management code now always runs on shard zero, but the real work is distributed to all shards.	2016-03-30 20:54:30 +08:00
Nadav Har'El	0fc9a5ee4d	sstables: merge range tombstones if possible This is a rewrite of Glauber's earlier patch to do the same thing, taking into account Avi's comments (do not use a class, do not throw from the constructor, etc.). I also verified that the actual use case which was broken in #1136 was fixed by this patch. Currently, we have no support for range tombstones because CQL will not generate them as of version 2.x. Thrift will, but we can safely leave this for the future. However, we have seen cases during a real migration in which a pure-CQL Cassandra would generate range tombstones in its SSTables. Although we are not sure how and why, those range tombstones were of a special kind: their end and next's start range were adjacent, which means that in reality, they could very well have been written as a single range tombstone for an entire clustering key - which we support just fine. This code will attempt to fix this problem temporarily by merging such ranges if possible. Care must be taken so that we don't end up accepting a true generic range tombstone by accident. Fixes #1136 Signed-off-by: Glauber Costa <glauber@scylladb.com> Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <1459333972-20345-1-git-send-email-nyh@scylladb.com>	2016-03-30 13:40:10 +03:00
Calle Wilund	0f5ca342b8	lists.cc: setter_by_uuid does not require read before execute Fixes #1082 Setting by UUID does not need existing data in list, so need no read before execute Message-Id: <1459325931-16387-1-git-send-email-calle@scylladb.com>	2016-03-30 11:24:20 +03:00
Takuya ASADA	73fa36b416	dist/common/scripts: update SET_NIC when --setup-nic passed to scylla_sysconfig_setup scylla_sysconfig_setup mistakenly ignores --setup-nic argument. Fixes #1132 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459285500-22185-1-git-send-email-syuu@scylladb.com>	2016-03-30 11:07:33 +03:00
Takuya ASADA	58fb7000b1	dist: add setup scripts symlink to /usr/sbin Instead of moving script to /usr/sbin, create symlink from /usr/lib/scylla/scylla_*setup to /usr/sbin/ Fixes #1092 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459324684-31364-1-git-send-email-syuu@scylladb.com>	2016-03-30 11:04:41 +03:00
Glauber Costa	23808ba184	sstables: fix exception printouts in check_marker As Nadav noticed in his bug report, check_marker is creating its error messages using characters instead of numbers - which is what we intended here in the first place. That happens because sprint(), when faced with an 8-bit type, interprets this as a character. To avoid that we'll use uint16_t types, taking care not to sign-extend them. The bug also noted that one of the error messages is missing a parameter, and that is also fixed. Fixes #1122 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <74f825bbff8488ffeb1911e626db51eed88629b1.1459266115.git.glauber@scylladb.com>	2016-03-29 19:23:28 +03:00
Takuya ASADA	c1277bacb4	dist/common/scripts: prevent misinterpreting blank input as '/dev/', show error when the input device path is not found Fixes #1110 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459267786-19123-1-git-send-email-syuu@scylladb.com>	2016-03-29 19:18:51 +03:00
Glauber Costa	d5c1366e85	compaction: be verbose about which table is causing an exception When we, for some reason, fail to compact an SSTable, we do not log the file name, leaving us with cryptic messages that tell us what happened, but not where it happened. This patch adds logging in compaction so that we'll know what's going on. Please note that readers are more of a concern, because the SSTable being written technically does not exist yet. Still, better safe than sorry: if open_data fails, or we leave an unfinished SSTable, it is still good to know which one was the culprit. Some argument can be made about whether we should log this at the lower SSTable level, or at the compaction level. The reason I am logging this at the compaction level is that we don't really know which exception will trigger, and where: it may be the case that we're seeing exceptions that are not SSTable specific, and may not have the chance to log it properly. In particular, if the exception happens inside the reader: read_rows() and friends only return a mutation reader, which doesn't really do anything until we call read(). But at that time, we don't hold any pointers to the SSTable anymore. In summary, logging at the compaction level guarantees that we always do it no matter what. Exceptions that are part of the main SSTable path can log the file name as well if they want: if that's the case, we'll be left with the name appearing twice. That's totally harmless, and better than none. Fixes #1123 Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <c5c969fb6aeb788a037bd7a4ea69979c1042cb34.1459263847.git.glauber@scylladb.com>	2016-03-29 18:15:56 +03:00
Glauber Costa	d536846433	commitlog: initialize sync period with actual sync period commitlog's sync period is initialized as the batch period, and not as the sync period itself as it should be. I've found this by code inspection, but unless I am missing something really fundamental, this seems to be completely wrong. It's been working fine because in our defaults, I have checked that both variables default to the same value. But it seems to me that as long as anyone would change one of them, the behavior wouldn't be as expected. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <2e7c565242fe5d4481a3ee8b0ba425ef14f5e42a.1459252783.git.glauber@scylladb.com>	2016-03-29 15:21:02 +03:00
Takuya ASADA	a5bb6c4b1b	dist/ubuntu: drop classical sysv init script, only support Upstart for Ubuntu 14.04LTS The sysv init script was added just to prevent a warning message from lintian, and was never really used by Ubuntu users. As a result, we often break this script, since the upstart/systemd unit files change frequently. It may confuse users, so it's better to use Upstart only, just like Fedora/CentOS. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1459177601-20269-2-git-send-email-syuu@scylladb.com>	2016-03-29 11:48:18 +03:00
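
The gossip commits in this log (a6080773b3, 1e437e925c, 04e8727793) describe each node advertising a comma-separated feature string and a wait that sleeps 2 seconds and retries until the features are available or a timeout passes. The following is a minimal Python sketch of that logic, not ScyllaDB's C++ implementation. The function names, the dictionary of advertised strings, and the reading of "supported by all the nodes" as a set intersection are assumptions made for illustration.

```python
import time


def parse_features(value):
    """Split a comma-separated feature string such as "FEATURE1,FEATURE2"."""
    return {name for name in value.split(",") if name}


def cluster_features(advertised):
    """Features common to every known node (assumed: a set intersection).

    `advertised` is a hypothetical mapping of node address to the
    feature string that node advertises.
    """
    sets = [parse_features(v) for v in advertised.values()]
    return set.intersection(*sets) if sets else set()


def wait_for_features(get_advertised, wanted, timeout,
                      retry_interval=2.0, sleep=time.sleep,
                      clock=time.monotonic):
    """Poll until every feature in `wanted` is supported cluster-wide.

    Returns True on success, or False once `timeout` seconds have passed.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if set(wanted) <= cluster_features(get_advertised()):
            return True
        if clock() >= deadline:
            return False
        # Features not available yet: sleep and retry, as the commit describes.
        sleep(retry_interval)
```

With two peers advertising "FEATURE1,FEATURE2" and "FEATURE1", `cluster_features` in this sketch yields only FEATURE1, so a wait on FEATURE2 would keep retrying until the timeout.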

11716 Commits