scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 04:56:58 +00:00

Author	SHA1	Message	Date
Asias He	32eaaecf36	gossip: Get rid of assert Log the error and throw the exception, instead of aborting the whole process. Make the code more robust.	2016-02-25 21:19:52 +08:00
Asias He	699fd25467	storage_service: Get rid of assert We can recover from most of the errors. Log the error and throw the exception, instead of aborting the whole process. Make the code more robust.	2016-02-25 21:19:52 +08:00
Asias He	59564591d5	storage_service: Use get_gossip_status to get status The helper was introduced recently; use it instead of open-coding it.	2016-02-25 21:19:52 +08:00
Pekka Enberg	8e2c924de3	cql3: Fix quadratic behavior in update_statement::parsed_insert::prepare_internal() This fixes a quadratic search for duplicate columns in prepare_internal(). Refs #822. Message-Id: <1456405104-16482-1-git-send-email-penberg@scylladb.com>	2016-02-25 15:06:56 +02:00
Yoav Kleinberger	872079d999	tools/scyllatop: correct mistake in help text Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <01844d90f2d942a051d128b03ae12578ac0bb69c.1456324697.git.yoav@scylladb.com>	2016-02-25 12:49:48 +02:00
Asias He	94cb7f22d4	gossip: Make add_local_application_state safe to call on any cpu add_local_application_state is used in various places. Before this patch, it can only be called on cpu zero. To make it safer to use, use invoke_on() to forward the code to run on cpu zero, so that callers can call it on any cpu. Refs: #795 Message-Id: <d69b81c5561622078dbe887d87209c4ea2e3bf46.1456315043.git.asias@scylladb.com>	2016-02-25 12:45:54 +02:00
Asias He	4e931c2453	gossip: Log the error when adding local application state fails Gleb saw once: scylla: gms/gossiper.cc:1393: gms::gossiper::add_local_application_state(gms::application_state, gms::versioned_value):: mutable: Assertion `endpoint_state_map.count(ep_addr)' failed. The assert fires when the node cannot find its own entry in endpoint_state_map. I cannot find any place where we could call add_local_application_state before gossiper::start_gossiping(), which inserts the broadcast address into endpoint_state_map. I cannot reproduce the issue, so let's log the error to narrow down which application state triggered the assert. Refs: #795 Message-Id: <f4433be0a0d4f23470a5e24e528afdb67b74c7ef.1456315043.git.asias@scylladb.com>	2016-02-25 12:45:17 +02:00
Takuya ASADA	b250a3b116	dist: Add collectd configuration support on .rpm/.deb Depends on collectd, add /etc/collectd.d/scylla.conf on scylla-server package installation. Fixes #946 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1456336200-11876-1-git-send-email-syuu@scylladb.com>	2016-02-25 10:35:47 +02:00
Takuya ASADA	28dd202613	scyllatop: add --logfile argument to specify path to log file Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1456333116-7389-2-git-send-email-syuu@scylladb.com>	2016-02-25 10:33:41 +02:00
Takuya ASADA	af3a8ead21	scyllatop: output error message both on log file and stdout Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1456333116-7389-1-git-send-email-syuu@scylladb.com>	2016-02-25 10:33:40 +02:00
Calle Wilund	9586793c70	database: Fix use and assumptions about pending compactions Fixes #934 - faulty assert in discard_sstables run_with_compaction_disabled clears out a CF from compaction manager queue. discard_sstables wants to assert on this, but looks at the wrong counters. pending_compactions is an indicator on how much interested parties want a CF compacted (again and again). It should not be considered an indicator of compactions actually being done. This modifies the usage slightly so that: 1.) The counter is always incremented, even if compaction is disallowed. The counter's value on end of run_with_compaction_disabled is then instead used as an indicator as to whether a compaction should be re-triggered. (If compactions finished, it will be zero) 2.) Document the use and purpose of the pending counter, and add method to re-add CF to compaction for r_w_c_d above. 3.) discard_sstables now asserts on the right things. Message-Id: <1456332824-23349-1-git-send-email-calle@scylladb.com>	2016-02-25 08:57:04 +02:00
Calle Wilund	590ec1674b	truncate: Require timestamp join-function to ensure equal values Fixes #937 In fixing #884, truncation not truncating memtables properly, time stamping in truncate was made shard-local. This however breaks the snapshot logic, since for all shards in a truncate, the sstables should snapshot to the same location. This patch adds a required function argument to truncate (and by extension drop_column_family) that produces a time stamp in a "join" fashion (i.e. same on all shards), and utilizes the joinpoint type in caller to do so. Message-Id: <1456332856-23395-2-git-send-email-calle@scylladb.com>	2016-02-24 18:59:31 +02:00
Calle Wilund	43ea1f5945	utils::joinpoint: Helper type to generate a singular value for all shards Lets operations working on all shards "join" and acquire the same value of something, with that value being based on whenever all shards reach the join. Obvious use case: time stamp after one set of per-shard ops, but before final ones. The generation of the value is guaranteed to happen on the shards that created the join point. Based on the join-ops in CF::snapshot, but abstracted and made caller responsibility. Primary use case is to help deal with the join-problem of truncation. Message-Id: <1456332856-23395-1-git-send-email-calle@scylladb.com>	2016-02-24 18:59:25 +02:00
Yoav Kleinberger	c3ce9e53cb	tools/scyllatop: support glob patterns to specify metrics Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <42f84cdeeb75c3719230028a13a1dd8499673d4c.1456319441.git.yoav@scylladb.com>	2016-02-24 15:35:45 +02:00
Raphael S. Carvalho	bb48f1b06c	sstables: use system clock's epoch for timestamp in compaction history As pointed out by Tomek, the type of column used is timestamp, therefore system's clock epoch (db_clock) should be used instead. Fixes #817. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <f80f9f411d673cf2d653e193ccb8ebaa36bc891b.1456317766.git.raphaelsc@scylladb.com>	2016-02-24 14:49:21 +02:00
Pekka Enberg	dfcc48d82a	transport: Add result metadata to PREPARED message The gocql driver assumes that there's a result metadata section in the PREPARED message. Technically, Scylla is not at fault here as the CQL specification explicitly states in Section 4.2.5.4. ("Prepared") that the section may be empty: - <result_metadata> is defined exactly as <metadata> but correspond to the metadata for the resultSet that execute this query will yield. Note that <result_metadata> may be empty (have the No_metadata flag and 0 columns, See section 4.2.5.2) and will be for any query that is not a Select. There is in fact never a guarantee that this will non-empty so client should protect themselves accordingly. The presence of this information is an However, Cassandra always populates the section so lets do that as well. Fixes #912. Message-Id: <1456317082-31688-1-git-send-email-penberg@scylladb.com>	2016-02-24 14:43:24 +02:00
Avi Kivity	fedba9d6cd	Merge "reduce gossip round latency" from Asias "This series makes gossip message handling to be async to reduce gossip round latency. Commit log of patch 3 explains the issue in detail. Refs: #900"	2016-02-24 13:44:06 +02:00
Avi Kivity	b42a32efc7	Update scylla-ami submodule * dist/ami/files/scylla-ami 398b1aa...d4a0e18 (3): > Sort service running order (scylla-ami-setup.service -> scylla-io-setup.service -> scylla-server.service) > Drop --ami and --disk-count parameters > dist: pass the number of disks to set io params	2016-02-24 13:38:05 +02:00
Avi Kivity	cda29c0324	Merge seastar upstream * seastar 8c560f2...769cb8b (4): > temporary_buffer: make operator bool explicit (and const) > iotune: use SEASTAR_IO instead of SCYLLA_IO > iotune: add --format option, to use EnvironmentFile on systemd > sstring: add data() methods	2016-02-24 13:38:05 +02:00
Avi Kivity	efabb1a1d8	commitlog: fix buffer size calculation We were adding bool(buffer), instead of buffer.size(); exposed by making temporary_buffer::operator bool explicit.	2016-02-24 13:38:05 +02:00
Asias He	697b16414a	gossip: Make gossip message handling async In each gossip round, i.e., gossiper::run(), we do: 1) send syn message 2) peer node: receive syn message, send back ack message 3) process ack message in handle_ack_msg apply_state_locally mark_alive send_gossip_echo handle_major_state_change on_restart mark_alive send_gossip_echo mark_dead on_dead on_join apply_new_states do_on_change_notifications on_change 4) send back ack2 message 5) peer node: process ack2 message apply_state_locally At the moment, syn is "wait" message, it times out in 3 seconds. In step 3, all the registered gossip callbacks are called which might take significant amount of time to complete. In order to reduce the gossip round latency, we make syn "no-wait" and do not run the handle_ack_msg inside the gossip::run(). As a result, we will not get an ack message as the return value of a syn message any more, so a GOSSIP_DIGEST_ACK message verb is introduced. With this patch, the gossip message exchange is now async. It is useful when some nodes are down in the cluster. We will not delay the gossip round, which is supposed to run every second, 3*n seconds (n = 1-3, since it talks to 1-3 peer nodes in each gossip round) or even longer (considering the time to run gossip callbacks). Later, we can make talking to the 1-3 peer nodes in parallel to reduce latency even more. Refs: #900	2016-02-24 19:33:39 +08:00
Asias He	63df54b368	messaging_service: Add GOSSIP_DIGEST_ACK We will soon switch to use no-wait message for gossip. GOSSIP_DIGEST_SYN will no longer return GOSSIP_DIGEST_ACK message. So we need a standalone verb for GOSSIP_DIGEST_ACK.	2016-02-24 19:31:14 +08:00
Asias He	022c7e50a1	failure_detector: Fix false alarm of "Not marking nodes down due to local pause of" The problem is we initialize _last_interpret when failure_detector object is constructed. When interpret() runs for the first time, the _last_interpret value is not the last time we run interpret() but the time we initialize failure_detector object. Fix by initializing _last_interpret inside interpret(). [Thu Feb 18 02:40:04 2016] INFO [shard 0] storage_service - Node 127.0.0.1 state jump to normal [Thu Feb 18 02:40:04 2016] INFO [shard 0] storage_service - NORMAL: node is now in normal status [Thu Feb 18 02:40:04 2016] INFO [shard 0] gossip - Waiting for gossip to settle before accepting client requests... [Thu Feb 18 02:40:12 2016] INFO [shard 0] gossip - No gossip backlog; proceeding Starting listening for CQL clients on 127.0.0.1:9042... [Thu Feb 18 02:40:12 2016] INFO [shard 0] gossip - Node 127.0.0.2 is now part of the cluster [Thu Feb 18 02:40:12 2016] INFO [shard 0] gossip - InetAddress 127.0.0.2 is now UP [Thu Feb 18 02:40:13 2016] INFO [shard 0] gossip - do_gossip_to_live_member: Favor newly added node 127.0.0.2 [Thu Feb 18 02:40:13 2016] WARN [shard 0] failure_detector - Not marking nodes down due to local pause of 9091 > 5000 (milliseconds)	2016-02-24 19:31:14 +08:00
Avi Kivity	e993102cb5	Merge "introduce scylla-io-setup.service" from Takuya "Add scylla-io-setup.service to configure max-io-requests and num-io-queues on first boot. Moved SCYLLA_IO configuration code from scylla_sysconfig_setup to scylla-io-setup.service, revert commits related it. On scylla-io-setup.service, autodetect Amazon EC2 instead of using AMI variable on sysconfig."	2016-02-24 10:13:23 +02:00
Takuya ASADA	c4035a0a13	dist: add comment about /etc/scylla.d/io.conf on sysconfig	2016-02-24 04:00:52 +09:00
Takuya ASADA	0f20abb365	Revert "dist: introduce SCYLLA_IO" This reverts commit `5cae2560a3`. Conflicts: dist/common/sysconfig/scylla-server	2016-02-24 03:46:14 +09:00
Takuya ASADA	b79a1a77da	Revert "dist: update SCYLLA_IO with params for AMI" This reverts commit `5494135ddd`. Conflicts: dist/common/scripts/scylla_sysconfig_setup	2016-02-24 03:45:11 +09:00
Takuya ASADA	643beefc8c	Revert "Revert "dist: remove AMI entry from sysconfig, since there is no script refering it"" This reverts commit `21e6720988`.	2016-02-24 03:33:50 +09:00
Takuya ASADA	66c5feb9e9	Revert "dist: align ami option with others (-a --> --ami)" This reverts commit `312f1c9d98`.	2016-02-24 03:33:41 +09:00
Takuya ASADA	a9926f1cea	dist: introduce scylla-io-setup.service to setup io parameters on first startup Signed-off-by: Takuya ASADA <syuu@scylladb.com>	2016-02-24 03:33:03 +09:00
Tomasz Grabiec	79bcb5a616	tests: Fix build of memory_footprint	2016-02-23 19:12:54 +01:00
Amnon Heiman	f461ebc411	idl-compiler: Add pos and rollback to serialize vector This adds the ability to store a position of a serialized vector and to rollback to that stored position afterwards. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1456041750-1505-3-git-send-email-amnon@scylladb.com>	2016-02-23 17:49:51 +01:00
Amnon Heiman	ea97e07ed7	serialization_visitors: Adding vector_position struct While serializing a vector, it is sometimes required to rollback some of the serialized elements. vector_position is the equivalent to the bytes_ostream position struct. It holds information about the current position in a serialized vector, the position in the buffer and the current number of elements serialized. It will allow to rollback to the current point. Signed-off-by: Amnon Heiman <amnon@scylladb.com> Message-Id: <1456041750-1505-2-git-send-email-amnon@scylladb.com>	2016-02-23 17:49:51 +01:00
Tomasz Grabiec	f72fd9eefd	Merge branch 'pdziepak/canonical-mutation-idl/v1' from seastar-dev.git	2016-02-23 17:02:43 +01:00
Tomasz Grabiec	995b638d96	mutation_partition_visitor: Fix crash for large blobs Fixes #927. The new visiting code builds cell instances using atomic_cell::make_*() factory methods, which won't work in LSA context because they depend on managed_bytes storage to be linearized. It may not be since large blob support. This worked before because we created cells from views before which works in all contexts. Fix by constructing them in standard allocator context. Message-Id: <1456234064-13608-2-git-send-email-tgrabiec@scylladb.com>	2016-02-23 16:41:39 +02:00
Tomasz Grabiec	33cf65c2aa	mutation_partition_view: Fix use-after-move on visitor instance The line: boost::apply_visitor(atomic_cell_or_collection_visitor(std::move(visitor), id, col), cell); is executed in a loop, so visitor could be used after being moved-from. This may not always be allowed for some visitors. Also, visitors may keep state, which should be preserved for the whole visitation. This doesn't fix any issue right now. Message-Id: <1456234064-13608-1-git-send-email-tgrabiec@scylladb.com>	2016-02-23 16:41:39 +02:00
Yoav Kleinberger	f822359d96	bugfix: fixed broken --print-config option Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <57b452106cdcd9ceb09da4c63781650cefe48040.1456234464.git.yoav@scylladb.com>	2016-02-23 15:35:44 +02:00
Asias He	f7fccc6efb	locator: Fix get token from a range<token> With a range{t1, t2}, if t2 == {}, the range.end() will contain no value. Fix getting t2 in this case. Fixes #911. Message-Id: <4462e499d706d275c03b116c4645e8aaee7821e1.1456128310.git.asias@scylladb.com>	2016-02-23 14:29:26 +01:00
Pekka Enberg	4a4074ad21	tools/scyllatop: Sort metrics by name This makes the output much easier to read, especially if you have tons of metrics specified. Message-Id: <1456230377-3149-1-git-send-email-penberg@scylladb.com>	2016-02-23 14:35:57 +02:00
Takuya ASADA	0f87922aa6	main: notify service start completion earlier, to reduce systemd unit startup time Fixes #910 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1455830245-11782-1-git-send-email-syuu@scylladb.com>	2016-02-23 14:33:16 +02:00
Pekka Enberg	1f6cac8839	tools/scyllatop: Use 'erase' to clear the screen The 'clear' function explicitly clears the screen and repaints it which causes really annoying flicker. Use 'erase' to make scyllatop more pleasant on the eyes. Message-Id: <1456229348-2194-1-git-send-email-penberg@scylladb.com>	2016-02-23 14:12:48 +02:00
Tomasz Grabiec	2b5253927f	test.py: Print output on timeout as well It is often the case that there is useful debugging information printed by the test before it hangs. It is annoying to see just "TIMED OUT" in jenkins. Print the output always when it is available. In addition to that, we should not interpret all exceptions thrown from communicate() as timeouts. For example, currently ^C sent to the script misleadingly results in "TIMED OUT" to be printed. Message-Id: <1456174992-21909-1-git-send-email-tgrabiec@scylladb.com>	2016-02-23 13:41:11 +02:00
Pekka Enberg	78c6fdf429	cql3/functions: Fix is_pure() for native scalar functions Every native scalar function is already tagged whether they're pure or not but because we don't implement the is_pure() function, all functions end up being advertised as pure. This means that functions like now() that are not pure, end up being evaluated only once. Fixes #571. Message-Id: <1456227171-461-1-git-send-email-penberg@scylladb.com>	2016-02-23 12:37:32 +01:00
Yoav Kleinberger	74fbc62129	ScyllaTop: top-like tool to see live scylla metrics requires a local collectd configured with the unix-sock plugin, use the --help option for more. Run it with: $ scyllatop.py --help Signed-off-by: Yoav Kleinberger <yoav@scylladb.com> Message-Id: <bd3f8c7e120996fc464f41f60130c82e3fb55ac6.1456164703.git.yoav@scylladb.com>	2016-02-23 12:32:47 +02:00
Avi Kivity	8ba474f1c9	Merge "Drop empty partitions from mutation query results" from Tomasz	2016-02-23 11:18:47 +02:00
Tomasz Grabiec	c591157755	tests: mutation_query: Add test for dropping partitions with expired tombstones	2016-02-22 20:23:29 +01:00
Tomasz Grabiec	41d475d9c0	schema_builder: Fluentize property setters	2016-02-22 20:23:29 +01:00
Tomasz Grabiec	6fdaf110d6	mutation_query: Don't include empty partitions In some cases we may have a lot of empty partitions whose tombstones expired, and there is no point in including them in the results. This was found to cause performance issues for workloads using batch updates. system.batchlog table would accumulate a lot of deletes over time. It has gc_grace_seconds set to 0 so most of the tombstones would be expired. mutation queries done by batchlog manager were still returning all partitions present in memtables which caused mutation queries result to be inflated. This in turn was causing mutation_result_merger to take a long time to process them.	2016-02-22 20:21:23 +01:00
Pekka Enberg	4ff1692248	cql3: Make 'CREATE TYPE' error message human readable We don't support the 'CREATE TYPE' statement for now. The user-visible error message, however, is unreadable because our CQL parser doesn't even recognize the statement. cqlsh:ks1> CREATE TYPE config (url text); SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] message=" : cannot match to any predicted input... Implement just enough of 'CREATE TYPE' parsing to be able to report a human readable error message if someone tries to execute such statements: cqlsh:ks1> CREATE TYPE config (url text); ServerError: <ErrorMessage code=0000 [Server error] message="User-defined types are not supported yet"> Message-Id: <1456148719-9473-2-git-send-email-penberg@scylladb.com>	2016-02-22 14:50:25 +01:00
Pekka Enberg	d1bbd0271a	cql3: Return const reference from ut_name::get_keyspace() There's no need to copy the string but it does make it more difficult to use get_keyspace() from other places that already return a const reference. Signed-off-by: Pekka Enberg <penberg@scylladb.com> Message-Id: <1456148719-9473-1-git-send-email-penberg@scylladb.com>	2016-02-22 14:50:25 +01:00
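
The failure_detector fix in commit 022c7e50a1 can be illustrated with a minimal sketch. This is not Scylla's code: the class name `PauseDetector`, the `now_ms` parameter, and the boolean return value are all hypothetical, and only the 5000 ms threshold comes from the log excerpt in that commit message. It shows the idea of seeding the last-run timestamp on the first call to `interpret()`, rather than at construction, so that the startup delay is not mistaken for a local pause.

```python
class PauseDetector:
    """Hypothetical stand-in for the local-pause check; not Scylla's actual class."""

    def __init__(self, max_local_pause_ms=5000):
        self.max_local_pause_ms = max_local_pause_ms
        # Deliberately unset: construction time is not a previous interpret() time.
        self._last_interpret = None

    def interpret(self, now_ms):
        """Return True if the check ran, False if it was skipped due to a local pause."""
        if self._last_interpret is None:
            # First call: seed the timestamp here so the elapsed time is zero.
            self._last_interpret = now_ms
        elapsed = now_ms - self._last_interpret
        self._last_interpret = now_ms
        # A long gap between two interpret() calls means this node itself stalled,
        # so peers should not be marked down on the basis of stale timing.
        return elapsed <= self.max_local_pause_ms
```

With this shape, a first call made 9091 ms after construction is not treated as a pause, while a later gap of more than 5000 ms between two calls still is.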

8650 Commits