scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 04:56:58 +00:00

Author	SHA1	Message	Date
Takuya ASADA	f227b3faac	dist: On AMI, mark root disk with delete_on_termination Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1454513308-12384-1-git-send-email-syuu@scylladb.com>	2016-02-11 12:19:28 +02:00
Takuya ASADA	33309f667e	dist: enable enhanced networking on AMI Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1454971289-21369-1-git-send-email-syuu@scylladb.com>	2016-02-11 12:18:48 +02:00
Raphael S. Carvalho	ed61fe5831	sstables: make compaction stop report user-friendly When scylla stopped an ongoing compaction, the event was reported as an error. This patch introduces a specialized exception for compaction stop so that the event can be handled appropriately. Before: ERROR [shard 0] compaction_manager - compaction failed: read exception: std::runtime_error (Compaction for keyspace1/standard1 was deliberately stopped.) After: INFO [shard 0] compaction_manager - compaction info: Compaction for keyspace1/standard1 was stopped due to shutdown. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <1f85d4e5c24d23a1b4e7e0370a2cffc97cbc6d44.1455034236.git.raphaelsc@scylladb.com>	2016-02-11 12:16:53 +02:00
Takuya ASADA	8d8130f9c9	dist: fix typo on build_ami.sh We should always run scylla_setup, not just for locally built rpm Fixes #897 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1455103519-13780-1-git-send-email-syuu@scylladb.com>	2016-02-11 11:56:11 +02:00
Shlomi Livne	64f8d5a50e	dist: update packer location Signed-off-by: Shlomi Livne <shlomi@scylladb.com> Message-Id: <3c33ea073f702e00b789930fce9befef03ad9e88.1455178900.git.shlomi@scylladb.com>	2016-02-11 11:52:56 +02:00
Avi Kivity	bfbf89ee31	Merge "Serialize keys in a form independent of in-memory representation" from Tomasz "This series changes the on-wire definitions of keys to be of the following form: class partition_key { std::vector<bytes> exploded(); }; Keys are therefore collections of components. The components are serialized according to the format specified in the CQL binary protocol. No bit depends now on how we store keys in memory. Constructing keys from components currently requires a schema reference, which makes it not possible to deserialize or serialize the keys automatically by RPC. To avoid those complications, compound_type was changed so that it can be constructed and components can be iterated over without schema. Because of this, partition_key size increased by 2 bytes."	2016-02-10 17:54:42 +02:00
Tomasz Grabiec	b74301302c	tests: Add test for key serialization	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	3e2c1840d8	idl: Make key definitions independent of in-memory representation	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	428fce3828	compound: Optimize serialize_single()	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	0cc2832a76	keys: Allow constructing from a range	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	3ffcb998fb	keys: Enable serialization from a range not just a vector	2016-02-10 14:35:14 +01:00
Tomasz Grabiec	095efd01d6	keys: Make from_exploded() and components() work without schema For simplicity, we want to have keys serializable and deserializable without schema for now. We will serialize keys in a generic form of a vector of components where the format of components is specified by CQL binary protocol. So conversion between keys and vector of components needs to be possible to do without schema. We may want to make keys schema-dependent back in the future to apply space optimizations specific to column types. Existing code should still pass schema& to construct and access the key when possible. One optimization had to be reverted in this change - avoidance of storing key length (2 bytes) for single-component partition keys. One consequence of this, in addition to a bit larger keys, is that we can no longer avoid copy when constructing single-component partition keys from a ready "bytes" object. I haven't noticed any significant performance difference in: tests/perf/perf_simple_query -c1 --write It does ~130K tps on my machine.	2016-02-10 14:35:13 +01:00
Tomasz Grabiec	31312722d1	compound: Reduce duplication	2016-02-10 14:35:13 +01:00
Tomasz Grabiec	085d148d6f	compound: Remove unused methods	2016-02-10 14:35:13 +01:00
Tomasz Grabiec	b777cc9565	tests: Fix tests to not rely on key representation	2016-02-10 14:35:13 +01:00
Asias He	6d0407503b	locator: Do not generate wrap-around ranges Like we did in commit `d54c77d5d0`, make the remaining functions in abstract_replication_strategy return non-wrap-around ranges. This fixes: ERROR [shard 0] stream_session - [Stream #f0b7fda0-cf3e-11e5-b6c4-000000000000] stream_transfer_task: Fail to send to 127.0.0.4:0: std::runtime_error (Not implemented: WRAP_AROUND) in streaming. Message-Id: <514d2a9a1d3b868d213464c8858ac5162c0338d8.1455093643.git.asias@scylladb.com>	2016-02-10 10:03:31 +01:00
Avi Kivity	fc6159e2b9	key: tighten partition_key::representation() to return a const managed_bytes& The conversion to bytes_view can fail if the key is scattered; so defer that conversion until later. In a later patch we will intervene before the conversion to ensure the data is linearized.	2016-02-09 19:55:13 +02:00
Avi Kivity	3c60310e38	key: relax some APIs to accept partition_key_view instead of const partition_key& Using a partition_key_view can save an allocation in some cases. We will make use of it when we linearize a partition_key; during the process we are given a simple byte pointer, and constructing a partition_key from that requires an allocation.	2016-02-09 19:55:13 +02:00
Avi Kivity	af8ef54d5a	managed_bytes: introduce with_linearized_managed_bytes() A large managed_bytes blob can be scattered in lsa memory. Usually this is fine, but sometimes we want to examine it in place without copying it out, while using contiguous iterators for efficiency. For this use case, introduce with_linearized_managed_bytes(Func), which runs a function in a "linearization context". Within the linearization context, reads of managed_bytes objects will see temporarily linearized copies instead of scattered data.	2016-02-09 19:55:13 +02:00
Avi Kivity	9f3061ade8	Revert "streaming: Send mutations on all shards" This reverts commit `31d439213c`. Fixes #894. Conflicts: streaming/stream_manager.cc (may have undone part of `63a5aa6122`)	2016-02-09 18:26:14 +02:00
Calle Wilund	18203a4244	database::truncate/drop: Move time stamp generation to shard Fixes #884 Time stamps for truncation must be generated after flush, either by splitting the truncate into two (or more) for-each-shard operations, or simply by doing time stamping per shard (this solution). We generate TS on each shard after flushing, and then rely on the actual stored value to be the highest time point generated. This should however, from batch replay point of view, be functionally equivalent. And not a problem.	2016-02-09 15:45:37 +00:00
Calle Wilund	ce66acc771	system_keyspace: Always retain highest truncation time stamp Since the table is written from all shards, and we possibly might have conflicting time stamps, we define the truncated_at time as the highest time point. I.e. conservative.	2016-02-09 15:45:37 +00:00
Calle Wilund	22a38f0025	db/serializer: Fix db::serializer<replay_position> format Should match struct/"official" serial format. (64+32) This serializer is however not really used any more and could be removed.	2016-02-09 15:45:37 +00:00
Calle Wilund	1c213e1f38	system_keyspace: Use IDL types + better verification of truncation record Truncation records are not portable between us and Origin. We need to detect and ensure we neither try to use, and more to the point, don't crash because of data format error when loading, origin records from a migrated system. This problem was seen by Tzach when doing a migration from an origin setup. Updated record storage to use IDL-serialized types + added versioning and magic marking + odd-size-checking to ensure we load only correct data. The code will also deal with records from an older version of scylla.	2016-02-09 15:45:37 +00:00
Calle Wilund	4d7289b275	serializer_impl: Add convenience wrapper for one-obj deserialization Akin to serialize_to_buffer	2016-02-09 13:55:33 +00:00
Calle Wilund	dff89fffcd	IDL: Add idl definitions for replay_position and truncation_record	2016-02-09 13:55:33 +00:00
Calle Wilund	873f87430d	database: Check sstable dir name UUID part when populating CF Fixes #870 Only load sstables from CF directories that match the current CF uuid. Message-Id: <1454938450-4338-1-git-send-email-calle@scylladb.com>	2016-02-08 14:48:19 +01:00
Avi Kivity	e5b72aedf1	managed_bytes: don't copy data during hashing	2016-02-08 12:43:05 +02:00
Avi Kivity	5d958db869	managed_bytes: fix operator== for fragmented blobs Must compare fragment by fragment.	2016-02-08 12:43:05 +02:00
Calle Wilund	2ffd7d7b99	stream_manager: Change construction to make gcc 4.9 happy gcc 4.9 complains about the type{ val, val } construction of type with implicit default constructor, i.e. member = initial declarations. gcc 5 does not (and possibly rightly so). However, we still (implicitly) claim to support gcc 4.9 so why not just change this particular instance. Message-Id: <1454921328-1106-1-git-send-email-calle@scylladb.com>	2016-02-08 10:54:48 +02:00
Paweł Dziepak	c90ec731c8	transport: do not close gate at connection shutdown connection::_pending_requests_gate is responsible for keeping connection objects alive as long as there are outstanding requests and is closed in connection::proccess() when needed. Closing it in connection::shutdown() as well may cause the gate to be closed twice, which is a bug. Fixes #690. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1454596390-23239-1-git-send-email-pdziepak@scylladb.com>	2016-02-07 20:07:23 +02:00
Avi Kivity	8b0a26f06d	build: support for alternative versions of libsystemd pkgconfig While pkgconfig is supposed to be a distribution and version neutral way of detecting packages, it doesn't always work this way. The sd_notify() manual page documents that sd_notify is available via the libsystemd package, but on centos 7.0 it is only available via the libsystemd-daemon package (on centos 7.1+ it works as expected). Fix by allowing for alternate version of package names, testing each one until a match is found. Fixes #879. Message-Id: <1454858862-5239-1-git-send-email-avi@scylladb.com>	2016-02-07 17:36:57 +02:00
Avi Kivity	ad58663c96	row_cache: reindent	2016-02-07 13:25:29 +02:00
Asias He	31d439213c	streaming: Send mutations on all shards Currently, only the shard where the stream_plan is created will send streaming mutations. To utilize all the available cores, we can make each shard send the mutations it is responsible for. On the receiver side, we do not forward the mutations to the shard where the stream_session is created, so that we can avoid unnecessary forwarding. Note: the downside is that it is now harder to: 1) track the number of bytes sent and received 2) update the keep alive timer upon receipt of the STREAM_MUTATION To fix, we now store the sent/received bytes info on all shards. When the keep alive timer expires, we check if any progress has been made. Hopefully, this patch will make the streaming much faster and in turn make the repair/decommission/adding a node faster. Refs: https://github.com/scylladb/scylla/issues/849 Tested with decommission/repair dtest. Message-Id: <96b419ab11b736a297edd54a0b455ffdc2511ac5.1454645370.git.asias@scylladb.com>	2016-02-07 10:57:51 +02:00
Gleb Natapov	63a5aa6122	prevent superfluous frozen_mutation copying Sometimes frozen_mutation is copied while it can be moved instead. Fix those cases. Message-Id: <20160204165708.GI6705@scylladb.com>	2016-02-07 10:54:16 +02:00
Erich Keane	4197ceeedb	raw_statement::is_reversed rewrite to avoid VLA The is_reversed function uses a variable length array, which isn't spec-abiding C++. Additionally, the Clang compiler doesn't allow them with non-POD types, so this function wouldn't compile. After reading through the function it seems that the array wasn't necessary as the check could be calculated inline rather than separately. This version should be more performant (since it no longer requires the VLA lookup performance hit) while taking up less memory in all but the smallest of edge-cases (when the clustering_key_size * sizeof(optional<bool>) < sizeof(size_type) - sizeof(uint32_t) + sizeof(bool)). This patch uses relation_order_unsupported to ensure that the exception order is consistent with the previous version. The throw would otherwise be moved into the initial for-loop. There are two deviations in behavior: The first is the initial assert. It however should not change the apparent behavior besides causing orderings() to be looked up 2x in debug situations. The second is the conversion of is_reversed_ from an optional to a bool. The result is that the final return value is now well-defined to be false in the release-condition where orderings().size() == 0, rather than be the ill-defined *is_reversed_ that was there previously. Signed-off-by: Erich Keane <erich.keane@verizon.net> Message-Id: <1454546285-16076-4-git-send-email-erich.keane@verizon.net>	2016-02-07 10:38:17 +02:00
Erich Keane	49842aacd9	managed_vector: maybe_constructed ctor to non-constexpr Clang enforces that a union's constexpr CTOR must initialize one of the members. The spec is seemingly silent as to what the rule on this is, however, making this non-constexpr results in clang accepting the constructor. Signed-off-by: Erich Keane <erich.keane@verizon.net> Message-Id: <1454604300-1673-1-git-send-email-erich.keane@verizon.net>	2016-02-07 10:30:45 +02:00
Erich Keane	e87019843f	Fix PHI_FACTOR definition to be spec compliant PHI_FACTOR is a constexpr variable that is defined using std::log. Though G++ has a constexpr version of std::log, this itself is not spec compliant (in fact, Clang enforces this). See C++ Spec 26.8 for the definition of std::log and 17.6.5.6 for the rule regarding adding constexpr where it isn't specified. This patch replaces the std::log statement with a version from math.h that contains the exact value (M_LOG10El). Signed-off-by: Erich Keane <erich.keane@verizon.net> Message-Id: <1454603285-32677-1-git-send-email-erich.keane@verizon.net>	2016-02-04 18:33:44 +02:00
Avi Kivity	c85f6c4df1	Merge seastar upstream * seastar 661ccd9...14c9991 (1): > reactor: use correct open_flags when opening a file without DMA support Fixes #871.	2016-02-04 18:17:04 +02:00
Gleb Natapov	77d47c0c4b	optimize serialization of array/vector of integral types Array of integral types on little endian machine can be memcpyed into/out of a buffer instead of serialized/deserialized element by element. Message-Id: <20160204155425.GC6705@scylladb.com>	2016-02-04 18:01:14 +02:00
Avi Kivity	91fbb81477	Merge seastar upstream * seastar f8beab9...661ccd9 (1): > Merge "Use swapcontext() with AddressSanitizer" from Paweł	2016-02-04 17:30:15 +02:00
Paweł Dziepak	ababdfc9e2	tests/batchlog: use proper batchlog version Since `42e3999a00` "Check batchlog version before replaying" there is a version check in batchlog replay. However, the test wasn't updated and still used some arbitrary version number which caused it to fail. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1454595368-21670-1-git-send-email-pdziepak@scylladb.com>	2016-02-04 16:50:45 +02:00
Gleb Natapov	049ae37d08	storage_proxy: change collectd to show foreground mutation instead of overall mutation count It is much easier to see what is going on this way otherwise graphs for bg mutations and overall mutations are very close with usual scaling for many workloads. Message-Id: <20160204083452.GH6705@scylladb.com>	2016-02-04 14:58:56 +02:00
Gleb Natapov	a9e4afd8d2	Drop query-result.hh from database.hh It is not needed there but causes a lot of recompilation when changed. Message-Id: <1454496142-14537-3-git-send-email-gleb@scylladb.com>	2016-02-04 13:22:27 +02:00
Gleb Natapov	2ae1ae2d18	Cleanup messaging_service.hh includes a bit. Forward declare some classes instead. Message-Id: <1454496142-14537-2-git-send-email-gleb@scylladb.com>	2016-02-04 13:22:24 +02:00
Avi Kivity	f3ca597a01	Merge "Sstable cleanup fixes" from Tomasz " - Added waiting for async cleanup on clean shutdown - Crash in the middle of sstable removal doesn't leave system in a non-bootable state"	2016-02-04 12:36:13 +02:00
Tomasz Grabiec	c7ef3703cc	sstable: Make sstable deletion never leave sstable set in a non-bootable state Refs #860 Refs #802 An sstable file set with any component missing is interpreted as a critical error during boot. Currently sstable removal procedure could leave the files in a non-bootable state if the process crashed after TOC was removed but before all components were removed as well. To solve this problem, start the removal by renaming the TOC file to a so called "temporary TOC". Upon boot such kind of TOC file is interpreted as an sstable which is safe to remove. This kind of TOC was added before to deal with a similar scenario but in the opposite direction - when writing a new sstable.	2016-02-03 17:36:17 +01:00
Tomasz Grabiec	c8a98b487c	sstables: Remove coupling-hiding duplication	2016-02-03 17:36:17 +01:00
Tomasz Grabiec	355874281a	sstables: Do not register exit hooks from static initializer Fixes #868. Registering exit hooks while reactor is already iterating over exit hooks is not allowed and currently leads to undefined behavior observed in #868. While we should make the failure more user friendly, registering exit hooks concurrently with shutdown will not be allowed. We don't expect exit hooks to be registered after exit starts because this would violate the guarantee which says that exit hooks are executed in reverse order of registration. Starting exit sequence in the middle of initialization sequence would result in use after free errors. Btw, I'm not sure if currently there's anything which prevents this. To solve this problem, move the exit hook to initialization sequence. In case of tests, the cleanup has to be called explicitly.	2016-02-03 17:35:50 +01:00
Tomasz Grabiec	136c9d9247	sstables: Improve error message in case of generation duplication Refs #870.	2016-02-03 17:35:50 +01:00
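
Commit 5d958db869 above notes that equality on fragmented blobs "must compare fragment by fragment". The idea can be illustrated with a minimal sketch. It is not ScyllaDB's managed_bytes code: fragmented_blob, total_size and fragments_equal are hypothetical names, and a vector of strings stands in for a blob scattered across memory. The point it shows is that two blobs with the same logical content may be split at different boundaries, so the comparison has to walk both fragment chains in lockstep rather than compare fragments pairwise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a scattered blob: the logical content is the
// concatenation of all fragments. This is NOT ScyllaDB's managed_bytes.
using fragmented_blob = std::vector<std::string>;

// Total number of bytes across all fragments.
std::size_t total_size(const fragmented_blob& b) {
    std::size_t n = 0;
    for (const auto& f : b) {
        n += f.size();
    }
    return n;
}

// Compare logical contents without copying either blob into one buffer.
bool fragments_equal(const fragmented_blob& a, const fragmented_blob& b) {
    if (total_size(a) != total_size(b)) {
        return false;
    }
    std::size_t ai = 0, bi = 0;  // index of the current fragment on each side
    std::size_t ao = 0, bo = 0;  // offset inside the current fragment
    while (true) {
        // Step over fragments that are empty or fully consumed.
        while (ai < a.size() && ao == a[ai].size()) { ++ai; ao = 0; }
        while (bi < b.size() && bo == b[bi].size()) { ++bi; bo = 0; }
        if (ai == a.size() || bi == b.size()) {
            break;
        }
        // Compare the longest run that is contiguous on both sides.
        const std::size_t n = std::min(a[ai].size() - ao, b[bi].size() - bo);
        const char* pa = a[ai].data() + ao;
        const char* pb = b[bi].data() + bo;
        if (!std::equal(pa, pa + n, pb)) {
            return false;
        }
        ao += n;
        bo += n;
    }
    // Sizes matched up front, so both sides are exhausted together here.
    assert(ai == a.size() && bi == b.size());
    return true;
}
```

With this sketch, fragments_equal({"ab", "cde"}, {"abc", "de"}) holds even though no pair of fragments matches, which is the case a naive whole-fragment comparison would get wrong.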

11716 Commits