scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 10:00:35 +00:00

Author	SHA1	Message	Date
Asias He	edd72e10ac	repair: Introduce get_sync_boundary_response The return value of the REPAIR_GET_SYNC_BOUNDARY verb. It will be used in the row level repair code soon.	2018-12-12 16:49:01 +08:00
Asias He	95b9a889cf	repair: Introduce repair_hash It represents the hash value of a repair row.	2018-12-12 16:49:01 +08:00
Asias He	3e86b7a646	repair: Introduce repair_sync_boundary Represent a position of a mutation_fragment read from a flat mutation reader. Repair nodes negotiate a small sub range identified by two repair_sync_boundary to work on in each round.	2018-12-12 16:49:01 +08:00
Asias He	063dfcda26	messaging_service: Add constructor for msg_addr Which takes the ip address and shard id.	2018-12-12 16:49:01 +08:00
Asias He	8cb3ea98d0	xx_hasher: Allow specifying seed It will be used by row level repair.	2018-12-12 16:49:01 +08:00
Asias He	165d3053b1	position_in_partition: Add get_type, get_bound_weight and get_clustering_key_prefix Needed by the RPC serialization code.	2018-12-12 16:49:01 +08:00
Asias He	4e55d22a8f	position_in_partition: Switch _bound_weight to use enum The _bound_weight in position_in_partition will be sent on wire in rpc. Make it enum instead of int.	2018-12-12 16:49:01 +08:00
Asias He	5bc109e1ee	position_in_partition: Add bound_weight It will be used to change _bound_weight to use enum instead of int8_t.	2018-12-12 16:49:01 +08:00
Asias He	05c663b932	position_in_partition: Use std::optional for clustering_key_prefix The new row level repair code will access clustering_key_prefix and it uses std::optional everywhere. Convert position_in_partition to use std::optional.	2018-12-12 16:49:01 +08:00
Asias He	0b31d7059b	position_in_partition: Make partition_region uint8_t It will be sent over rpc. Make the type explicit.	2018-12-12 16:49:01 +08:00
Asias He	dfd206b3a3	serializer: Add std::optional support	2018-12-12 16:49:01 +08:00
Asias He	3eecdc670f	serializer: Add std::list support Needed by the row level repair RPC verbs.	2018-12-12 16:49:01 +08:00
Asias He	b540df2819	serializer: Add std::unordered_set support Needed by the row level repair RPC verbs.	2018-12-12 16:49:01 +08:00
Asias He	1367c8c47e	dht: Add make_partitioner Given the name and shard count and the sharding_ignore_msb_bits, make a partitioner. It is used by row level repair.	2018-12-12 16:49:01 +08:00
Asias He	f1a914060b	dht: Add constructor for decorated_key which takes token and partition_key decorated_key(const dht::token& t, const partition_key& k)	2018-12-12 16:49:01 +08:00
Juliana Oliveira	5eb76c9bc6	compress: add support for Cassandra's compression parameter This patch adds compatibility for Cassandra's "chunk_size_in_kb", as well as it keeps Scylla's "chunk_size_kb" compression parameter. Fixes #3669 Tests: unit (release) v2: use variable instead of array v3: fix committed files Signed-off-by: Juliana Oliveira <juliana@scylladb.com> Message-Id: <20181211215840.GA7379@shenzou.localdomain>	2018-12-11 23:33:27 +00:00
Nadav Har'El	a0379209e6	secondary indexes: fail attempts to create a CUSTOM INDEX Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index with a custom implementation. The only custom implementation that Cassandra supports is SASI. But Scylla doesn't support this, or any other custom index implementation. If a CREATE CUSTOM INDEX statement is used, we shouldn't silently ignore the "CUSTOM" tag, we should generate an error. This patch also includes a regression test that "CREATE CUSTOM INDEX" statements with valid syntax fail (before this patch, they succeeded). Fixes #3977 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181211224545.18349-2-nyh@scylladb.com>	2018-12-11 23:33:02 +00:00
Nadav Har'El	36db4fba23	Fix typo in error message Interestingly, this typo was copied from the original Cassandra source code :-) Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20181211224545.18349-1-nyh@scylladb.com>	2018-12-11 23:32:58 +00:00
Avi Kivity	5b08e91bdb	tools: add SYS_PTRACE capability to dbuild LeakSanitizer uses ptrace, and docker disables ptrace by default. Add it back so tests pass. Message-Id: <20181208112524.19229-1-avi@scylladb.com>	2018-12-11 19:09:12 +00:00
Avi Kivity	34a31a807d	build: build libdeflate with user selected C compiler If the user specified a C compiler, use it to build libdeflate. Fixes #3978. Message-Id: <20181211145604.14847-1-avi@scylladb.com>	2018-12-11 14:58:16 +00:00
Duarte Nunes	89ae3fbf11	db/system_distributed_keyspace: Create the schema with min_timestamp Different nodes can concurrently create the distributed system keyspace on boot, before the "if not exists" clause can take effect. However, the resulting schema mutations will be different since different nodes use different timestamps. This patch forces the timestamps to be the same across all nodes, so we save some schema mismatches. This fixes a bug exposed by `ca5dfdf`, whereby the initialization of the distributed system keyspace is done before waiting for schema agreement. While waiting for schema agreement in storage_service::join_token_ring(), the node still hasn't joined the ring and schemas can't be pulled from it, so nodes can deadlock. A similar situation can happen between a seed node and a non-seed node, where the seed node progresses to a different "wait for schema agreement" barrier, but still can't make progress because it can't pull the schema from the non-seed node still trying to join the ring. Finally, it is assumed that changes to the schema of the current distributed system keyspace tables will be protected by a cluster feature and a subsequent schema synchronization, such that all nodes will be at a point where schemas can be transferred around. Fixes #3976 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20181211113407.20075-1-duarte@scylladb.com>	2018-12-11 13:35:48 +01:00
Paweł Dziepak	e3f53542c9	Merge "Optimize sstable writing of large partitions" from Tomasz " This series contains several optimizations of the MC format sstable writer, mainly: - Avoiding output_stream when serializing into memory (e.g. a row) - Faster serialization of primitive types when serializing into memory I measured the improvement in throughput (frag/s) using perf_fast_forward for datasets with a single large partition with many small rows: - 10% for a row with a single cell of 8 bytes - 10% for a row with a single cell of 100 bytes - 9% for a row with a single cell of 1000 bytes - 13% for a row with 6 cells of 100 bytes " * tag 'avoid-output-stream-in-sstable-writer-v2' of github.com:tgrabiec/scylla: bytes_ostream: Optimize writing of fixed-size types sstables: mc: Write temporary data to bytes_ostream rather than file_writer sstables: mc: Avoid double-serialization of a range tombstone marker sstables: file_writer: Generalize bytes& writer to accept bytes_view sstables: Templetize write() functions on the writer sstables: Turn m_format_write_helpers.cc into an impl header sstables: De-futurize file_writer bytes_ostream: Implement clear() bytes_ostream: Make initial chunk size configurable	2018-12-11 12:29:24 +00:00
Duarte Nunes	d66bd0100b	Merge 'Simplify db::extensions' from Avi " Carry out simplifications of db::extensions: less magical types, de-inline complex functions, and reduce #include dependencies Tests: unit(release) " * tag 'extensions-simplify/v1' of https://github.com/avikivity/scylla: extensions: remove unneeded includes extensions: deinline extension accessors extensions: return concrete types from the extension accessors extensions: remove dependency on cql layer	2018-12-10 22:00:51 +00:00
Avi Kivity	b251183359	extensions: remove unneeded includes <boost/any.hpp> is not used, and "schema.hh" can be replaced with forward declarations.	2018-12-10 21:34:09 +02:00
Avi Kivity	119a83bf2f	extensions: deinline extension accessors Quite complex code that is not performance sensitive. Move it out of line.	2018-12-10 21:22:56 +02:00
Avi Kivity	e9f5641b64	extensions: return concrete types from the extension accessors Returning "auto" makes it harder to understand what the function is returning, and impossible to de-inline. Return a vector of pointers instead. The caller should iterate immediately, in any case, and since the previous return value was a range of references to const unique_ptrs, nothing else could be done with it anyway.	2018-12-10 21:16:45 +02:00
Tomasz Grabiec	f206ef0038	bytes_ostream: Optimize writing of fixed-size types Inlining write() allows the writing code to be optimized for fixed-size types. In particular, memcpy() calls and loops will be eliminated. Saw 4% improvement in throughput in perf_fast_forward for tiny rows.	2018-12-10 20:08:16 +01:00
Tomasz Grabiec	5a35240d47	sstables: mc: Write temporary data to bytes_ostream rather than file_writer Currently temporary data is serialized into a file_writer, because that's what write() functions used to expect, which goes through an output_stream, a data_sink, into an in-memory data sink implementation which collects the temporary_buffers. Going through those abstractions is relatively expensive if we don't write much, because each time we begin to write after a flush() of the file_writer the output stream has to allocate a new buffer, which means a large allocation for small amount of data. We could avoid that and write into bytes_ostream directly, which will keep its buffer across clear(). write() functions which are used both to write directly into the data file and to a temporary arena were templatized to accept a Writer to which both file_writer and bytes_ostream conform.	2018-12-10 20:08:16 +01:00
Tomasz Grabiec	c4003b3e79	sstables: mc: Avoid double-serialization of a range tombstone marker	2018-12-10 20:08:16 +01:00
Tomasz Grabiec	9edb9434e5	sstables: file_writer: Generalize bytes& writer to accept bytes_view Note that bytes is implicitly convertible to bytes_view.	2018-12-10 20:08:16 +01:00
Tomasz Grabiec	fad4fba4bc	sstables: Templetize write() functions on the writer Will allow writing to both a file_writer, or an in-memory writer like a bytes_ostream.	2018-12-10 20:08:16 +01:00
Tomasz Grabiec	f4016996d3	sstables: Turn m_format_write_helpers.cc into an impl header I need to templatize functions defined in it and want to avoid explicit instantiations. There is only one compilation unit in which this is used (sstables.cc). I think in the long term we should move all those "helpers" into sstables/mc/writer.{cc,hh} together with their only user, the sstable_writer_m class from sstables.cc.	2018-12-10 20:07:43 +01:00
Tomasz Grabiec	13999a4d09	sstables: De-futurize file_writer	2018-12-10 20:07:43 +01:00
Tomasz Grabiec	a1fb441df8	bytes_ostream: Implement clear()	2018-12-10 20:07:43 +01:00
Tomasz Grabiec	7cf5de3d9c	bytes_ostream: Make initial chunk size configurable	2018-12-10 20:07:43 +01:00
Avi Kivity	8e05bcbe71	extensions: remove dependency on cql layer The extensions class reaches into cql's property_definitions class to grab a map<sstring, sstring> type. This generates a few unneeded dependencies. Reduce dependencies by defining the map type ourselves; if cql's property_definitions changes in an incompatible way, it will have to adapt, rather than the extensions class.	2018-12-10 20:55:30 +02:00
Tomasz Grabiec	1dd2bf52ca	Merge "Add a couple of tests of broken sstables" From Rafael These are the current uninteresting cases I found when looking at malformed_sstable_exception. The existing code is working, just not being tested. * https://github.com/espindola/scylla.git espindola/espindola/broken-sst: Add a broken sstable test. Add a test with mismatched schema.	2018-12-10 19:30:58 +01:00
Tomasz Grabiec	538e041f22	Merge "Remove some dependencies on db::config" from Avi db::config is a global class; changes in any module can cause changes in db::config. Therefore, it is a cause of needless recompilation. Remove some of these dependencies by having consumers of db::config declare an intermediate config struct that contains only configuration of interest to them, and have their caller fill it out (in the case of auth, it already followed this scheme and the patchset only moves the translation function). In addition, some outright pointless inclusions of db/config.hh are removed. The result is somewhat shorter compile times, and fewer needless recompiles. * https://github.com/avikivity/scylla unconfig-1/v1: config: remove inclusions of db/config.hh from header files repair: remove unneeded config.hh inclusion batchlog_manager: remove dependency on db::config auth: remove permissions_cache dependency on db::config auth: remove auth::service dependency on db::config auth: remove unneeded db/config.hh includes	2018-12-10 14:53:14 +01:00
Benny Halevy	ef53ddf3ae	scylla_io_setup: correct units in low space warning GiB -> GB Refs #2676 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20181210092503.10344-1-bhalevy@scylladb.com>	2018-12-10 13:58:49 +02:00
Avi Kivity	475b151c97	Merge "Use utils::small_vector more in read path" from Paweł " This series optimises the read path by replacing some usages of std::vector by utils::small_vector. The motivation for this change was an observation that memory allocation functions are pointed out by the profiler as the ones where we spent most time and while they have a large number of callers storage allocation for some vectors was close to the top. The gains are not huge, since the problem is a lot of things adding up and not a single slow thing, but we need to start with something. Unfortunately, the performance of boost::container::small_vector is quite disappointing so a new implementation of a small_vector was introduced. perf_simple_query -c4 --duration 60, medians: ./perf_before ./perf_after diff read 343086.80 360720.53 5.1% Tests: unit(release, small_vector in debug) " * tag 'small_vector/v2.1' of https://github.com/pdziepak/scylla: partition_slice: use small_vector for column_ids mutation_fragment_merger: use small_vector auth: use small_vector in resource auth: avoid list-initialisation of vectors idl: serialiser: add serialiser for utils::small_vector idl: serialiser: deduplicate vector serialisers utils: introduce small_vector intrusive_set_external_comparator: make iterator nothrow move constructible mutation_fragment_merger: value-initialise iterator	2018-12-10 13:50:59 +02:00
Duarte Nunes	a42b2895c2	Merge branch 'gossip: Send node UP event to cql client after cql server is up' from Asias " This is a backport of CASSANDRA-8236. Before this patch, scylla sends the node UP event to cql client when it sees a new node join the cluster, i.e., when a new node's status becomes NORMAL. The problem is, at this time, the cql server might not be ready yet. Once the client receives the UP event, it tries to connect to the new node's cql port and fails. To fix, a new application_state::RPC_READY is introduced, new node sets RPC_READY to false when it starts gossip in the very beginning and sets RPC_READY to true when the cql server is ready. The RPC_READY is a bad name but I think it is better to follow Cassandra. Nodes with or without this patch are supposed to work together with no problem. Refs #3843 " * 'asias/node_up_down.upstream.v4.1' of github.com:scylladb/seastar-dev: storage_service: Use cql_ready facility storage_service: Handle application_state::RPC_READY storage_service: Add notify_cql_change storage_service: Add debug log in notify_joined storage_service: Add extra check in notify_joined storage_service: Add notify_joined storage_service: Add debug log in notify_up storage_service: Add extra check in notify_up storage_service: Add notify_up storage_service: Make notify_left log debug level storage_service: Introduce notify_left storage_service: Add debug log in notify_down storage_service: Introduce notify_down storage_service: Add set_cql_ready gossip: Add gossiper::is_cql_ready gms: Add endpoint_state::is_cql_ready gms: Add application_state::RPC_READY gms: Introduce cql_ready in versioned_value	2018-12-10 11:37:59 +00:00
Asias He	06dc9b8da0	storage_service: Use cql_ready facility At this point the cql_ready facility is ready. To use it, advertise the RPC_READY application state in the following cases: - When a node boots, set it to false - When cql server is ready, set it to true - When cql server is down, set it to false	2018-12-10 19:20:20 +08:00
Asias He	4761b53035	storage_service: Handle application_state::RPC_READY	2018-12-10 19:20:20 +08:00
Asias He	0e64814206	storage_service: Add notify_cql_change It is called when a RPC_READY gossip application state is received.	2018-12-10 19:20:20 +08:00
Asias He	a1bbd7bcc7	storage_service: Add debug log in notify_joined	2018-12-10 19:20:20 +08:00
Asias He	17d68cb408	storage_service: Add extra check in notify_joined Do not send node joined event if node is not in NORMAL status which means the node has joined the cluster officially.	2018-12-10 19:20:20 +08:00
Asias He	9abb15192f	storage_service: Add notify_joined Add a helper for node joined event.	2018-12-10 19:20:20 +08:00
Asias He	60c74431f7	storage_service: Add debug log in notify_up	2018-12-10 19:20:20 +08:00
Asias He	948d2b6c78	storage_service: Add extra check in notify_up Do not send up event if is_cql_ready is false which means cql server is not ready yet or node is down.	2018-12-10 19:20:20 +08:00
Asias He	48cd31dc1e	storage_service: Add notify_up Add a helper for node up event.	2018-12-10 19:20:20 +08:00


17341 Commits