scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Author	SHA1	Message	Date
Glauber Costa	e885eacbe4	column_family: do not open code generation calculation We already have a function that wraps this, re-use it. This FIXME is still relevant, so just move it there. Let's not lose it. Signed-off-by: Glauber Costa <glauber@scylladb.com> (cherry picked from commit `94e90d4a17`)	2016-03-14 15:51:06 +02:00
Glauber Costa	3f67277804	colum_family: remove mutation_count We use memory usage as a threshold these days, and nowhere is _mutation_count checked. Get rid of it. Signed-off-by: Glauber Costa <glauber@scylladb.com> (cherry picked from commit `46fdeec60a`)	2016-03-14 15:50:57 +02:00
Asias He	05aea2b65a	storage_service: Fix pending_range_calculator_service Since calculate_pending_ranges will modify token_metadata, we need to replicate to other shards. With this patch, when we call calculate_pending_ranges, token_metadata will be replicated to other non-zero shards. In addition, it is not useful as a standalone class. We can merge it into the storage_service. Kill one singleton class. Fixes #1033 Refs #962 Message-Id: <fb5b26311cafa4d315eb9e72d823c5ade2ab4bda.1457943074.git.asias@scylladb.com> (cherry picked from commit `9f64c36a08`)	2016-03-14 14:39:39 +02:00
Vlad Zolotarov	a2751a9592	sstables: properly account removal requests The same shard may create an sstables::sstable object for the same SStable that doesn't belong to it more than once and mark it for deletion (e.g. in a 'nodetool refresh' flow). In that case the destructor of sstables::sstable accounted the deletion requests from the same shard more than once since it was a simple counter incremented each time there was a deletion request while it should account request from the same shard as a single request. This is because the removal logic waited for all shards to agree on a removal of a specific SStable by comparing the counter mentioned above to the total number of shards and once they were equal the SStable files were actually removed. This patch fixes this by replacing the counter by an std::unordered_set<unsigned> that will store a shard ids of the shards requesting the deletion of the sstable object and will compare the size() of this set to smp::count in order to decide whether to actually delete the corresponding SStable files. Fixes #1004 Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com> Message-Id: <1457886812-32345-1-git-send-email-vladz@cloudius-systems.com> (cherry picked from commit `ce47fcb1ba`)	2016-03-14 14:38:17 +02:00
Raphael S. Carvalho	eda8732b8e	sstables: make write_simple() safer by using exclusive flag We should guarantee that write_simple() will not try to overwrite an existing file. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <194bd055f1f2dc1bb9766a67225ec38c88e7b005.1457818073.git.raphaelsc@scylladb.com> (cherry picked from commit `1ff7d32272`)	2016-03-14 14:38:07 +02:00
Raphael S. Carvalho	b24f5ece1f	sstables: fix race condition when writing to the same sstable in parallel When we are about to write a new sstable, we check if the sstable exists by checking if respective TOC exists. That check was added to handle a possible attempt to write a new sstable with a generation being used. Gleb was worried that a TOC could appear after the check, and that's indeed possible if there is an ongoing sstable write that uses the same generation (running in parallel). If TOC appear after the check, we would again crap an existing sstable with a temporary, and user wouldn't be to boot scylla anymore without manual intervention. Then Nadav proposed the following solution: "We could do this by the following variant of Raphael's idea: 1. create .txt.tmp unconditionally, as before the commit `031bf57c1` (if we can't create it, fail). 2. Now confirm that .txt does not exist. If it does, delete the .txt.tmp we just created and fail. 3. continue as usual 4. and at the end, as before, rename .txt.tmp to .txt. The key to solving the race is step 1: Since we created .txt.tmp in step 1 and know this creation succeeded, we know that we cannot be running in parallel with another writer - because such a writer too would have tried to create the same file, and kept it existing until the very last step of its work (step 4)." This patch implements the solution described above. Let me also say that the race is theoretical and scylla wasn't affected by it so far. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <ef630f5ac1bd0d11632c343d9f77a5f6810d18c1.1457818331.git.raphaelsc@scylladb.com> (cherry picked from commit `0af786f3ea`)	2016-03-14 14:37:58 +02:00
Raphael S. Carvalho	1322ec6d6b	sstables: bail out if toc exists for generation used by write_components Currently, if sstable::write_components() is called to write a new sstable using the same generation of a sstable that exists, a temporary TOC will be unconditionally created. Afterwards, the same sstable::write_components() will fail when it reaches sstable::create_data(). The reason is obvious because data component exists for that generation (in this scenario). After that, user will not be able to boot scylla anymore because there is a generation with both a TOC and a temporary TOC. We cannot simply remove a generation with TOC and temporary TOC because user data will be lost (again, in this scenario). After all, the temporary TOC was only created because sstable::write_components() was wrongly called with the generation of a sstable that exists. Solution proposed by this patch is to trigger exception if a TOC file exists for the generation used. Some SSTable unit tests were also changed to guarantee that we don't try to overwrite components of an existing sstable. Refs #1014. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <caffc4e19cdcf25e4c6b9dd277d115422f8246c4.1457643565.git.raphaelsc@scylladb.com> (cherry picked from commit `031bf57c19`)	2016-03-14 14:37:50 +02:00
Glauber Costa	efbf51c00b	sstables: improve error messages The standard C++ exception messages that will be thrown if there is anything wrong writing the file, are suboptimal: they barely tell us the name of the failing file. Use a specialized create function so that we can capture that better. Signed-off-by: Glauber Costa <glauber@scylladb.com> (cherry picked from commit `f2a8bcabc2`)	2016-03-14 14:37:41 +02:00
Pekka Enberg	5d901b19c4	main: Initialize system keyspace earlier We start services like gossiper before system keyspace is initialized which means we can start writing too early. Shuffle code so that system keyspace is initialized earlier. Refs #1014 Message-Id: <1457593758-9444-1-git-send-email-penberg@scylladb.com> (cherry picked from commit `5dd1fda6cf`)	2016-03-14 13:47:18 +02:00
Tomasz Grabiec	7085fc95d1	log: Fix operator<<(std::ostream&, const std::exception_ptr&) Attempt to print std::nested_exception currently results in exception to leak outside the printer. Fix by capturing all exception in the final catch block. For nested exception, the logger will print now just "std::nested_exception". For nested exceptions specifically we should log more, but that is a separate problem to solve. Message-Id: <1457532215-7498-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `838a038cbd`)	2016-03-09 16:11:14 +02:00
Pekka Enberg	776908fbf6	types: Implement to_string for timestamps and dates The to_string() function is used for logging purpose so use boost to_iso_extended_string() to format both timestamps and dates. Fixes #968 (showstopper) Message-Id: <1457528755-6164-1-git-send-email-penberg@scylladb.com> (cherry picked from commit `ab502bcfa8`)	2016-03-09 16:10:02 +02:00
Gleb Natapov	5f7f276ef6	fix EACH_QUORUM handling during bootstrapping Currently write acknowledgements handling does not take bootstrapping node into account for CL=EACH_QUORUM. The patch fixes it. Fixes #994 Message-Id: <20160307121620.GR2253@scylladb.com> (cherry picked from commit `626c9d046b`)	2016-03-08 13:35:10 +02:00
Paweł Dziepak	5a38f3cbfd	lsa: set _active to nullptr in region destructor In region destructor, after active segments is freed pointer to it is left unchanged. This confuses the remaining parts of the destructor logic (namely, removal from region group) which may rely on the information in region_impl::_active. In this particular case the problem was that code removing from the region group called region_impl::occupancy() which was dereferencing _active if not null. Fixes #993. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1457341670-18266-1-git-send-email-pdziepak@scylladb.com> (cherry picked from commit `99b61d3944`)	2016-03-08 13:32:30 +02:00
Tomasz Grabiec	2d4309a926	validation: Fix validation of empty partition key The validation was wrongly assuming that empty thrift key, for which the original C* code guards against, can only correspond to empty representation of our partition_key. This no longer holds after: commit `095efd01d6` "keys: Make from_exploded() and components() work without schema" This was responsible for dtest failure: cql_additional_tests.TestCQL:column_name_validation_test (cherry picked from commit `100b540a53`)	2016-03-08 11:42:14 +02:00
Tomasz Grabiec	988d6cd153	cql3: Fix handling of lists with static columns List operations and prefetching were not handling static columns correctly. One issue was that prefetching was attaching static column data to row data using ids which might overlap with clustered columns. Another problem was that list operations were always constructing clustering key even if they worked on a static column. For static columns the key would be always empty and lookup would fail. The effect was that list operations which depend on current state had no effect. Similar problem could be observed on C* 2.1.9, but not on 2.2.3. Fixes #903. (cherry picked from commit `383296c05b`)	2016-03-06 11:06:03 +02:00
Pekka Enberg	bf71575fd7	release: prepare for 0.18.1 scylla-0.18.1	2016-03-05 08:53:07 +02:00
Gleb Natapov	cd75075214	storage_proxy: fix race between read cl completion and timeout in digest resolver If timeout happens after cl promise is fulfilled, but before continuation runs it removes all the data that cl continuation needs to calculate result. Fix this by calculating result immediately and returning it in cl promise instead of delaying this work until continuation runs. This has a nice side effect of simplifying digest mismatch handling and making it exception free. Fixes #977. Message-Id: <1457015870-2106-3-git-send-email-gleb@scylladb.com> (cherry picked from commit `b89b6f442b`)	2016-03-03 17:10:38 +02:00
Gleb Natapov	e85f11566b	storage_proxy: store only one data reply in digest resolver. Read executor may ask for more than one data reply during digest resolving stage, but only one result is actually needed to satisfy a query, so no need to store all of them. Message-Id: <1457015870-2106-2-git-send-email-gleb@scylladb.com> (cherry picked from commit `e4ac5157bc`)	2016-03-03 17:10:32 +02:00
Gleb Natapov	8f682f018e	storage_proxy: fix cl achieved condition in digest resolver timeout handler In digest resolver for cl to be achieved it is not enough to get correct number of replies, but also to have data reply among them. The condition in digest timeout does not check that, fortunately we have a variable that we set to true when cl is achieved, so use it instead. Message-Id: <1457015870-2106-1-git-send-email-gleb@scylladb.com> (cherry picked from commit `69b61b81ce`)	2016-03-03 17:10:26 +02:00
Tomasz Grabiec	dba2b617e7	db: Fix error handling in populate_keyspace() When find_uuid() fails Scylla would terminate with: Exiting on unhandled exception of type 'std::out_of_range': _Map_base::at But we are supposed to ignore directories for unknown column families. The try {} catch block is doing just that when no_such_column_family is thrown from the find_column_family() call which follows find_uuid(). Fix by converting std::out_of_range to no_such_column_family. Message-Id: <1456056280-3933-1-git-send-email-tgrabiec@scylladb.com>	2016-03-03 11:37:26 +02:00
Paweł Dziepak	f4e11007cf	Revert "do not use boost::multiprecision::msb()" This reverts commit `dadd097f9c`. That commit caused serialized forms of varint and decimal to have some excess leading zeros. They didn't affect deserialization in any way but caused computed tokens to differ from the Cassandra ones. Fixes #898. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1455537278-20106-1-git-send-email-pdziepak@scylladb.com>	2016-03-03 10:54:19 +02:00
Asias He	fdfa1df395	locator: Fix get token from a range<token> With a range{t1, t2}, if t2 == {}, the range.end() will contain no value. Fix getting t2 in this case. Fixes #911. Message-Id: <4462e499d706d275c03b116c4645e8aaee7821e1.1456128310.git.asias@scylladb.com>	2016-03-03 10:53:21 +02:00
Tomasz Grabiec	116055cc6f	bytes_ostream: Avoid recursion when freeing chunks When there is a lot of chunks we may get stack overflow. This seems to fix issue #906, a memory corruption during schema merge. I suspect that what causes corruption there is overflowing of the stack allocated for the seastar thread. Those stacks don't have red zones which would catch overflow. Message-Id: <1456056288-3983-1-git-send-email-tgrabiec@scylladb.com>	2016-03-03 10:53:01 +02:00
Calle Wilund	04c19344de	database: Fix use and assumptions about pending compations Fixes #934 - faulty assert in discard_sstables run_with_compaction_disabled clears out a CF from compaction mananger queue. discard_sstables wants to assert on this, but looks at the wrong counters. pending_compactions is an indicator on how much interested parties want a CF compacted (again and again). It should not be considered an indicator of compactions actually being done. This modifies the usage slightly so that: 1.) The counter is always incremented, even if compaction is disallowed. The counters value on end of run_with_compaction_disabled is then instead used as an indicator as to whether a compaction should be re-triggered. (If compactions finished, it will be zero) 2.) Document the use and purpose of the pending counter, and add method to re-add CF to compaction for r_w_c_d above. 3.) discard_sstables now asserts on the right things. Message-Id: <1456332824-23349-1-git-send-email-calle@scylladb.com>	2016-03-03 10:51:27 +02:00
Raphael S. Carvalho	df19e546f9	tests: sstable_test: submit compaction request through column family That's needed for reverted commit `9586793c` to work. It's also the correct thing to do, i.e. column family submits itself to manager. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <2a1d141ad929c1957933f57412083dd52af0390b.1456415398.git.raphaelsc@scylladb.com>	2016-03-03 10:51:23 +02:00
Takuya ASADA	b532919c55	dist: add posix_net_conf.sh on Ubuntu package Fixes #881 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1455522990-32044-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `fb3f4cc148`) scylla-0.18	2016-02-15 17:03:10 +02:00
Takuya ASADA	6ae6dcc2fc	dist: switch AMI base image to 'CentOS7-Base2', uses CentOS official kernel The previous CentOS base image accidentally used a non-standard kernel from elrepo. This replaces the base image with a new one that contains the CentOS default kernel. Fixes #890 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1455398903-2865-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `3697cee76d`)	2016-02-15 15:59:04 +02:00
Tomasz Grabiec	5716140a14	abstract_replication_strategy: Fix generation of token ranges We can't move-from in the loop because the subject will be empty in all but the first iteration. Fixes crash during node startup: "Exiting on unhandled exception of type 'runtime_exception': runtime error: Invalid token. Should have size 8, has size 0" Fixes update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_add_node_1_test (and probably others) Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com> (cherry picked from commit `efdbc3d6d7`)	2016-02-14 14:39:31 +02:00
Avi Kivity	91cb9bae2e	release: prepare for 0.18	2016-02-11 17:55:20 +02:00
Shlomi Livne	f938e1d303	dist: start scylla with SCYLLA_IO Signed-off-by: Shlomi Livne <shlomi@scylladb.com> Message-Id: <d93a7b41a285fcde796c5681479a328f1efac0c3.1455188901.git.shlomi@scylladb.com>	2016-02-11 17:01:03 +02:00
Shlomi Livne	5494135ddd	dist: update SCYLLA_IO with params for AMI Add setting of --num-io-queues, --max-io-requests for AMI Signed-off-by: Shlomi Livne <shlomi@scylladb.com> Message-Id: <b94a63154a91c8568e194d7221b9ffc7d7813ebc.1455188901.git.shlomi@scylladb.com>	2016-02-11 17:01:02 +02:00
Shlomi Livne	5cae2560a3	dist: introduce SCYLLA_IO Signed-off-by: Shlomi Livne <shlomi@scylladb.com> Message-Id: <6490d049fd23a335bb0a95cac3e8a4c08c61166e.1455188901.git.shlomi@scylladb.com>	2016-02-11 17:01:02 +02:00
Shlomi Livne	d8cdf76e70	dist: change setting of scylla home from "-d" to "-r" Signed-off-by: Shlomi Livne <shlomi@scylladb.com> Message-Id: <53dcd9d1daa0194de3f889b67788d9c21d1e474d.1455188901.git.shlomi@scylladb.com>	2016-02-11 17:00:37 +02:00
Avi Kivity	3c4f67f3e6	build: require boost > 1.55 See #898. Add checks both for boost being installed, and for the correct version. Message-Id: <1455193574-24959-1-git-send-email-avi@scylladb.com>	2016-02-11 15:15:49 +02:00
Avi Kivity	9249d45ae1	Update scylla-ami submodule * dist/ami/files/scylla-ami b2724be...b3b85be (1): > adding --stop-services	2016-02-11 12:24:17 +02:00
Avi Kivity	5834815ed9	Merge seastar upstream * seastar 14c9991...353b1a1 (2): > scripts: posix_net_conf.sh: Change the way we learn NIC's IRQ numbers > gate: protect against calling close() more than once	2016-02-11 12:23:51 +02:00
Takuya ASADA	09b1ec6103	dist: attach ephemeral disks on AMI by default To attach maximum number of ephemeral disks available on the instance, specify 8. On AMI creation, it will be reduced to the available number. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1454439628-2882-1-git-send-email-syuu@scylladb.com>	2016-02-11 12:21:09 +02:00
Takuya ASADA	16e6db42e1	dist: abandon to start scylla-server when it's disabled from AMI userdata Support AMI's --stop-services, prevent startup of scylla-server (and scylla-jmx, since it's dependent on scylla-server) Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1454492729-11876-1-git-send-email-syuu@scylladb.com>	2016-02-11 12:21:08 +02:00
Takuya ASADA	f227b3faac	dist: On AMI, mark root disk with delete_on_termination Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1454513308-12384-1-git-send-email-syuu@scylladb.com>	2016-02-11 12:19:28 +02:00
Takuya ASADA	33309f667e	dist: enable enhanced networking on AMI Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1454971289-21369-1-git-send-email-syuu@scylladb.com>	2016-02-11 12:18:48 +02:00
Raphael S. Carvalho	ed61fe5831	sstables: make compaction stop report user-friendly When scylla stopped an ongoing compaction, the event was reported as an error. This patch introduces a specialized exception for compaction stop so that the event can be handled appropriately. Before: ERROR [shard 0] compaction_manager - compaction failed: read exception: std::runtime_error (Compaction for keyspace1/standard1 was deliberately stopped.) After: INFO [shard 0] compaction_manager - compaction info: Compaction for keyspace1/standard1 was stopped due to shutdown. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <1f85d4e5c24d23a1b4e7e0370a2cffc97cbc6d44.1455034236.git.raphaelsc@scylladb.com>	2016-02-11 12:16:53 +02:00
Takuya ASADA	8d8130f9c9	dist: fix typo on build_ami.sh We should always run scylla_setup, not just for locally built rpm Fixes #897 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1455103519-13780-1-git-send-email-syuu@scylladb.com>	2016-02-11 11:56:11 +02:00
Shlomi Livne	64f8d5a50e	dist: update packer location Signed-off-by: Shlomi Livne <shlomi@scylladb.com> Message-Id: <3c33ea073f702e00b789930fce9befef03ad9e88.1455178900.git.shlomi@scylladb.com>	2016-02-11 11:52:56 +02:00
Avi Kivity	bfbf89ee31	Merge "Serialize keys in a form independent of in-memory representation" from Tomasz "This series changes the on-wire definitions of keys to be of the following form: class partition_key { std::vector<bytes> exploded(); }; Keys are therefore collections of components. The components are serialized according to the format specified in the CQL binary protocol. No bit depends now on how we store keys in memory. Constructing keys from components currently requires a schema reference, which makes it not possible to deserialize or serialize the keys automatically by RPC. To avoid those complications, compound_type was changed so that it can be constructed and components can be iterated over without schema. Because of this, partition_key size increased by 2 bytes."	2016-02-10 17:54:42 +02:00
Tomasz Grabiec	b74301302c	tests: Add test for key serialization	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	3e2c1840d8	idl: Make key definitions independent of in-memory representation	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	428fce3828	compound: Optimize serialize_single()	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	0cc2832a76	keys: Allow constructing from a range	2016-02-10 15:22:56 +01:00
Tomasz Grabiec	3ffcb998fb	keys: Enable serialization from a range not just a vector	2016-02-10 14:35:14 +01:00
Tomasz Grabiec	095efd01d6	keys: Make from_exploded() and components() work without schema For simplicity, we want to have keys serializable and deserializable without schema for now. We will serialize keys in a generic form of a vector of components where the format of components is specified by CQL binary protocol. So conversion between keys and vector of components needs to be possible to do without schema. We may want to make keys schema-dependent back in the future to apply space optimizations specific to column types. Existing code should still pass schema& to construct and access the key when possible. One optimization had to be reverted in this change - avoidance of storing key length (2 bytes) for single-component partition keys. One consequence of this, in addition to a bit larger keys, is that we can no longer avoid copy when constructing single-component partition keys from a ready "bytes" object. I haven't noticed any significant performance difference in: tests/perf/perf_simple_query -c1 --write It does ~130K tps on my machine.	2016-02-10 14:35:13 +01:00
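
The four-step temporary-file protocol quoted in commit b24f5ece1f above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the Scylla tree: the helper name `write_file_via_tmp`, the use of plain POSIX calls, and the `O_EXCL` flag for step 1 are all choices made for this sketch (the real implementation is asynchronous and works on sstable components).

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper for illustration only; not a function from the Scylla tree.
// Writes `contents` to `final_path` through `final_path + ".tmp"`.
void write_file_via_tmp(const std::string& final_path, const std::string& contents) {
    assert(!final_path.empty());
    const std::string tmp_path = final_path + ".tmp";

    // Step 1: create the temporary file. O_EXCL (an assumption of this sketch)
    // makes creation fail if a parallel writer already holds the temporary.
    const int fd = ::open(tmp_path.c_str(), O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) {
        throw std::runtime_error("cannot create " + tmp_path + ": " + std::strerror(errno));
    }

    // Step 2: confirm the final file does not exist; otherwise undo step 1 and fail.
    if (::access(final_path.c_str(), F_OK) == 0) {
        ::close(fd);
        ::unlink(tmp_path.c_str());
        throw std::runtime_error(final_path + " already exists");
    }

    // Step 3: continue as usual (here: write the payload and close the file).
    const ssize_t written = ::write(fd, contents.data(), contents.size());
    const bool write_ok = written == static_cast<ssize_t>(contents.size());
    const bool close_ok = ::close(fd) == 0;
    if (!write_ok || !close_ok) {
        ::unlink(tmp_path.c_str());
        throw std::runtime_error("cannot write " + tmp_path);
    }

    // Step 4: publish the result by renaming the temporary to its final name.
    if (std::rename(tmp_path.c_str(), final_path.c_str()) != 0) {
        ::unlink(tmp_path.c_str());
        throw std::runtime_error("cannot rename " + tmp_path + ": " + std::strerror(errno));
    }
}
```

Calling the helper a second time with the same path throws at step 2 and removes the temporary it just created, so the existing file is left untouched. A concurrent second writer would instead fail at step 1, because the first writer keeps the temporary in place until its own rename.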


8493 Commits