scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 11:00:35 +00:00

Author	SHA1	Message	Date
Paweł Dziepak	7f17424a4e	Merge "Avoid losing changes to keyspace parameters of system_auth and tracing keyspaces" from Tomek "If a node is bootstrapped with auto_bootstrap disabled, it will not wait for schema sync before creating global keyspaces for auth and tracing. When such schema changes are then reconciled with schema on other nodes, they may overwrite changes made by the user before the node was started, because they will have a higher timestamp. To prevent that, let's use the minimum timestamp so that the default schema always loses to manual modifications. This is what Cassandra does. Fixes #2129." * tag 'tgrabiec/prevent-keyspace-metadata-loss-v1' of github.com:scylladb/seastar-dev: db: Create default auth and tracing keyspaces using lowest timestamp migration_manager: Append actual keyspace mutations with schema notifications (cherry picked from commit `6db6d25f66`)	2017-03-08 16:31:41 +02:00
Nadav Har'El	dd56f1bec7	sstable decompression: fix skip() to end of file The skip() implementation for the compressed file input stream incorrectly handled the case of skipping to the end of file: In that case we just need to update the file pointer, but not skip anywhere in the compressed disk file; In particular, we must NOT call locate() to find the relevant on-disk compressed chunk, because there is none - locate() can only be called on actual positions of bytes, not on the one-past-end-of-file position. Fixes #2143 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170308100057.23316-1-nyh@scylladb.com> (cherry picked from commit `506e074ba4`)	2017-03-08 12:35:39 +02:00
Pekka Enberg	5df61797d6	release: prepare for 1.7.rc1	2017-03-08 12:25:34 +02:00
Paweł Dziepak	b6db9e3d51	db: make do_apply_counter_update() propagate timeout to db_apply() db_apply() expects to be given a time point at which the request will time out. Originally, do_apply_counter_update() passed 0, which meant that all requests were timed out if do_apply() needed to wait. The caller of do_apply_counter_update() is already given a correct timeout time point so the only thing needed to fix this problem is to propagate it properly inside do_apply_counter_update() to the call to do_apply(). Fixes #2119. Message-Id: <20170307104405.5843-1-pdziepak@scylladb.com>	2017-03-07 12:44:11 +01:00
Gleb Natapov	f2595bea85	memtable: do not open code logalloc::reclaim_lock use logalloc::reclaim_lock prevents reclaim from running, which may cause regular allocation to fail although there is enough free memory. To solve that there is an allocation_section which acquires reclaim_lock and, if allocation fails, runs the reclaimer outside of the lock and retries the allocation. The patch makes use of allocation_section instead of direct use of reclaim_lock in memtable code. Fixes #2138. Message-Id: <20170306160050.GC5902@scylladb.com> (cherry picked from commit `d7bdf16a16`)	2017-03-07 11:16:15 +02:00
Gleb Natapov	e930ef0ee0	memtable: do not yield while holding reclaim_lock Holding reclaim_lock while yielding may cause memory allocations to fail. Fixes #2139 Message-Id: <20170306153151.GA5902@scylladb.com> (cherry picked from commit `5c4158daac`)	2017-03-06 18:35:46 +02:00
Takuya ASADA	4cf0f88724	dist/redhat: enables discard on CentOS/RHEL RAID0 Since the CentOS/RHEL raid module disables discard by default, we need to enable it again to use it. Fixes #2033 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1488407037-4795-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `6602221442`)	2017-03-06 12:22:17 +02:00
Avi Kivity	372f07b06e	Update scylla-ami submodule * dist/ami/files/scylla-ami d5a4397...eedd12f (3): > Rewrite disk discovery to handle EBS and NVMEs. > add --developer-mode option > trivial cleanup: replace tab in indent	2017-03-04 13:31:08 +02:00
Tomasz Grabiec	0ccc6630a8	db: Fix overflow of gc_clock time point If query_time is time_point::min(), which is used by to_data_query_result(), the result of subtraction of gc_grace_seconds() from query_time will overflow. I don't think this bug would currently have user-perceivable effects. This affects which tombstones are dropped, but in case of to_data_query_result() uses, tombstones are not present in the final data query result, and mutation_partition::do_compact() takes tombstones into consideration while compacting before expiring them. Fixes the following UBSAN report: /usr/include/c++/5.3.1/chrono:399:55: runtime error: signed integer overflow: -2147483648 - 604800 cannot be represented in type 'int' Message-Id: <1488385429-14276-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `4b6e77e97e`)	2017-03-01 18:50:19 +02:00
Takuya ASADA	b95a2338be	dist/debian/dep: fix broken link of gcc-5, update it to 5.4.1-5 Since gcc-5/stretch=5.4.1-2 was removed from the apt repository, we are no longer able to build gcc-5. To avoid the dead link, use launchpad.net archives instead of using apt-get source. Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1488189378-5607-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `ba323e2074`)	2017-03-01 17:13:42 +02:00
Tomasz Grabiec	f2d0ac9994	query: Fix invalid initialization of _memory_tracker by moving-from-self Fixes the following UBSAN warning: core/semaphore.hh:293:74: runtime error: reference binding to misaligned address 0x0000006c55d7 for type 'struct basic_semaphore', which requires 8 byte alignment Since the field was not initialized properly, probably also fixes some user-visible bug. Message-Id: <1488368222-32009-1-git-send-email-tgrabiec@scylladb.com> (cherry picked from commit `0c84f00b16`)	2017-03-01 11:56:49 +00:00
Gleb Natapov	56725de0db	sstable: close sstable_writer's file if writing of sstable fails. Failing to close a file properly before destroying file's object causes crashes. [tgrabiec: fixed typo] Fixes #2122. Message-Id: <20170221144858.GG11471@scylladb.com> (cherry picked from commit `0977f4fdf8`)	2017-02-28 11:04:26 +02:00
Avi Kivity	6f479c8999	Update seastar submodule * seastar b14373b...f391f9e (1): > fix append_challenged_posix_file_impl::process_queue() to handle recursion Fixes #2121.	2017-02-28 10:55:54 +02:00
Calle Wilund	8c0488bce9	messaging_service: Move log printout to actual listen start Fixes #1845 The log printout happened before we had actually evaluated which endpoint to create, and thus never included SSL info. Message-Id: <1487766738-27797-1-git-send-email-calle@scylladb.com> (cherry picked from commit `d5f57bd047`)	2017-02-23 13:18:33 +02:00
Avi Kivity	68dd11e275	config: enable new sharding algorithm for new deployments Set murmur3_partitioner_ignore_msb_bits to 12 (enabling the new sharding algorithm), but do this in scylla.yaml rather than the built-in defaults. This avoids changing the configuration for existing clusters, as their scylla.yaml file will not be updated during the upgrade. Message-Id: <20170214123253.3933-1-avi@scylladb.com> (cherry picked from commit `9b113ffd3e`)	2017-02-22 11:23:46 +01:00
Tomasz Grabiec	a64c53d05f	Update seastar submodule * seastar fc27cec...b14373b (1): > reactor utilization should return the utilization in 0-1 range	2017-02-22 09:38:17 +01:00
Paweł Dziepak	42e7a59cca	tests/cql_test_env: wait for storage service initialization Message-Id: <20170221121130.14064-1-pdziepak@scylladb.com> (cherry picked from commit `274bcd415a`)	2017-02-21 17:06:10 +02:00
Avi Kivity	2cd019ee47	Merge "Fixes for counter cell locking" from Paweł "This series contains some fixes and a unit test for the logic responsible for locking counter cells." * 'pdziepak/cell-locking-fixes/v1' of github.com:cloudius-systems/seastar-dev: tests: add test for counter cell locker cell_locking: fix schema upgrades cell_locker: make locker non-movable cell_locking: allow to be included by anyone (cherry picked from commit `b8c4b35b57`)	2017-02-15 17:37:38 +02:00
Takuya ASADA	bc8b553bec	dist/redhat: stop backporting ninja-build from Fedora, install it from EPEL instead ninja-build-1.6.0-2.fc23.src.rpm was deleted from the Fedora web site for some reason, but there is ninja-build-1.7.2-2 on EPEL, so we don't need to backport from Fedora anymore. Fixes #2087 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1487155729-13257-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `9c8515eeed`)	2017-02-15 12:58:44 +02:00
Avi Kivity	0ba98be899	Update seastar submodule * seastar bff963a...fc27cec (1): > collectd: send double correctly for gauge	2017-02-14 16:09:22 +02:00
Avi Kivity	d6899134a7	Update seastar submodule * seastar f07f8ed...bff963a (1): > prometheus: send one MetricFamily per unique metric name	2017-02-13 11:50:43 +02:00
Avi Kivity	5253031110	seastar: point submodule at scylla-seastar.git Allows backporting seastar patches independently of master.	2017-02-13 11:49:54 +02:00
Avi Kivity	a203c87f0d	Merge "Disallow mixed schemas" from Paweł "This series makes sure that schemas containing both counter and non-counter regular or static columns are not allowed." * 'pdziepak/disallow-mixed-schemas/v1' of github.com:cloudius-systems/seastar-dev: schema: verify that there are no both counter and non-counter columns test/mutation_source: specify whether to generate counter mutations tests/canonical_mutation: don't try to upgrade incompatible schemas (cherry picked from commit `9e4ae0763d`)	2017-02-07 18:04:24 +02:00
Gleb Natapov	37fc0e6840	storage_proxy: use storage_proxy clock instead of explicit lowres_clock Merge commit `45b6070832` used a butchered version of the storage_proxy patch to adjust to the rpc timer change instead of the one I sent. This patch fixes the differences. Message-Id: <20170206095237.GA7691@scylladb.com> (cherry picked from commit `3c372525ed`)	2017-02-06 12:51:52 +02:00
Avi Kivity	0429e5d8ea	cell_locking: work around for missing boost::container::small_vector small_vector doesn't exist on Ubuntu 14.04's boost, use std::vector instead. (cherry picked from commit `6e9e28d5a3`)	2017-02-05 20:49:43 +02:00
Avi Kivity	3c147437ac	dist: add build dependency on automake Needed by seastar's c-ares. (cherry picked from commit `2510b756fc`)	2017-02-05 20:17:27 +02:00
Takuya ASADA	e4b3f02286	dist/common/systemd: introduce scylla-housekeeping restart mode scylla-housekeeping needs to run in 'restart mode' to check the version during scylla-server restart, which wasn't called from the systemd timer, so add it. The existing scylla-housekeeping.timer is renamed to scylla-housekeeping-daily.timer, since it runs 'daily mode'. Fixes #1953 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Message-Id: <1486180031-18093-1-git-send-email-syuu@scylladb.com> (cherry picked from commit `e82932b774`)	2017-02-05 11:28:03 +02:00
Avi Kivity	5a8013e155	dist: add libtool build dependency for seastar/c-ares (cherry picked from commit `4175f40da1`)	2017-02-05 11:27:38 +02:00
Pekka Enberg	fdba5b8eac	release: prepare for 1.7.rc0	2017-02-04 11:04:32 +02:00
Paweł Dziepak	558a52802a	cell_locking: fix partition_entry::equal_compare The comparator constructor took schema by value instead of const l-ref and, consequently, later tried to access an object that had been destroyed long before. Message-Id: <20170202135853.8190-1-pdziepak@scylladb.com> (cherry picked from commit `37b0c71f1d`)	2017-02-03 21:28:42 +02:00
Avi Kivity	4f416c7272	Merge "Avoid avalanche of tasks after memtable flush" from Tomasz "Before, the logic for releasing writes blocked on dirty worked like this: 1) When region group size changes and it is not under pressure and there are some requests blocked, then schedule request releasing task 2) request releasing task, if no pressure, runs one request and if there are still blocked requests, schedules next request releasing task If requests don't change the size of the region group, then either some request executes or there is a request releasing task scheduled. The number of scheduled tasks is at most 1, there is a single releasing thread. However, if requests themselves would change the size of the group, then each such change would schedule yet another request releasing thread, growing the task queue size by one. The group size can also change when memory is reclaimed from the groups (e.g. when it contains sparse segments). Compaction may start many request releasing threads due to group size updates. Such behavior is detrimental for performance and stability if there are a lot of blocked requests. This can happen on 1.5 even with modest concurrency because timed out requests stay in the queue. This is less likely on 1.6 where they are dropped from the queue. The releasing of tasks may start to dominate over other processes in the system. When the number of scheduled tasks reaches 1000, polling stops and the server becomes unresponsive until all of the released requests are done, which is either when they start to block on dirty memory again or run out of blocked requests. It may take a while to reach pressure condition after memtable flush if it brings virtual dirty much below the threshold, which is currently the case for workloads with overwrites producing sparse regions. I saw this happening in a write workload from issue #2021 where the number of request releasing threads grew into thousands.
Fix by ensuring there is at most one request releasing thread at a time. There will be one releasing fiber per region group which is woken up when pressure is lifted. It executes blocked requests until pressure occurs." * tag 'tgrabiec/lsa-single-threaded-releasing-v2' of github.com:cloudius-systems/seastar-dev: tests: lsa: Add test for reclaimer starting and stopping tests: lsa: Add request releasing stress test lsa: Avoid avalanche releasing of requests lsa: Move definitions to .cc lsa: Simplify hard pressure notification management lsa: Do not start or stop reclaiming on hard pressure tests: lsa: Adjust to take into account that reclaimers are run synchronously lsa: Document and annotate reclaimer notification callbacks tests: lsa: Use with_timeout() in quiesce() (cherry picked from commit `7a00dd6985`)	2017-02-03 09:47:50 +01:00
Paweł Dziepak	788892e931	counters: fix build failure on gcc5 Message-Id: <20170202132049.4497-1-pdziepak@scylladb.com>	2017-02-02 14:23:49 +01:00
Piotr Jastrzebski	36b2c4df19	row_cache_test: extend test_mvcc Make the test execute with and without an active reader to memtable that's flushed to cache. This improves the code coverage of MVCC with tests. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <007b6cd1ba7a84ea5675ea82e454bf1adf3b3330.1485954941.git.piotr@scylladb.com>	2017-02-02 13:51:32 +01:00
Tomasz Grabiec	5458a32f13	gdb: Introduce commands for inspecting pending task queue Message-Id: <1485426236-6627-1-git-send-email-tgrabiec@scylladb.com>	2017-02-02 13:15:17 +02:00
Avi Kivity	000edc36c4	Merge "Counters" from Paweł "This series introduces support for counters. The implementation of counters more or less follows the design described on our wiki page [1]. Counter cells contain many shards with replicas being able to modify and announce new versions only of the shards that they own. Historically, there were three types of shards: local, remote and global. In these patches only support for the global ones is added. [1] https://github.com/scylladb/scylla/wiki/Counters Currently, counters are only enabled as experimental features as there are still several things that need to be done before they become production ready. Namely, the performance is expected to be quite poor (especially for writes), there is no proper tracing support and timed out counter requests may not be recognized and dropped early. There are also no counter-related metrics. However, apart from these problems there are no other missing parts of counter implementation and they are expected to work correctly. Fixes #577."
* 'pdziepak/counters/v3-rebased' of github.com:cloudius-systems/seastar-dev: (38 commits) perf_simple_query: add counter tables tests thrift: add support for counter operations cql3: allow counters in CREATE TABLE statements cql3: selection: do not panic when seeing counters storage_proxy: support counter updates storage_proxy: add get_live_endpoints() cql3: add counter increment and decrement operations db: add operations for applying counter updates counters: implement transforming counter deltas to shards add infrastructure for locking counter cells add fnv1a hasher position_in_partition: add feed_hash() position_in_partition: add functions for querying object type types: make counter_type_impl report its cql3_type transport: encode counters as long_type mutation_partition: make for_each_cell() accessible outside source file messaging_service: add COUNTER_MUTATION verb storage_service: add COUNTERS feature idl: add idl description of consistency level schema: make is_counter() return correct value ...	2017-02-02 12:40:09 +02:00
Paweł Dziepak	8671d8329d	perf_simple_query: add counter tables tests	2017-02-02 10:35:14 +00:00
Paweł Dziepak	4ca7f0a491	thrift: add support for counter operations	2017-02-02 10:35:14 +00:00
Paweł Dziepak	fa29ef3cc0	cql3: allow counters in CREATE TABLE statements	2017-02-02 10:35:14 +00:00
Paweł Dziepak	fce6e0987f	cql3: selection: do not panic when seeing counters At this stage counters cells are already long_type values, so no special handling is necessary.	2017-02-02 10:35:14 +00:00
Paweł Dziepak	1e8814f5ce	storage_proxy: support counter updates	2017-02-02 10:35:14 +00:00
Paweł Dziepak	c14c6b753b	storage_proxy: add get_live_endpoints()	2017-02-02 10:35:14 +00:00
Paweł Dziepak	d6ebf84edf	cql3: add counter increment and decrement operations	2017-02-02 10:35:14 +00:00
Paweł Dziepak	5a0955e89d	db: add operations for applying counter updates	2017-02-02 10:35:14 +00:00
Paweł Dziepak	8d889082bf	counters: implement transforming counter deltas to shards The leader receives counter updates as deltas which have to be transformed to counter shards. In order to do that, current local shard of the modified counter cell needs to be read, logical clock incremented and the value modified by the specified delta.	2017-02-02 10:35:14 +00:00
Paweł Dziepak	55277b3182	add infrastructure for locking counter cells The leader receives counter updates in the form of deltas which need to be transformed to counter shards. In order to do that the node needs to read its current state of the modified counter cells. Since this is essentially a read-modify-write operation an appropriate locking mechanism is needed. Counter cell locker introduced in this patch uses a hashtable of partition entries, each containing a hashtable of cell entries. Inside a cell entry there is a semaphore used for synchronization. Once no longer needed, cell entries and partition entries are removed. In order to avoid deadlocks cell entries are always locked in the same order which is the lexicographical order of (clustering key, column id) pairs. Note that schema changes are not a difficulty since they do not make it possible to change ordering of such pairs.	2017-02-02 10:35:14 +00:00
Paweł Dziepak	22fbb11f90	add fnv1a hasher	2017-02-02 10:35:14 +00:00
Paweł Dziepak	a16761dcb4	position_in_partition: add feed_hash()	2017-02-02 10:35:14 +00:00
Paweł Dziepak	f4fce93807	position_in_partition: add functions for querying object type	2017-02-02 10:35:14 +00:00
Paweł Dziepak	53d9a6f220	types: make counter_type_impl report its cql3_type	2017-02-02 10:35:14 +00:00
Paweł Dziepak	a805bea97a	transport: encode counters as long_type For the purposes of CQL counters are long values (either a delta in case of writes or the final value for reads).	2017-02-02 10:35:14 +00:00
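
Note on commit 0ccc6630a8 above: the UBSAN report it quotes (-2147483648 - 604800 cannot be represented in type 'int') arises when a grace period is subtracted from the smallest representable time point. The following is a minimal sketch of one way to avoid that kind of overflow, and is not necessarily how the commit itself fixed it. The names sketch_clock and saturating_sub are hypothetical, and the 32-bit second-resolution clock is an assumption made only to reproduce the arithmetic from the report.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a clock whose time points count seconds in a
// 32-bit signed integer (an assumption made for this sketch only).
struct sketch_clock {
    using rep = std::int32_t;
    using period = std::ratio<1>;
    using duration = std::chrono::duration<rep, period>;
    using time_point = std::chrono::time_point<sketch_clock, duration>;
};

// Subtract d from t, clamping the result to the representable range instead
// of overflowing. The arithmetic is done in 64 bits, where it cannot overflow
// for 32-bit inputs.
inline sketch_clock::time_point saturating_sub(sketch_clock::time_point t,
                                               sketch_clock::duration d) {
    using limits = std::numeric_limits<sketch_clock::rep>;
    std::int64_t r = std::int64_t{t.time_since_epoch().count()}
                   - std::int64_t{d.count()};
    if (r < limits::min()) {
        r = limits::min();
    }
    if (r > limits::max()) {
        r = limits::max();
    }
    return sketch_clock::time_point(
        sketch_clock::duration(static_cast<sketch_clock::rep>(r)));
}
```

With this helper, subtracting 604800 seconds from sketch_clock::time_point::min() yields time_point::min() again rather than undefined behaviour, while ordinary subtractions are unaffected.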


11293 Commits