scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Author	SHA1	Message	Date
Glauber Costa	2a7aa1f0d8	sstables: avoid asserts It's great to have statistics, but assert is too big of a hammer. We don't need to crash due to the lack of it, and can try our best to continue. We currently have a problem (described in 265), in which we, for some reason, fail to read the Statistics file. Throwing an exception will still cause us to fail to boot, but at least it will be more informative. Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>	2015-09-08 10:06:05 +03:00
Avi Kivity	e3e13878d1	Merge "Fix storage_service and gossip API" from Asias	2015-09-08 10:05:16 +03:00
Avi Kivity	6d0a2b5075	logalloc: don't invalidate merged region A region being merged can still be in use; but after merging, compaction_lock and the reclaim counter will no longer work. This can lead to use-after-compact-without-re-lookup errors. Fix by making the source region be the same as the target region; they will share compaction locks and reclaim counters, so lookup avoidance will still work correctly. Fixes #286.	2015-09-08 08:55:44 +02:00
Asias He	89f2959536	gossip: Rework stop() and shutdown() Consolidate stop() and shutdown() into one function. Fix crash: scylla: urchin/seastar/core/future.hh:315: void future_state<>::set(): Assertion `_u.st == state::future' failed. === stop gossip $ curl -X DELETE --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/gossiping" === start gossip $ curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/gossiping"	2015-09-08 12:20:53 +08:00
Asias He	247e9109d9	gossip: Introduce uninit_messaging_service_handler It is useful in gossip shutdown process.	2015-09-08 12:19:06 +08:00
Asias He	312daed342	storage_service: Fix is_starting API Query _operation_mode on CPU 0. $ curl -X GET --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/is_starting"	2015-09-08 11:07:13 +08:00
Asias He	5e3d8a56b2	storage_service: Fix get_operation_mode API Route request to CPU 0. _operation_mode is not replicated to other CPUS. Without this: $ curl -X GET --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/operation_mode" returns "NORMAL" and "STARTING" randomly.	2015-09-08 10:55:50 +08:00
Asias He	0d88570286	storage_service: Fix is_gossip_running API and friends Only cpu 0 instance of gossip has the correct information, route request to cpu 0. Fix a bug where $ curl -X GET --header "Accept: application/json" "http://172.31.5.77:10000/storage_service/gossiping" returns true and false randomly.	2015-09-08 10:45:25 +08:00
Calle Wilund	d614143f5e	Commitlog/database: Fixup series "Commit log flush request on disk overflow" Also at seastar-dev: calle/commitlog_flush_v3 (And, yes, this time I _did_ update the remote!) Refs #262 Commit of original series was done on stale version (v2) due to author's inability to multitask and update git repos. v3: * Removed future<> return value from callbacks. I.e. flush callback is now only fully synchronous over actual call	2015-09-07 21:29:19 +03:00
Gleb Natapov	0149a22f69	storage_proxy: use parallel_for_each in mutate() instead of semaphore If several mutations in a batch throw exceptions, have_cl.broken() will be called more than once. Fix this by dropping the ad hoc have_cl and using parallel_for_each(), which does the same thing the current code does. Fixes #297	2015-09-07 19:29:34 +03:00
Tomasz Grabiec	52828c2e84	test.py: Do not run release-mode only tests if release mode not selected	2015-09-07 19:27:33 +03:00
Avi Kivity	dee9060b12	Merge "Commit log flush request on disk overflow" from Calle "Fixes #262 Handles CL disk size exceeding configured max size by calling flush handlers for each dirty CF id / high replay_position mark. (Instead of uncontrolled delete as previously). * Increased default max disk size to 8GB. Same as Origin/scylla.yaml (so no real change, but synced). * Divide the max disk size by cpus (so sum of all shards == max) * Abstract flush callbacks in CL * Handler in DB that initiates memtable->sstable writes when called. Note that the flush request is done "synchronously" in new_segment() (i.e. when getting a new segment and crossing threshold). This is however more or less congruent with Origin, which will do a request-sync in the corresponding case. Actual dealing with the request should at least in production code however be done async, and in DB it is, i.e. we initiate sstable writes. Hopefully they finish soon, and CL segments will be released (before next segment is allocated). If the flush request does _not_ eventually result in any CF:s becoming clean and segments released we could potentially be issuing flushes repeatedly, but never more often than on every new segment."	2015-09-07 18:46:48 +03:00
Tomasz Grabiec	fecc87e601	lsa: stub allocation_section with default allocator memory::stats() always returns 0 as free memory which confuses guard::enter().	2015-09-07 17:23:02 +02:00
Gleb Natapov	da242146b6	do not pass storage_proxy reference across cpus storage_proxy instances are per cpu, so they cannot be passed around to other cpus.	2015-09-07 17:16:29 +02:00
Paweł Dziepak	03f5827570	logalloc: add missing methods to DEFAULT_ALLOCATOR version Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-09-07 16:59:27 +02:00
Paweł Dziepak	ac602b13b5	tests: fix signed/unsigned comparison Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>	2015-09-07 16:41:00 +02:00
Avi Kivity	e37dfab853	Merge "Stability improvements" from Tomasz "Fixes #259 and other problems found along the way."	2015-09-07 16:45:44 +03:00
Gleb Natapov	327e27b67b	storage_proxy: stop query timeout timer when all replies are received Fixes #285	2015-09-07 15:21:51 +02:00
Gleb Natapov	41f16159b3	storage_proxy: track reference to storage_proxy during mutate/query operations This patch makes sure that storage_proxy cannot be deleted while mutate/query operation is in progress.	2015-09-07 14:46:13 +02:00
Gleb Natapov	f51f5c819e	messaging: Add unregister function for verbs used by storage proxy	2015-09-07 14:46:13 +02:00
Gleb Natapov	b884aba147	storage_proxy: drop superfluous captures	2015-09-07 14:46:13 +02:00
Gleb Natapov	5af2d18b6f	storage_proxy: change storage_proxy::mutate_locally to use do_with	2015-09-07 14:46:13 +02:00
Takuya ASADA	8fa868d4e9	dist: use mock rpm instead of rpmbuild Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>	2015-09-07 15:41:13 +03:00
Takuya ASADA	45502b7110	dist: Dynamically configure scylla.yaml on EC2 instance Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>	2015-09-07 15:41:13 +03:00
Takuya ASADA	a5dcc39494	dist: fix scylla_run to specify '--network-mode posix' when NETWORK_MODE is posix Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>	2015-09-07 15:40:08 +03:00
Tomasz Grabiec	1f3d7aa78c	Bump up seastar submodule head Changes: tests: improve rpc timeout test rpc: add unregister_handler future: Fix assertion failure in case schedule() throws rpc: fix rpc timeout build: disable -fsanitize=vptr if it is available and broken	2015-09-07 14:04:53 +02:00
Calle Wilund	380649eb66	Database: Add commitlog flush handler to switch memtables to disk Initiates flushing of CF:s to sstable on CL disk overflow (flush req)	2015-09-07 13:21:46 +02:00
Calle Wilund	fdb921afb2	Commitlog: Add flushing of segment CF:s on disk overflow * Do not throw away commitlog segments on disk size overflow. Issue a flush request (i.e. calculate RP we want to free unto, and for all dirty CF:s, do a request). "Abstracted" as registerable callback. I.e. DB:s responsibility to actually do something with it.	2015-09-07 13:21:43 +02:00
Calle Wilund	31f2dcb342	Config: change commitlog max size on disk to be in sync with scylla.yaml	2015-09-07 13:13:51 +02:00
Calle Wilund	841dd32a8a	Commitlog: divide max on-disk-size by num cpus To try to keep the resulting limit as configured	2015-09-07 13:13:46 +02:00
Asias He	f89a25562c	storage_service: Fix is_auto_bootstrap Get the value from cfg option.	2015-09-07 12:53:58 +03:00
Raphael S. Carvalho	1157e6f119	sstable: use desired buffer size in write_simple Currently, we use a 128k buffer for creation of data and index files, but for other components we use a 4k buffer size. Let's also use a 128k buffer for the other components. Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>	2015-09-07 12:52:26 +03:00
Tomasz Grabiec	433a298f60	row_cache: Extract comparator construction before the loop	2015-09-07 09:41:36 +02:00
Tomasz Grabiec	bf6062493e	tests: Introduce tests/perf_row_cache_update	2015-09-07 09:41:36 +02:00
Tomasz Grabiec	10453c71d2	tests: perf: Make iterations between clock readings in time_it() configurable	2015-09-07 09:41:36 +02:00
Tomasz Grabiec	74603425ac	mutation_partition: Introduce r-value version of apply()	2015-09-07 09:41:36 +02:00
Asias He	7cc768a864	gossip: Fix wrong cluster name and partitioner name Right now, gossip returns hard coded cluster and partitioner name. sstring get_cluster_name() { // FIXME: DatabaseDescriptor.getClusterName() return "my_cluster_name"; } sstring get_partitioner_name() { // FIXME: DatabaseDescriptor.getPartitionerName() return "my_partitioner_name"; } Fix it by setting the correct name from configure option. With this cqlsh 127.0.0.$i -e "SELECT * from system.local;" returns correct cluster_name. Fixes #291	2015-09-07 09:21:18 +03:00
Tomasz Grabiec	8a140a9ba9	mutation_partition: row: Move implementation to source file	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	bf2b64d3f7	mutation_partition: row: Fix operator==() Spotted by clion inspections.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	ba35788817	mutation_partition: De-templatize methods Instead of accepting a column resolver callable, accept a schema and column_kind or column_selector. Makes the interface easier to use and enables us to move implementation to .cc file.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	49bf844418	tests: Introduce row_cache_alloc_stress Tests stability of row_cache operations under low/fragmented memory.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	3b441416fa	lsa: Make segment size publicly accessible Some tests depend on segment size.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	49f094ad5f	tests: Add test for row_cache::update()	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	122bd8ea46	row_cache: Restore indentation	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	d1f89b4eab	row_cache: Use allocation_section See #259. When transferring mutations between memtable and cache, lsa sometimes runs out of memory. This solves the first two points, keeping reserve filled up and adjusting the amount of reserve based on execution history.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	7efcde12aa	row_cache: Introduce row_cache::touch() Useful in tests for ensuring that certain entries survive eviction.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	24a5221280	row_cache: Avoid leaking of partitions when exception is thrown inside update()	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	1d182903cd	mutation_partition: Document exception guarantees of apply()	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	c82325a76c	lsa: Make region evictor signal forward progress In some cases region may be in a state where it is not empty and nothing could be evicted from it. For example when creating the first entry, reclaimer may get invoked during creation before it gets linked. We therefore can't rely on emptiness as a stop condition for reclamation, the eviction function shall signal us if it made forward progress.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	94f0db933f	lsa: Fix typo in the word 'emergency'	2015-09-06 21:24:59 +02:00
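
The commitlog commits above (841dd32a8a, fdb921afb2, dee9060b12) describe dividing the configured maximum on-disk size by the number of CPUs and requesting a flush when a shard crosses its share. A minimal sketch of that arithmetic in plain C++ follows. The names `per_shard_commitlog_limit` and `should_request_flush` are hypothetical and are not taken from the Scylla source, and the "8GB" default is assumed here to mean 8 GiB.

```cpp
#include <cstdint>
#include <stdexcept>

// Assumption: the "8GB" default from the commit message, read as 8 GiB.
constexpr std::uint64_t default_total_limit_bytes = 8ULL * 1024 * 1024 * 1024;

// Hypothetical helper: split the configured total evenly across shards so
// that the sum of all per-shard limits does not exceed the configured total.
std::uint64_t per_shard_commitlog_limit(std::uint64_t total_limit_bytes,
                                        unsigned shard_count) {
    if (shard_count == 0) {
        throw std::invalid_argument("shard_count must be positive");
    }
    return total_limit_bytes / shard_count;
}

// Hypothetical check made when a new segment is allocated: once the shard's
// on-disk usage exceeds its share, ask for a flush instead of deleting segments.
bool should_request_flush(std::uint64_t shard_bytes_on_disk,
                          std::uint64_t shard_limit_bytes) {
    return shard_bytes_on_disk > shard_limit_bytes;
}
```

With four shards, each shard would get a quarter of the total. Integer division rounds down, so the per-shard limits never sum to more than the configured maximum.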
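
Commit 2a7aa1f0d8 ("sstables: avoid asserts") argues for throwing an informative exception instead of asserting when the Statistics file cannot be read. The general pattern can be sketched as below. The types `statistics` and `missing_component_error` and the function `require_statistics` are hypothetical stand-ins for illustration, not the identifiers used in the actual patch.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a parsed Statistics component.
struct statistics {
    long estimated_rows = 0;
};

// Hypothetical exception type; carries a message naming the affected sstable.
struct missing_component_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Instead of assert(stats), which would abort the process with no context,
// throw an exception that says which sstable is affected. The caller can
// still refuse to boot, but the failure is reported with a useful message.
const statistics& require_statistics(const std::optional<statistics>& stats,
                                     const std::string& sstable_name) {
    if (!stats) {
        throw missing_component_error(
            "cannot read Statistics component of sstable " + sstable_name);
    }
    return *stats;
}
```

A caller that cannot continue without statistics can let the exception propagate, while one that can degrade gracefully may catch it and log the message.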


6315 Commits