6310 Commits

Author SHA1 Message Date
Asias He
312daed342 storage_service: Fix is_starting API
Query _operation_mode on CPU 0.

$ curl -X GET --header "Accept: application/json"
"http://127.0.0.1:10000/storage_service/is_starting"
2015-09-08 11:07:13 +08:00
Asias He
5e3d8a56b2 storage_service: Fix get_operation_mode API
Route request to CPU 0. _operation_mode is not replicated to other CPUs.

Without this:

$ curl -X GET --header "Accept: application/json"
"http://127.0.0.1:10000/storage_service/operation_mode"

returns "NORMAL" and "STARTING" randomly.
2015-09-08 10:55:50 +08:00
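A minimal sketch of the problem fixed in 5e3d8a56b2, in plain C++ without Seastar. The `toy_sharded_service` class and its methods are hypothetical and only model per-CPU copies of a value that is updated on CPU 0 alone; they are not the storage_service API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class op_mode { starting, normal };

// Hypothetical model: one copy of the mode per CPU, but only CPU 0 is ever updated.
class toy_sharded_service {
    std::vector<op_mode> _mode;
public:
    explicit toy_sharded_service(std::size_t shards) : _mode(shards, op_mode::starting) {
        assert(shards > 0);
    }

    // The mode changes on CPU 0 only and is not replicated.
    void set_mode(op_mode m) { _mode[0] = m; }

    // Buggy read: answers from whichever CPU happened to handle the request.
    op_mode get_mode_local(std::size_t handling_shard) const { return _mode.at(handling_shard); }

    // Fixed read: always answers from CPU 0, regardless of the handling CPU.
    op_mode get_mode_routed(std::size_t /*handling_shard*/) const { return _mode[0]; }
};
```

With four CPUs and the mode set to normal, `get_mode_local(3)` still reports starting while `get_mode_routed(3)` reports normal, which matches the random answers described above.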
Asias He
0d88570286 storage_service: Fix is_gossip_running API and friends
Only the CPU 0 instance of gossip has the correct information, so route the
request to CPU 0.

Fix a bug where

$ curl -X GET --header "Accept: application/json"
 "http://172.31.5.77:10000/storage_service/gossiping"

returns true and false randomly.
2015-09-08 10:45:25 +08:00
Calle Wilund
d614143f5e Commitlog/database: Fixup series "Commit log flush request on disk overflow"
Also at seastar-dev: calle/commitlog_flush_v3
(And, yes, this time I _did_ update the remote!)

Refs #262

Commit of original series was done on stale version (v2) due to author's
inability to multitask and update git repos.

v3:
* Removed future<> return value from callbacks. I.e. flush callback is now
  only fully synchronous over actual call
2015-09-07 21:29:19 +03:00
Gleb Natapov
0149a22f69 storage_proxy: use parallel_for_each in mutate() instead of semaphore
If several mutations in a batch throw exceptions, have_cl.broken() will be
called more than once. Fix this by dropping the ad hoc have_cl and using
parallel_for_each(), which does the same thing the current code is doing.

Fixes #297
2015-09-07 19:29:34 +03:00
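A plain C++ analogue of the behaviour 0149a22f69 relies on: every item is attempted, and a failure is reported exactly once no matter how many items throw. This sketch is sequential and does not use Seastar; `for_each_report_once` is a hypothetical name, not the parallel_for_each() implementation.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <vector>

// Run every task. Remember only the first failure and rethrow it once,
// after all tasks have been attempted.
inline void for_each_report_once(const std::vector<std::function<void()>>& tasks) {
    std::exception_ptr first_error;
    for (const auto& task : tasks) {
        try {
            task();
        } catch (...) {
            if (!first_error) {
                first_error = std::current_exception();
            }
        }
    }
    if (first_error) {
        std::rethrow_exception(first_error);
    }
}
```

The caller sees a single exception even when several tasks fail, so there is no shared object that could be marked broken twice.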
Tomasz Grabiec
52828c2e84 test.py: Do not run release-mode only tests if release mode not selected
2015-09-07 19:27:33 +03:00
Avi Kivity
dee9060b12 Merge "Commit log flush request on disk overflow" from Calle
"Fixes #262

Handles CL disk size exceeding configured max size by calling flush handlers
for each dirty CF id / high replay_position mark. (Instead of uncontrolled
delete as previously).

* Increased default max disk size to 8GB. Same as Origin/scylla.yaml (so no
   real change, but synced).
* Divide the max disk size by cpus (so sum of all shards == max)
* Abstract flush callbacks in CL
* Handler in DB that initiates memtable->sstable writes when called.

Note that the flush request is done "synchronously" in new_segment() (i.e.
when getting a new segment and crossing threshold). This is however more or
less congruent with Origin, which will do a request-sync in the corresponding
case.
Actual dealing with the request should at least in production code however be
done async, and in DB it is, i.e. we initiate sstable writes. Hopefully
they finish soon, and CL segments will be released (before next segment is
allocated).

If the flush request does _not_ eventually result in any CF:s becoming
clean and segments released we could potentially be issuing flushes
repeatedly, but never more often than on every new segment."
2015-09-07 18:46:48 +03:00
Tomasz Grabiec
fecc87e601 lsa: stub allocation_section with default allocator
memory::stats() always returns 0 as free memory which confuses
guard::enter().
2015-09-07 17:23:02 +02:00
Gleb Natapov
da242146b6 do not pass storage_proxy reference across cpus
storage_proxy instances are per cpu, so they cannot be passed around to
other cpus.
2015-09-07 17:16:29 +02:00
Paweł Dziepak
03f5827570 logalloc: add missing methods to DEFAULT_ALLOCATOR version
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-07 16:59:27 +02:00
Paweł Dziepak
ac602b13b5 tests: fix signed/unsigned comparison
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-07 16:41:00 +02:00
Avi Kivity
e37dfab853 Merge "Stability improvements" from Tomasz
"Fixes #259 and other problems found along the way."
2015-09-07 16:45:44 +03:00
Gleb Natapov
327e27b67b storage_proxy: stop query timeout timer when all replies are received
Fixes #285
2015-09-07 15:21:51 +02:00
Gleb Natapov
41f16159b3 storage_proxy: track reference to storage_proxy during mutate/query operations
This patch makes sure that storage_proxy cannot be deleted while
mutate/query operation is in progress.
2015-09-07 14:46:13 +02:00
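One common way to get the guarantee described in 41f16159b3 is for each in-flight operation to hold a shared reference to its owner. The sketch below assumes that approach; `toy_proxy` and `start_mutate` are hypothetical names and the commit may implement the tracking differently.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical owner object whose operations may outlive the caller's handle.
class toy_proxy : public std::enable_shared_from_this<toy_proxy> {
    int _applied = 0;
public:
    // Returns a deferred operation. The captured shared_ptr keeps the proxy
    // alive until the operation object itself is destroyed.
    std::function<int()> start_mutate() {
        auto self = shared_from_this();
        return [self] { return ++self->_applied; };
    }
};
```

If the caller drops its own `std::shared_ptr<toy_proxy>` while the returned operation is still pending, the proxy is destroyed only after that operation is released.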
Gleb Natapov
f51f5c819e messaging: Add unregister function for verbs used by storage proxy
2015-09-07 14:46:13 +02:00
Gleb Natapov
b884aba147 storage_proxy: drop superfluous captures
2015-09-07 14:46:13 +02:00
Gleb Natapov
5af2d18b6f storage_proxy: change storage_proxy::mutate_locally to use do_with
2015-09-07 14:46:13 +02:00
Takuya ASADA
8fa868d4e9 dist: use mock rpm instead of rpmbuild
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-07 15:41:13 +03:00
Takuya ASADA
45502b7110 dist: Dynamically configure scylla.yaml on EC2 instance
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-07 15:41:13 +03:00
Takuya ASADA
a5dcc39494 dist: fix scylla_run to specify '--network-mode posix' when NETWORK_MODE is posix
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-07 15:40:08 +03:00
Tomasz Grabiec
1f3d7aa78c Bump up seastar submodule head
Changes:

    tests: improve rpc timeout test
    rpc: add unregister_handler
    future: Fix assertion failure in case schedule() throws
    rpc: fix rpc timeout
    build: disable -fsanitize=vptr if it is available and broken
2015-09-07 14:04:53 +02:00
Calle Wilund
380649eb66 Database: Add commitlog flush handler to switch memtables to disk
Initiates flushing of CF:s to sstable on CL disk overflow (flush req)
2015-09-07 13:21:46 +02:00
Calle Wilund
fdb921afb2 Commitlog: Add flushing of segment CF:s on disk overflow
* Do not throw away commitlog segments on disk size overflow.
  Issue a flush request (i.e. calculate RP we want to free up to,
  and for all dirty CF:s, do a request).
  "Abstracted" as registerable callback. I.e. DB:s responsibility
  to actually do something with it.
2015-09-07 13:21:43 +02:00
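A rough sketch of the mechanism in fdb921afb2: when allocating a new segment pushes disk usage over the limit, registered handlers are asked to flush each dirty CF instead of segments being deleted. All names here (`toy_commitlog`, `add_flush_handler`, `mark_clean`) are hypothetical, and the replay position is reduced to a simple counter.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <utility>
#include <vector>

using cf_id = int;
using replay_position = std::uint64_t;
// Synchronous callback; what to do with the request is up to the owner.
using flush_handler = std::function<void(cf_id, replay_position)>;

class toy_commitlog {
    std::uint64_t _max_disk_size;
    std::uint64_t _segment_size;
    std::uint64_t _disk_usage = 0;
    replay_position _high_mark = 0;
    std::set<cf_id> _dirty;
    std::vector<flush_handler> _handlers;
public:
    toy_commitlog(std::uint64_t max_disk_size, std::uint64_t segment_size)
        : _max_disk_size(max_disk_size), _segment_size(segment_size) {
        assert(segment_size > 0);
    }

    void add_flush_handler(flush_handler h) { _handlers.push_back(std::move(h)); }

    // Record a write; the CF stays dirty until the owner reports it clean.
    replay_position write(cf_id cf) {
        _dirty.insert(cf);
        return ++_high_mark;
    }

    void mark_clean(cf_id cf) { _dirty.erase(cf); }

    // On overflow, request a flush for every dirty CF up to the current high mark.
    void new_segment() {
        _disk_usage += _segment_size;
        if (_disk_usage <= _max_disk_size) {
            return;
        }
        // Copy first, so a handler may call mark_clean() safely.
        const std::vector<cf_id> dirty(_dirty.begin(), _dirty.end());
        for (cf_id cf : dirty) {
            for (const auto& h : _handlers) {
                h(cf, _high_mark);
            }
        }
    }
};
```

In this model the handler only receives the request; releasing segments once the flush completes is left out.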
Calle Wilund
31f2dcb342 Config: change commitlog max size on disk to be in sync with scylla.yaml
2015-09-07 13:13:51 +02:00
Calle Wilund
841dd32a8a Commitlog: divide max on-disk-size by num cpus
To try to keep the resulting limit as configured
2015-09-07 13:13:46 +02:00
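The arithmetic behind 841dd32a8a, as a small sketch. The helper name is hypothetical; the only assumption is an even integer split, so the per-CPU limits never sum to more than the configured total.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: split a node-wide commitlog disk budget evenly across CPUs.
// Integer division rounds down, so the sum of all shares stays within the total.
inline std::uint64_t per_cpu_commitlog_limit(std::uint64_t total_bytes, unsigned cpu_count) {
    assert(cpu_count > 0);
    return total_bytes / cpu_count;
}
```

For an 8GB budget (`8ull << 30` bytes) on 4 CPUs, each CPU gets 2GB.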
Asias He
f89a25562c storage_service: Fix is_auto_bootstrap
Get the value from cfg option.
2015-09-07 12:53:58 +03:00
Raphael S. Carvalho
1157e6f119 sstable: use desired buffer size in write_simple
Currently, we use a 128k buffer for creation of data and index
files, but for other components we use a 4k buffer size.
Let's also use a 128k buffer for the other components.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-07 12:52:26 +03:00
Tomasz Grabiec
433a298f60 row_cache: Extract comparator construction before the loop
2015-09-07 09:41:36 +02:00
Tomasz Grabiec
bf6062493e tests: Introduce tests/perf_row_cache_update
2015-09-07 09:41:36 +02:00
Tomasz Grabiec
10453c71d2 tests: perf: Make iterations between clock readings in time_it() configurable
2015-09-07 09:41:36 +02:00
Tomasz Grabiec
74603425ac mutation_partition: Introduce r-value version of apply()
2015-09-07 09:41:36 +02:00
Asias He
7cc768a864 gossip: Fix wrong cluster name and partitioner name
Right now, gossip returns hard-coded cluster and partitioner names.

  sstring get_cluster_name() {
      // FIXME: DatabaseDescriptor.getClusterName()
      return "my_cluster_name";
  }
  sstring get_partitioner_name() {
      // FIXME: DatabaseDescriptor.getPartitionerName()
      return "my_partitioner_name";
  }

Fix it by setting the correct name from the configuration option.

With this

   cqlsh 127.0.0.$i -e "SELECT * from system.local;"

returns correct cluster_name.

Fixes #291
2015-09-07 09:21:18 +03:00
Tomasz Grabiec
8a140a9ba9 mutation_partition: row: Move implementation to source file
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
bf2b64d3f7 mutation_partition: row: Fix operator==()
Spotted by clion inspections.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
ba35788817 mutation_partition: De-templatize methods
Instead of accepting a column resolver callable, accept a schema and
column_kind or column_selector. Makes the interface easier to use and
enables us to move implementation to .cc file.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
49bf844418 tests: Introduce row_cache_alloc_stress
Tests stability of row_cache operations under low/fragmented memory.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
3b441416fa lsa: Make segment size publicly accessible
Some tests depend on segment size.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
49f094ad5f tests: Add test for row_cache::update()
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
122bd8ea46 row_cache: Restore indentation
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
d1f89b4eab row_cache: Use allocation_section
See #259.

When transferring mutations between memtable and cache, lsa sometimes
runs out of memory. This solves the first two points, keeping reserve
filled up and adjusting the amount of reserve based on execution
history.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
7efcde12aa row_cache: Introduce row_cache::touch()
Useful in tests for ensuring that certain entries survive eviction.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
24a5221280 row_cache: Avoid leaking of partitions when exception is thrown inside update()
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
1d182903cd mutation_partition: Document exception guarantees of apply()
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
c82325a76c lsa: Make region evictor signal forward progress
In some cases a region may be in a state where it is not empty and
nothing could be evicted from it. For example when creating the first
entry, the reclaimer may get invoked during creation before it gets
linked. We therefore can't rely on emptiness as a stop condition for
reclamation; the eviction function shall signal us if it made forward
progress.
2015-09-06 21:25:44 +02:00
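The stop condition described in c82325a76c can be sketched as follows. `toy_region` and `reclaim` are hypothetical; the point is only that the loop ends when the evictor reports no progress, not when the region is empty.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical region: linked entries can be evicted, entries still under
// construction (not yet linked) cannot.
struct toy_region {
    std::size_t linked = 0;
    std::size_t unlinked = 0;

    bool empty() const { return linked + unlinked == 0; }

    // Evict one entry and report whether forward progress was made.
    bool evict_one() {
        if (linked == 0) {
            return false;
        }
        --linked;
        return true;
    }
};

// Reclaim up to `wanted` entries. Looping on !empty() instead would never
// terminate while an unlinked entry exists.
inline std::size_t reclaim(toy_region& r, std::size_t wanted) {
    std::size_t freed = 0;
    while (freed < wanted && r.evict_one()) {
        ++freed;
    }
    return freed;
}
```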
Tomasz Grabiec
94f0db933f lsa: Fix typo in the word 'emergency'
2015-09-06 21:24:59 +02:00
Tomasz Grabiec
200562abe7 lsa: Reclaim over-max segments from segment pool reserve
2015-09-06 21:24:59 +02:00
Tomasz Grabiec
d022a1a4a3 lsa: Introduce allocating_section
Related to #259. In some cases we need to allocate memory and hold
reclaim lock at the same time. If that region holds most of the
reclaimable memory, allocations inside that code section may
fail. allocating_section is a workaround for the problem. It learns
how big reserves should be from past executions of the critical section
and tries to ensure proper reserves before entering the section.
2015-09-06 21:24:59 +02:00
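A sketch of the idea in d022a1a4a3, under loud assumptions: the section retries on std::bad_alloc and doubles its reserve each time, and the learned reserve carries over to later runs. The doubling policy, the cap, and every name below are hypothetical, not the lsa implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class toy_allocating_section {
    std::size_t _reserve = 1;               // learned from past executions
    const std::size_t _max_reserve = 1024;  // hypothetical cap
public:
    std::size_t reserve() const { return _reserve; }

    // ensure_reserve(n): caller-supplied hook that makes n units available up front.
    // body(): the critical section; it may throw std::bad_alloc.
    template <typename Ensure, typename Body>
    void run(Ensure&& ensure_reserve, Body&& body) {
        for (;;) {
            ensure_reserve(_reserve);
            try {
                body();
                return;
            } catch (const std::bad_alloc&) {
                if (_reserve >= _max_reserve) {
                    throw;  // give up once the cap is reached
                }
                _reserve *= 2;  // remember a bigger reserve for next time
            }
        }
    }
};
```

A second call to `run()` on the same object starts from the reserve learned by the first, so it should not need to retry under the same load.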
Tomasz Grabiec
3caad2294b lsa: Tolerate empty segments when region is destroyed
Sometimes we may close an empty active segment, if all data in it was
evicted. Normally a segment is removed as soon as the last object in
it is freed, but if the segment is already empty when closed, no one is
supposed to call free on it. Such segments would be quickly reclaimed
during compaction, but it's possible that we will destroy the region
before they're reclaimed by compaction. Currently we would fail on an
assertion which checks that there are no segments. This change fixes
the problem by handling empty closed segments when region is
destroyed.
2015-09-06 21:24:59 +02:00
Tomasz Grabiec
c37aa73051 lsa: Drop alignment requirement from segment
2015-09-06 21:24:59 +02:00
Tomasz Grabiec
2c1536b5a7 lsa: Make free() path noexcept
Memory releasing is invoked from destructors so should not throw. As a
consequence it should not allocate memory, so emergency segment pool
was switched from std::deque<> to an alloc-free intrusive stack.
2015-09-06 21:24:59 +02:00
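To illustrate why an intrusive stack suits the noexcept free() path in 2c1536b5a7: the link lives inside the element, so push and pop never allocate and cannot throw. This is a hand-rolled sketch with hypothetical names, not the container used in the commit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical segment type that embeds its own link.
struct toy_segment {
    toy_segment* next = nullptr;
};

// LIFO stack threaded through the segments themselves; it owns no memory.
class intrusive_segment_stack {
    toy_segment* _head = nullptr;
    std::size_t _size = 0;
public:
    void push(toy_segment& s) noexcept {
        s.next = _head;
        _head = &s;
        ++_size;
    }

    // Returns nullptr when the stack is empty.
    toy_segment* pop() noexcept {
        toy_segment* s = _head;
        if (s) {
            _head = s->next;
            s->next = nullptr;
            --_size;
        }
        return s;
    }

    std::size_t size() const noexcept { return _size; }
    bool empty() const noexcept { return _head == nullptr; }
};
```

By contrast, std::deque<>::push_back may allocate and therefore throw, which is what makes it unsuitable for code reached from destructors.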