* Issue the "stop" method on DB (flushes CL + tables (partially))
* Do hard exit (_exit) to escape destructors and sanity checks.
This patch is horrible but sort of a workaround for various interdependency
shutdown issues. Until services can actually be turned off, this might be
a viable option.
Refs #293. I will not call it a fix.
"Refs #293
* Add a commitlog::sync_all_segments, that explicitly forces all pending
disk writes
* Delete segments from disk only if they are marked clean. Thus on partial
shutdown or whatnot, even if CL is destroyed (destructor runs), disk files
not yet clean vis-a-vis sstables are preserved and replayable
* Do a sync_all_segments first of all in database::stop.
Exactly what not to stop in main I leave up to others' discretion, or at least
another patch."
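The "delete only if clean" rule can be pictured with a minimal sketch. The
`segment` class, its members and `mark_clean()` are hypothetical stand-ins, not
the actual commitlog code; the only point is that the destructor leaves a file
on disk unless the segment was marked clean.

```cpp
#include <filesystem>
#include <fstream>
#include <system_error>
#include <utility>

// Hypothetical stand-in for a commitlog segment (not the real class).
class segment {
    std::filesystem::path _path;
    bool _clean = false;  // assumed flag: all data in the file is covered by sstables
public:
    explicit segment(std::filesystem::path p) : _path(std::move(p)) {
        std::ofstream out(_path);  // create the backing file
        out << "mutations";
    }
    void mark_clean() { _clean = true; }
    ~segment() {
        // Only a clean segment removes its file; a dirty one stays replayable.
        if (_clean) {
            std::error_code ec;  // never throw from a destructor
            std::filesystem::remove(_path, ec);
        }
    }
};
```

With this shape, destroying the owning object during a partial shutdown does
not lose data that has not reached sstables yet.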
From Pawel:
This series makes compaction remove items that are no longer items:
- expired cells are changed into tombstones
- items covered by higher level tombstones are removed
- expired tombstones are removed if possible
Fixes #70.
Fixes #71.
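The three rules above can be sketched roughly as follows. The `cell` struct,
the `compact()` signature and the `gc_before` cutoff are assumptions made for
illustration only; the real data model is richer, and the "if possible"
condition for dropping tombstones is reduced here to a single time comparison.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified cell model.
struct cell {
    std::int64_t timestamp;               // write timestamp
    bool tombstone;                       // true for a deletion marker
    std::optional<std::int64_t> expiry;   // live cells only: when the TTL runs out
    std::int64_t deletion_time;           // tombstones only: when the deletion happened
};

std::vector<cell> compact(const std::vector<cell>& in,
                          std::int64_t covering_tombstone_ts,  // higher-level tombstone
                          std::int64_t now,
                          std::int64_t gc_before) {            // assumed GC cutoff
    std::vector<cell> out;
    for (cell c : in) {
        // Items covered by a higher-level tombstone are removed.
        if (c.timestamp <= covering_tombstone_ts) {
            continue;
        }
        // Expired cells are changed into tombstones.
        if (!c.tombstone && c.expiry && *c.expiry <= now) {
            c.tombstone = true;
            c.deletion_time = *c.expiry;
            c.expiry.reset();
        }
        // Expired tombstones are removed (simplified condition).
        if (c.tombstone && c.deletion_time < gc_before) {
            continue;
        }
        out.push_back(c);
    }
    return out;
}
```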
It's great to have statistics, but assert is too big of a hammer. We don't need
to crash due to the lack of it, and can try our best to continue.
We currently have a problem (described in #265), in which we, for some reason,
fail to read the Statistics file. Throwing an exception will still cause us to
fail to boot, but at least it will be more informative.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
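For illustration, the difference between asserting and throwing might look
like the sketch below. `read_statistics()` and its error text are hypothetical
names, not the actual sstable code; the point is only that the failure carries
the file name instead of aborting the process.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical loader; the real code reads a structured Statistics component.
std::string read_statistics(const std::filesystem::path& file) {
    std::ifstream in(file, std::ios::binary);
    if (!in) {
        // Previously this kind of failure would trip an assert; an exception
        // at least tells the operator which file could not be read.
        throw std::runtime_error("cannot read Statistics file: " + file.string());
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```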
A region being merged can still be in use; but after merging, compaction_lock
and the reclaim counter will no longer work. This can lead to
use-after-compact-without-re-lookup errors.
Fix by making the source region be the same as the target region; they
will share compaction locks and reclaim counters, so lookup avoidance
will still work correctly.
Fixes #286.
Route request to CPU 0. _operation_mode is not replicated to other CPUs.
Without this:
$ curl -X GET --header "Accept: application/json"
"http://127.0.0.1:10000/storage_service/operation_mode"
returns "NORMAL" and "STARTING" randomly.
Only the cpu 0 instance of gossip has the correct information; route requests
to cpu 0.
Fix a bug where
$ curl -X GET --header "Accept: application/json"
"http://172.31.5.77:10000/storage_service/gossiping"
returns true and false randomly.
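Both of the routing fixes above follow the same idea, which a toy model can
show. The struct and method names below are hypothetical, and plain vector
indexing stands in for what would be a cross-shard call in the real code: the
state is only kept up to date on shard 0, so a handler that answers from
whichever shard received the request gives inconsistent results.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: one copy of the state per shard, only shard 0 is updated.
struct storage_service_sketch {
    std::vector<std::string> per_shard_mode;

    explicit storage_service_sketch(std::size_t shards)
        : per_shard_mode(shards, "STARTING") {}

    void set_operation_mode(const std::string& m) { per_shard_mode[0] = m; }

    // Before: answer from the shard that happened to receive the request.
    const std::string& mode_local(std::size_t receiving_shard) const {
        return per_shard_mode[receiving_shard];
    }

    // After: always consult shard 0, whichever shard received the request.
    const std::string& mode_routed(std::size_t /*receiving_shard*/) const {
        return per_shard_mode[0];
    }
};
```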
Not all the API commands are implemented. It would be better for the user
to receive an error when trying to call an unimplemented API call.
This adds an unimplemented_exception that is thrown when an API
call is not implemented.
The unimplemented method simply throws the exception.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
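A minimal sketch of the idea, under assumptions: the base class, the message
text and the `get_some_metric()` handler are invented for illustration, and the
real exception presumably maps to an HTTP error reply rather than a plain
std::runtime_error.

```cpp
#include <stdexcept>
#include <string>

// Illustrative only: base class and message are assumptions.
class unimplemented_exception : public std::runtime_error {
public:
    unimplemented_exception()
        : std::runtime_error("API call is not implemented") {}
};

// Helper that simply throws the exception.
[[noreturn]] inline void unimplemented() {
    throw unimplemented_exception();
}

// Hypothetical handler for an API call that has no implementation yet.
std::string get_some_metric() {
    unimplemented();
}
```

A caller then sees an explicit error instead of a silent or misleading reply.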
This changes the http configuration to use the general configuration
object instead of the command line arguments. This allows the API to be
configured from the configuration file and not just from the command
line.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the API configuration parameters to the configuration, so it
will be possible to take them from the configuration file or from the
command line.
The following configuration parameters were defined:
api_port
api_address
api_ui_dir
api_doc_dir
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Refs #293
If one desires to _not_ shut stuff down cleanly, still running this first
in database::stop will at least ensure that mutations already in CL transit
will end up on disk and be replayable
Also at seastar-dev: calle/commitlog_flush_v3
(And, yes, this time I _did_ update the remote!)
Refs #262
Commit of original series was done on stale version (v2) due to author's
inability to multitask and update git repos.
v3:
* Removed future<> return value from callbacks. I.e. flush callback is now
only fully synchronous over actual call
If several mutations in a batch throw exceptions, have_cl.broken() will be
called more than once. Fix this by dropping the ad hoc have_cl and using
parallel_for_each(), which does the same thing that the current code is doing.
Fixes #297
"Fixes #262
Handles CL disk size exceeding configured max size by calling flush handlers
for each dirty CF id / high replay_position mark. (Instead of uncontrolled
delete as previously).
* Increased default max disk size to 8GB. Same as Origin/scylla.yaml (so no
real change, but synced).
* Divide the max disk size by cpus (so sum of all shards == max)
* Abstract flush callbacks in CL
* Handler in DB that initiates memtable->sstable writes when called.
Note that the flush request is done "synchronously" in new_segment() (i.e.
when getting a new segment and crossing threshold). This is however more or
less congruent with Origin, which will do a request-sync in the corresponding
case.
Actual dealing with the request should at least in production code however be
done async, and in DB it is, i.e. we initiate sstable writes. Hopefully
they finish soon, and CL segments will be released (before next segment is
allocated).
If the flush request does _not_ eventually result in any CFs becoming
clean and segments released, we could potentially be issuing flushes
repeatedly, but never more often than on every new segment."
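A rough model of the mechanism described above, with every name hypothetical
(`commitlog_sketch`, `add_flush_handler`, `mark_dirty` and the fixed segment
size are all assumptions): the configured maximum is divided by the number of
cpus, and each new segment that pushes usage past the per-shard limit calls the
flush handlers once per dirty CF with its highest replay position.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

class commitlog_sketch {
public:
    using cf_id = int;
    using replay_position = std::uint64_t;
    using flush_handler = std::function<void(cf_id, replay_position)>;

    commitlog_sketch(std::uint64_t max_disk_size, unsigned cpus,
                     std::uint64_t segment_size)
        : _shard_limit(max_disk_size / cpus)  // sum over all shards == max
        , _segment_size(segment_size) {}

    void add_flush_handler(flush_handler h) { _handlers.push_back(std::move(h)); }

    // Track the highest replay position written for each dirty CF.
    void mark_dirty(cf_id cf, replay_position rp) {
        auto& high = _dirty[cf];
        if (rp > high) {
            high = rp;
        }
    }

    // Allocating a segment is the only point where a flush is requested,
    // so requests are issued at most once per new segment.
    void new_segment() {
        _disk_usage += _segment_size;
        if (_disk_usage > _shard_limit) {
            auto snapshot = _dirty;  // handlers may change the dirty set
            for (const auto& [cf, rp] : snapshot) {
                for (const auto& h : _handlers) {
                    h(cf, rp);
                }
            }
        }
    }

    std::uint64_t shard_limit() const { return _shard_limit; }

private:
    std::uint64_t _shard_limit;
    std::uint64_t _segment_size;
    std::uint64_t _disk_usage = 0;
    std::map<cf_id, replay_position> _dirty;
    std::vector<flush_handler> _handlers;
};
```

In this sketch the handler runs inline; a production handler would only
initiate the memtable-to-sstable write and return, as the text above notes.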