Currently, we create the temporary TOC file only after we are done writing
all the other components. Instead, we want to create the temporary
TOC before starting to write any other component.
That way, a missing TOC most likely indicates corruption,
so we should refuse to boot and provide the sysadmin with a
detailed message. If there is a temporary TOC, it means that there
was a sudden shutdown while the sstable was being written.
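The decision described above can be sketched as follows. This is only an
illustration: toc_state, classify_sstable and the error text are
hypothetical names, not the actual sstable loading code, and treating
"both files present" as complete is an assumption.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical result of inspecting an sstable directory at boot.
enum class toc_state { complete, partially_written };

// has_toc: the final TOC file exists.
// has_temporary_toc: the temporary TOC file exists.
inline toc_state classify_sstable(bool has_toc, bool has_temporary_toc,
                                  const std::string& sstable_name) {
    if (has_toc) {
        // Assumption: a final TOC means all components were written.
        return toc_state::complete;
    }
    if (has_temporary_toc) {
        // The write was interrupted by a sudden shutdown.
        return toc_state::partially_written;
    }
    // Neither file exists: most likely corruption, so refuse to boot.
    throw std::runtime_error("sstable " + sstable_name +
        ": TOC is missing, likely corruption; refusing to boot");
}
```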
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
After killing scylla in the middle of a write, the next scylla
instance failed to finish commit log replay, showing the following
error message:
scylla: core/future.hh:448: void promise<T>::set_value(A&& ...)
[with A = {}; T = {}]: Assertion `_state' failed.
After a long debug session, I figured out that check_valid_rp() was
throwing replay_position_reordered_exception, which signals
replay position reordering.
Looking at 8b9a63a3c6, I noticed that database::apply is guarded
against reordering, but the commitlog replay code is not.
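A minimal sketch of what guarding the replay path could look like. The
column_family_stub class and the replay() helper are hypothetical stand-ins,
and how the actual patch reacts to the exception is not stated here; this
sketch simply catches it and counts the entries that were applied, instead
of letting the exception escape.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <vector>

struct replay_position_reordered_exception : std::exception {};

// Hypothetical stand-in: rejects a position lower than the highest one seen.
class column_family_stub {
    std::uint64_t _highest_rp = 0;
public:
    void check_valid_rp(std::uint64_t rp) const {
        if (rp < _highest_rp) {
            throw replay_position_reordered_exception();
        }
    }
    void apply(std::uint64_t rp) {
        check_valid_rp(rp);
        _highest_rp = rp;
    }
};

// Replays the given positions and returns how many were applied.
inline std::size_t replay(column_family_stub& cf,
                          const std::vector<std::uint64_t>& positions) {
    std::size_t applied = 0;
    for (auto rp : positions) {
        try {
            cf.apply(rp);
            ++applied;
        } catch (const replay_position_reordered_exception&) {
            // Assumption: the guard only keeps the exception from escaping;
            // the real handling (for example a retry) may differ.
        }
    }
    return applied;
}
```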
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
compression
Timer support expects the API to return a histogram. This adds the swagger
definitions for the following timers in column_family:
get_coordinator_read_latency
get_coordinator_scan_latency
get_waiting_on_free_memtable_space
The following estimated histograms were added to column_family:
get_read_latency_estimated_recent_histogram
get_read_latency_estimated_histogram
get_range_latency_estimated_recent_histogram
get_range_latency_estimated_histogram
get_write_latency_recent_histogram
get_write_latency_estimated_recent_histogram
get_cas_prepare_estimated_recent_histogram
get_cas_prepare_estimated_histogram
get_cas_propose_estimated_recent_histogram
get_cas_propose_estimated_histogram
get_cas_commit_estimated_recent_histogram
get_cas_commit_estimated_histogram
And the following timers in commitlog:
get_waiting_on_segment_allocation
get_waiting_on_commit
The following methods were added to the column family API:
set_compaction_strategy_class
get_compaction_strategy_class
set_compression_parameters
set_crc_check_chance
get_sstable_count_per_level
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
API: Completing the column_family Swagger definition
This adds the missing definitions to column_family to make it
compatible with ColumnFamilyStoreMBean
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
We set -march=nehalem to get a cross-microarch binary that supports the
crc32 instruction, but we don't configure seastar with that flag; so seastar
instead inherits dpdk's default, which is -march=native. The resulting
binary may not run on hosts other than the one it was compiled on.
Fix by passing the flag to seastar and inheriting it from there.
Fixes #338.
* seastar d693a64...49989ca (2):
> memory: fix freeing of very small (<8 byte) objects with sized deallocation
> core: take into an account the exceptions thrown directly from _next() as well
Fixes #341.
Since the changes in c0547160f4
"cql3: make ref to client_state const where possible" prepare_keyspace()
wasn't overriding what it was supposed to override.
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
"This series completes the changes for nodetool info.
It adds the following: in the cache service API, the save period will be
reported as 0, to indicate never.
In column family, as a workaround for the bloom filter memory consumption
statistic, the API will return 0."
The bloom filter memory calculation is missing. As a workaround until
it is completed, the memory calculation will return 0.
It is needed by the nodetool info command.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
We do not save the cache, so the getter for the save period in seconds
should return 0, to indicate never, as is done in origin.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
get_query_state() creates a new query state if it cannot find an existing
one. Because of that it cannot be safely called from multiple cpus.
The solution is to take advantage of the fact that
query_processor::prepare() only needs a const reference to the client_state
object.
Fixes #329.
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
"The motivation is to exercise more code during tests, and possibly also avoid
some special casing just for tests in the future. Sstables will be persisted
in a unique temporary directory which is auto-removed when environment is
torn down."
The segment heap is a max-heap, with sparser segments on the top. When
we free from a segment, its occupancy decreases, but its position in
the heap increases (it moves toward the top).
Because of this bug, we picked segments for compaction in the wrong
order. In extreme cases this can lead to a livelock; in other cases it may
just increase compaction latency.
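The relation between freeing and heap position can be illustrated with a
small array-based heap. This is a sketch only: segment_heap, free_from and
the linear lookup by id are hypothetical simplifications, not the actual
allocator code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified segment heap: the sparsest segment is on top.
class segment_heap {
    struct entry { int id; std::size_t used; };
    std::vector<entry> _heap;  // binary heap stored in an array

    // Returns true when a belongs closer to the top than b.
    static bool sparser(const entry& a, const entry& b) {
        return a.used < b.used;
    }
    // Moves the entry at index i toward the top while it is sparser
    // than its parent.
    void sift_up(std::size_t i) {
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (!sparser(_heap[i], _heap[parent])) {
                break;
            }
            std::swap(_heap[i], _heap[parent]);
            i = parent;
        }
    }
public:
    void push(int id, std::size_t used) {
        _heap.push_back({id, used});
        sift_up(_heap.size() - 1);
    }
    // Freeing lowers occupancy, so the segment can only move toward the
    // top; moving it the other way would leave the heap out of order.
    void free_from(int id, std::size_t bytes) {
        for (std::size_t i = 0; i < _heap.size(); ++i) {
            if (_heap[i].id == id) {
                assert(bytes <= _heap[i].used);
                _heap[i].used -= bytes;
                sift_up(i);
                return;
            }
        }
    }
    int top() const { return _heap.front().id; }
};
```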
The digest resolver is broken in a way that prevents read completion from
being reported if data arrives after enough digests for cl were already
received. This happens because the code tried to save on state and
used _cl_responses as an indicator that completion was already reported,
but this is incorrect since there can be enough responses for cl, but no
data yet. Fix by introducing a special state to track completion reporting.
Fixes #331
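A sketch of the idea, with completion reporting tracked by its own flag.
The class name, the _cl_reported and _data_received members, and the choice
to count the data response toward cl are all assumptions made for
illustration; only _cl_responses is named in this message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified resolver: completion requires both enough
// responses for cl and at least one data response.
class digest_resolver_sketch {
    std::size_t _cl_required;
    std::size_t _cl_responses = 0;
    bool _data_received = false;
    bool _cl_reported = false;     // dedicated "completion reported" state
    std::size_t _completions = 0;  // how many times completion was reported

    void maybe_report() {
        if (!_cl_reported && _data_received && _cl_responses >= _cl_required) {
            _cl_reported = true;
            ++_completions;
        }
    }
public:
    explicit digest_resolver_sketch(std::size_t cl_required)
        : _cl_required(cl_required) {}

    void add_digest() {
        ++_cl_responses;
        maybe_report();
    }
    void add_data() {
        ++_cl_responses;  // assumption: data counts toward cl as well
        _data_received = true;
        maybe_report();
    }
    std::size_t completions() const { return _completions; }
};
```

With cl set to 2, two digests followed by a data response report completion
exactly once, on the data response.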
This patch addresses issue #155. It registers an exception handler
in the routes object that handles the no_such_keyspace exception.
The handler just throws a bad_parameter_exception with the error message
it got from the no_such_keyspace exception.
After this patch, a call with a keyspace that does not exist will return
a 400 result.
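The translation can be sketched as below. The exception types here are
local stand-ins with the same names, and reply and handle_request are
hypothetical; the real routes API is not shown in this message.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Local stand-ins for the exception types named above.
struct no_such_keyspace : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct bad_parameter_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical reply type: HTTP status plus body.
struct reply {
    int status;
    std::string body;
};

// Runs a handler; no_such_keyspace is rethrown as
// bad_parameter_exception, which is then mapped to a 400 reply.
inline reply handle_request(const std::function<std::string()>& handler) {
    try {
        try {
            return {200, handler()};
        } catch (const no_such_keyspace& e) {
            // Keep the original error message.
            throw bad_parameter_exception(e.what());
        }
    } catch (const bad_parameter_exception& e) {
        return {400, e.what()};
    }
}
```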
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Until we support the key and counter caches, it is reasonable to return 0
for their statistics and sizes.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Currently a cache update from a flushed memtable affects hits and
misses, which may be confusing. Let's reserve hits and misses for
reads. Cache updates will affect counters called "insertions" and
"merges".
We need to send out the notification for all created keyspaces, not just
for the first one.
Spotted during code inspection.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Also fixes https://github.com/cloudius-systems/seastar/issues/54
==5658==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6250006b7848 at pc 0x1413e02 bp 0x7fff7cd7f1e0 sp 0x7fff7cd7f1d8
WRITE of size 8 at 0x6250006b7848 thread T0
#0 0x1413e01 in unsigned long* std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:336
#1 0x1413c59 in unsigned long* std::__copy_move_a<false, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:396
#2 0x1413aea in unsigned long* std::__copy_move_a2<false, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:434
#3 0x14138df in unsigned long* std::copy<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*) /usr/include/c++/4.9/bits/stl_algobase.h:466
#4 0x1413545 in unsigned long* std::__copy_n<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*, std::random_access_iterator_tag) /usr/include/c++/4.9/bits/stl_algo.h:779
#5 0x1412d44 in unsigned long* std::copy_n<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long, unsigned long*) /usr/include/c++/4.9/bits/stl_algo.h:804
#6 0x14112b3 in unsigned long large_bitset::load<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*> >(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long) utils/large_bitset.hh:81
#7 0x13fcfc9 in _ZZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS_6filterEEEDaS2_ENKUlvE_clEv (/home/tgrabiec/src/urchin/build/debug/scylla+0x13fcfc9)
#8 0x1400a50 in apply /home/tgrabiec/src/urchin/seastar/core/apply.hh:34
#9 0x1400afb in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/apply.hh:42
#10 0x1400bb2 in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/future.hh:1062
#11 0x140f1b7 in _ZZN6futureIIEE4thenIZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS2_6filterEEEDaS5_EUlvE_S0_EET0_OT_ENUlOS4_E_clI12future_stateIIEEEEDaSC_ (/home/tgrabiec/src/urchin/build/debug/scylla+0x140f1b7)
#12 0x140f350 in run /home/tgrabiec/src/urchin/seastar/core/future.hh:359
#13 0x426e2c in reactor::run_tasks(circular_buffer<std::unique_ptr<task, std::default_delete<task> >, std::allocator<std::unique_ptr<task, std::default_delete<task> > > >&, unsigned long) core/reactor.cc:1093
#14 0x429cb1 in reactor::run() core/reactor.cc:1190
#15 0x72bc69 in app_template::run_deprecated(int, char**, std::function<void ()>&&) core/app-template.cc:122
#16 0xa119bc in main /home/tgrabiec/src/urchin/main.cc:279
#17 0x7ffc1b6beec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
#18 0x412558 (/home/tgrabiec/src/urchin/build/debug/scylla+0x412558)
0x6250006b7848 is located 0 bytes to the right of 8008-byte region [0x6250006b5900,0x6250006b7848)
allocated by thread T0 here:
#0 0x7ffc1cf6c7df in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.1+0x547df)
#1 0x7ffc204eef17 in operator new(unsigned long) (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0x8df17)
#2 0xfa5d4f in large_bitset::large_bitset(unsigned long) utils/large_bitset.cc:15
#3 0x13fcec6 in _ZZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS_6filterEEEDaS2_ENKUlvE_clEv (/home/tgrabiec/src/urchin/build/debug/scylla+0x13fcec6)
#4 0x1400a50 in apply /home/tgrabiec/src/urchin/seastar/core/apply.hh:34
#5 0x1400afb in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/apply.hh:42
#6 0x1400bb2 in apply<sstables::sstable::read_filter()::<lambda(auto:25&)> [with auto:25 = sstables::filter]::<lambda()> > /home/tgrabiec/src/urchin/seastar/core/future.hh:1062
#7 0x140f1b7 in _ZZN6futureIIEE4thenIZZN8sstables7sstable11read_filterEvENKUlRT_E_clINS2_6filterEEEDaS5_EUlvE_S0_EET0_OT_ENUlOS4_E_clI12future_stateIIEEEEDaSC_ (/home/tgrabiec/src/urchin/build/debug/scylla+0x140f1b7)
#8 0x140f350 in run /home/tgrabiec/src/urchin/seastar/core/future.hh:359
#9 0x426e2c in reactor::run_tasks(circular_buffer<std::unique_ptr<task, std::default_delete<task> >, std::allocator<std::unique_ptr<task, std::default_delete<task> > > >&, unsigned long) core/reactor.cc:1093
#10 0x429cb1 in reactor::run() core/reactor.cc:1190
#11 0x72bc69 in app_template::run_deprecated(int, char**, std::function<void ()>&&) core/app-template.cc:122
#12 0xa119bc in main /home/tgrabiec/src/urchin/main.cc:279
#13 0x7ffc1b6beec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/include/c++/4.9/bits/stl_algobase.h:336 unsigned long* std::__copy_move<false, false, std::random_access_iterator_tag>::__copy_m<std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*>(std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, std::_Deque_iterator<unsigned long, unsigned long&, unsigned long*>, unsigned long*)
Shadow bytes around the buggy address:
0x0c4a800ceeb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c4a800ceec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c4a800ceed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c4a800ceee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c4a800ceef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c4a800cef00: 00 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa
0x0c4a800cef10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4a800cef20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4a800cef30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4a800cef40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c4a800cef50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Contiguous container OOB:fc
ASan internal: fe
==5658==ABORTING
This adds the implementation for the stream_manager API.
It goes over all streams on all shards and combines the results into a
vector of streams.
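The combining step can be sketched as follows. The shards are modelled as a
plain vector and all type names are hypothetical; the real code would run
across shards asynchronously. Including both initiated and receiving
streams in the result is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a stream state.
struct stream_state_sketch {
    std::string plan_id;
};

// Hypothetical per-shard stream manager.
struct shard_stream_manager {
    std::vector<stream_state_sketch> initiated;
    std::vector<stream_state_sketch> receiving;
};

// Goes over every shard and appends its streams to one result vector.
inline std::vector<stream_state_sketch>
collect_streams(const std::vector<shard_stream_manager>& shards) {
    std::vector<stream_state_sketch> result;
    for (const auto& shard : shards) {
        result.insert(result.end(),
                      shard.initiated.begin(), shard.initiated.end());
        result.insert(result.end(),
                      shard.receiving.begin(), shard.receiving.end());
    }
    return result;
}
```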
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This exposes the initiated_streams and the receiving_streams in the
stream_manager so the API has access to them.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the stream_manager swagger definition file. It is based on the
StreamManagerMBean definition, and the return class is based on the
StreamState class and its subclasses.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This patch clears the ambiguity in the swagger definition file and adds
the implementation for the memtable memory related methods.
For each column family there is an active memtable and a list of
non-active ones.
When referring to all the memtables in the column family, the nickname
will contain cf_all_memtables.
Each URL has two versions: one with a column family name, which is
relevant to a specific column family, and one without, which is the
result of running the method on all column families.
This patch adds the following implementations to column_family:
get_memtable_on_heap_size
get_all_memtable_on_heap_size
get_memtable_off_heap_size
get_all_memtable_off_heap_size
get_memtable_live_data_size
get_all_memtable_live_data_size
get_all_memtables_on_heap_size
get_all_all_memtables_on_heap_size
get_all_memtables_off_heap_size
get_all_all_memtables_off_heap_size
get_all_memtables_live_data_size
get_all_all_memtables_live_data_size
Memory consumption is mapped this way: all memory is assumed to be off
heap, so on heap will return 0, and off heap will return the memory
consumption.
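A sketch of that mapping, together with the two URL variants (with and
without a column family name). The memtable_usage map and the function
names are hypothetical, and returning 0 for an unknown name is a
simplification made here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model: column family name -> bytes used by its memtable.
using memtable_usage = std::map<std::string, std::uint64_t>;

// All memory is assumed to be off heap, so on-heap size is always 0.
inline std::uint64_t memtable_on_heap_size(const memtable_usage&,
                                           const std::string&) {
    return 0;
}

// Variant with a name: the consumption of one column family.
inline std::uint64_t memtable_off_heap_size(const memtable_usage& usage,
                                            const std::string& name) {
    auto it = usage.find(name);
    // Simplification: an unknown name yields 0 in this sketch.
    return it == usage.end() ? 0 : it->second;
}

// Variant without a name: the sum over all column families.
inline std::uint64_t memtable_off_heap_size(const memtable_usage& usage) {
    std::uint64_t total = 0;
    for (const auto& kv : usage) {
        total += kv.second;
    }
    return total;
}
```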
After this patch the following URLs will be available:
/column_family/metrics/memtable_on_heap_size/{name}
/column_family/metrics/memtable_on_heap_size
/column_family/metrics/memtable_off_heap_size/{name}
/column_family/metrics/memtable_off_heap_size
/column_family/metrics/memtable_live_data_size/{name}
/column_family/metrics/memtable_live_data_size
/column_family/metrics/all_memtables_on_heap_size/{name}
/column_family/metrics/all_memtables_on_heap_size
/column_family/metrics/all_memtables_off_heap_size/{name}
/column_family/metrics/all_memtables_off_heap_size
/column_family/metrics/all_memtables_live_data_size/{name}
/column_family/metrics/all_memtables_live_data_size
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
column_family
This patch adds a getter for the dirty_memory_region_group in the
database object and adds an occupancy method to column family that
returns the total occupancy of all the memtables in the column family.
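A sketch of summing occupancy over the memtables of one column family. The
occupancy_sketch fields (used, total) and column_family_sketch are
hypothetical; the real occupancy type may carry different fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical occupancy record for one memtable.
struct occupancy_sketch {
    std::size_t used;
    std::size_t total;

    occupancy_sketch& operator+=(const occupancy_sketch& other) {
        used += other.used;
        total += other.total;
        return *this;
    }
};

// Hypothetical column family holding its active and non-active memtables.
struct column_family_sketch {
    std::vector<occupancy_sketch> memtables;

    // Total occupancy of all the memtables in the column family.
    occupancy_sketch occupancy() const {
        occupancy_sketch sum{0, 0};
        for (const auto& m : memtables) {
            sum += m;
        }
        return sum;
    }
};
```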
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>