The state machines generated by ANTLR allocate many local variables per function.
In release mode, the stack space occupied by the variables is reused, but in debug
builds it is not, because Address Sanitizer sets -fstack-reuse=none. This causes
a single function to take more than 100k of stack space.
Fix by hacking the generated code to use just one variable.
Fixes #2546
Message-Id: <20170704135824.13225-1-avi@scylladb.com>
* tag 'tgrabiec/row-cache-metrics-v2' of github.com:cloudius-systems/seastar-dev:
row_cache: Switch _stats.hits/misses to row granularity
row_cache: Rename num_entries() to partitions() for clarity
row_cache: Track mispopulations also at row level
row_cache: Track row insertions
row_cache: Track row hits and misses
row_cache: Make mispopulation counter also apply for continuity information
row_cache: Add partition_ prefix to current counters
misc_services: Switch to using reads_with[_no]_misses counters
row_cache: Add metrics for operations on underlying reader
row_cache: Add reader-related metrics
row_cache: Remove dead code
"This series introduces selective_token_range_sharder and uses it in repair to
generate the dht::token_range values that belong to a specific shard."
* tag 'asias/repair-selective_token_range_sharder-v3' of github.com:cloudius-systems/seastar-dev:
repair: Use selective_token_range_sharder
tests: Add test_selective_token_range_sharder
dht: Add selective_token_range_sharder
With this change, we ask all the shards to handle the ranges provided by
the user, and we use selective_token_range_sharder to split the ranges and
ignore the ranges that do not belong to the current shard.
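A minimal sketch of the splitting idea described above. It is not the actual selective_token_range_sharder: it assumes tokens are plain integers, ranges are half-open, and ownership is assigned round-robin in fixed-size chunks. The names `token_range`, `ranges_for_shard` and the `chunk` parameter are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical half-open range [start, end) over integer tokens.
struct token_range {
    std::uint64_t start;
    std::uint64_t end;
};

// Split `r` at chunk boundaries and keep only the pieces owned by `shard`.
// Assumption: chunk number (token / chunk) is owned by shard
// (token / chunk) % shard_count. Overflow near the top of the token space
// is not handled in this sketch.
std::vector<token_range> ranges_for_shard(token_range r, unsigned shard,
                                          unsigned shard_count,
                                          std::uint64_t chunk) {
    std::vector<token_range> out;
    std::uint64_t pos = r.start;
    while (pos < r.end) {
        std::uint64_t chunk_end = (pos / chunk + 1) * chunk;
        std::uint64_t piece_end = std::min(chunk_end, r.end);
        if ((pos / chunk) % shard_count == shard) {
            out.push_back({pos, piece_end});
        }
        pos = piece_end;  // pieces owned by other shards are skipped
    }
    return out;
}
```

Every shard can run the same function on the same user-provided range and will end up with a disjoint subset of it.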
The test put a wrapping range into a non-wrapping range variable.
This was harmless at the time this test was written, but newer code
may not be as forgiving, so it is better to use a non-wrapping range as intended.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20170704103128.29689-1-nyh@scylladb.com>
`r` is moved-from, and later captured in a different lambda. The compiler may
choose to perform the move first and the other capture later, resulting in a
use-after-free.
Fix by copying `r` instead of moving it.
Discovered by sstable_test in debug mode.
Message-Id: <20170702082546.20570-1-avi@scylladb.com>
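The capture hazard described above can be illustrated with a small standard-library sketch. This is not the patched Scylla code; `run_both` and `with_copies` are hypothetical names, and the hazardous variant is shown only in a comment.

```cpp
#include <functional>
#include <string>

// Hypothetical helper taking two callbacks, standing in for two
// continuations that both need the same value.
std::string run_both(std::function<std::string()> first,
                     std::function<std::string()> second) {
    return first() + second();
}

std::string with_copies() {
    std::string r = "abc";
    // Hazardous variant (not compiled here):
    //   run_both([r = std::move(r)] { return r; }, [r] { return r; });
    // The order in which the two arguments are evaluated is unspecified,
    // so the second lambda may capture `r` after it was moved from.
    //
    // Copying into both lambdas is safe in either evaluation order.
    return run_both([r] { return r; }, [r] { return r; });
}
```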
Currently, lcs chooses for tombstone compaction the sstable with
the lowest ratio among those whose ratio is above the threshold
(0.2 by default).
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170703185633.6644-1-raphaelsc@scylladb.com>
* 'lcs_improvements_part_2' of github.com:raphaelsc/scylla:
lcs: Match estimated tasks arithmetic to score in LCS
lcs: prevent leveled_compaction_strategy.hh from being included more than once
lcs: use vector instead for storing a level of sstables
compaction: keep only one variant of size_tiered_most_interesting_bucket
lcs: get rid of unused code in leveled_manifest
Contains fix for CASSANDRA-8904.
Added TARGET_SCORE to get rid of magic number for target score which
is now used more than once.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
The list is no longer needed because lcs no longer moves an sstable that breaks
the invariant at its level to level 0. Now lcs incrementally restores the invariant
by compacting together the first set of overlapping tables.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Two variants of size_tiered_most_interesting_bucket existed to avoid a copy,
but subsequent work will make lcs use a vector for each level of sstables,
so let's keep only one variant.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Repair today has a semaphore limiting the number of ongoing checksum
comparisons running in parallel (on one shard) to 100. We needed this
number to be fairly high, because a "checksum comparison" can involve
high latency operations - namely, sending an RPC request to another node
in a remote DC and waiting for it to calculate a checksum there, and while
waiting for a response we need to proceed calculating checksums in parallel.
But as a consequence, in the current code, we can end up with as many as
100 fibers all at the same stage of reading partitions to checksum from
sstables. This requires tons of memory, to hold at least 128K of buffer
(even more with read-ahead) for each of these fibers, plus partition data
for each. But doing 100 reads in parallel is pointless - one (or very few)
should be enough.
So this patch adds another semaphore to limit the number of checksum
*calculations* (including the read and checksum calculation) on each shard
to just 2. There may still be 100 ongoing checksum *comparisons*, in
other stages of the comparison (sending the checksum requests to other
nodes and waiting for them to return), but only 2 will ever be in the stage
of reading partitions from disk and checksumming them.
The limit of 2 checksum calculations (per shard) applies to the repair
slave as well, not just to the master: The slave may receive many checksum
requests in parallel, but will only actually work on 2 at a time.
Because the parallelism=100 now rate-limits operations which use very little
memory, in the future we can safely increase it even more, to support
situations where the disk is very fast but the link between nodes has
very high latency.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20170703151329.25716-1-nyh@scylladb.com>
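The two-level limit described above can be modelled with a small sketch. It does not use Seastar's semaphore; it assumes a single-threaded context (as on one shard), so a plain counter is enough. The `limiter` and `repair_limits` names are hypothetical, while the limits 100 and 2 come from the description.

```cpp
#include <cstddef>

// Counting limiter for a single-threaded context. Not thread-safe.
class limiter {
    std::size_t _available;
public:
    explicit limiter(std::size_t count) : _available(count) {}
    // Take one unit if any is left; the caller must wait otherwise.
    bool try_acquire() {
        if (_available == 0) {
            return false;
        }
        --_available;
        return true;
    }
    void release() { ++_available; }
    std::size_t available() const { return _available; }
};

// Per-shard limits: many cheap, high-latency comparisons may be in
// flight, but only a few memory-hungry calculations at a time.
struct repair_limits {
    limiter comparisons{100};
    limiter calculations{2};
};
```

A comparison would hold a unit of `comparisons` for its whole lifetime, and take a unit of `calculations` only while reading and checksumming, releasing it before waiting on the remote node.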
When do_for_each is in its last iteration and with_semaphore defers
because there's an ongoing cleanup, the sstable object will be used after
being freed, because it was taken by reference and the container it lives
in was destroyed prematurely.
Let's fix it with a do_with, which also makes the code nicer.
Fixes #2537.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170630035324.19881-1-raphaelsc@scylladb.com>
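The lifetime problem described above, and the idea behind the fix, can be sketched without Seastar. This is only a standard-library analog: `keep_alive_with` is a hypothetical stand-in for the role do_with plays (keeping an object alive until deferred work has finished), and the queue of tasks stands in for deferred continuations.

```cpp
#include <functional>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

using task = std::function<void()>;

// Hypothetical stand-in for do_with: move `value` to the heap and give
// the deferred work a reference that stays valid while the task exists.
template <typename T, typename Func>
task keep_alive_with(T value, Func func) {
    auto holder = std::make_shared<T>(std::move(value));
    return [holder, func = std::move(func)]() mutable { func(*holder); };
}

std::vector<std::string> deferred_names() {
    std::queue<task> pending;
    std::vector<std::string> seen;
    {
        std::vector<std::string> sstables = {"a", "b"};
        // Capturing `sstables` by reference here would dangle once this
        // scope ends; the task owns the container instead.
        pending.push(keep_alive_with(std::move(sstables),
            [&seen](std::vector<std::string>& s) {
                for (auto& name : s) {
                    seen.push_back(name);
                }
            }));
    }
    // The deferred work runs after the original scope is gone.
    while (!pending.empty()) {
        pending.front()();
        pending.pop();
    }
    return seen;
}
```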
"compaction_strategy.cc keeps the full implementation of size tiered,
major, and null strategies, and partial implementation of leveled
and date tiered strategies. It's a mess. In the future, we will also
need space for time window strategy. The file is hard to read and
maintain.
My goal here is to improve maintainability of the strategies by
putting each of them into its own header.
NOTE: No semantic change is introduced here."
* 'improve_compaction_strategy_maintainability' of github.com:raphaelsc/scylla:
compaction_strategy: move dtcs to its existing header
compaction_strategy: move lcs implementation to its own header
compaction_strategy: move stcs implementation to its own header
compaction_strategy: move compaction_strategy_impl to its own header
Configuring the cpufreq service on VMs/IaaS causes an error because they do not support cpufreq.
To avoid the error, skip the whole configuration when the driver is not loaded.
Fixes #2051
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1498809504-27029-1-git-send-email-syuu@scylladb.com>
A comment states that we want the file to be old enough, but sets
a timestamp of max(), which is in the future. This may have passed
because the conversion from numeric_limits<time_t>::max() to
db_clock::time_point is not well defined (their dynamic range is
different), so truncation may have converted the large number to a
low one.
Message-Id: <20170702082903.20879-1-avi@scylladb.com>
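The range mismatch described above can be checked with a short sketch. It assumes a 64-bit time_t counted in seconds and a target clock that stores a 64-bit count of milliseconds; whether db_clock matches that exactly is an assumption here. The helper names `fits_in_millis` and `old_enough` are hypothetical.

```cpp
#include <chrono>
#include <cstdint>
#include <limits>

// True if `secs` seconds since the epoch can be expressed as a 64-bit
// count of milliseconds without overflowing.
constexpr bool fits_in_millis(std::int64_t secs) {
    return secs <= std::numeric_limits<std::int64_t>::max() / 1000
        && secs >= std::numeric_limits<std::int64_t>::min() / 1000;
}

// A timestamp that is unambiguously in the past: `age` before `now`.
std::chrono::system_clock::time_point
old_enough(std::chrono::system_clock::time_point now, std::chrono::hours age) {
    return now - age;
}
```

Under these assumptions the maximum second count does not fit, so converting it is not a reliable way to get any particular timestamp, while subtracting an explicit age from the current time states the intent directly.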
Error messages incorrectly used the debug representation of the receiver,
rather than the text representation of the operation itself.
Fixes #113.
Message-Id: <20170701101325.3163-1-avi@scylladb.com>
Boot should not continue until the future returned by
wait_for_gossip_to_settle() is resolved. Commit 991ec4a16 mistakenly
broke that, so restore it. Also fix the calls to supervisor::notify()
so that they are in the right places.
Message-Id: <20170702082355.GQ14563@scylladb.com>
This reverts commit db5bf363d0. It causes
errors of the sort
Exiting on unhandled exception: exceptions::invalid_request_exception
(Keyspace 'system_traces' does not exist)
Previously, lexing and parsing errors were aggregated while CQL queries were
evaluated. Afterwards, the first collected error (if present) would be thrown as
an exception.
The problem was that when parsing and lexing errors were aggregated this way,
the parser would continue in spite of errors like "no viable alternative".
Semantic actions attached to grammar rules would still execute, though with
variables that had not yet been initialized. This would crash Scylla.
This change modifies the error-handling strategy of CQL parsing. Rather than
aggregate errors, we throw an exception on the first error we encounter. This
ensures that grammar actions never execute unless there is a precise match.
One possible issue with this approach is that the C++ code generated from the
ANTLR grammar may not be exception-safe. I compiled Scylla in debug mode with
ASan support and executed several erroneous CQL queries with `cqlsh`. No memory
leaks were reported.
Fixes #2466.
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <db1f650a2bbb615b506d9015486eece45375a440.1498836703.git.jhaberku@scylladb.com>
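The difference between the two error-handling strategies described above can be shown with a toy parser. This is a sketch only, unrelated to the ANTLR-generated code: `syntax_error`, `collecting_listener`, `throwing_listener` and `parse_digits` are hypothetical names.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

struct syntax_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Old strategy: remember errors and let the parser keep going.
struct collecting_listener {
    std::vector<std::string> errors;
    void report(const std::string& msg) { errors.push_back(msg); }
};

// New strategy: abort parsing at the first error.
struct throwing_listener {
    void report(const std::string& msg) { throw syntax_error(msg); }
};

// Toy "parser" that accepts only digits. `action_ran` records whether
// the semantic action at the end of the rule was executed.
template <typename Listener>
void parse_digits(const std::string& text, Listener& listener,
                  bool& action_ran) {
    for (char c : text) {
        if (c < '0' || c > '9') {
            listener.report(std::string("no viable alternative at '") + c + "'");
            break;
        }
    }
    action_ran = true;  // semantic action attached to the rule
}
```

With `collecting_listener` the action still runs on bad input; with `throwing_listener` the exception unwinds before the action is reached.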