Many area of the code are splattered with unneeded templates. This patchset replaces
some of them, where the template parameter is a function object, with an std::function
or noncopyable_function (with a preference towards the latter; but it is not always
possible). As the template is compiled for each instantiation (if the function
object is a lambda) while a function is compiled only once, there are significant
savings in compile time and bloat.
text data bss dec hex filename
85160690 42120 284910 85487720 5187068 scylla.before
84824762 42120 284910 85151792 5135030 scylla.after
* https://github.com/avikivity/scylla detemplate/v2:
api/commitlog: de-template acquire_cl_metric()
database: de-template do_parse_schema_tables
database: merge for_all_partitions and for_all_partitions_slow
hints: de-template scan_for_hints_dirs()
schema_tables: partially de-template make_map_mutation()
distributed_loader: de-template
tests: commitlog_test: de-template
tests: cql_auth_query_test: de-template
test: de-template eventually() and eventually_true()
tests: flush_queue_test: de-template
hint_test: de-template
tests: mutation_fragment_test: de-template
test: mutation_test: de-template
"
This miniseries fixes the behaviour of distributed loader,
which now unconditionally mutates new sstables found in /upload
dir to LCS level 0 first, and only after that proceeds with
either queueing them for update generation or moving them
to data directory.
"
* 'restore_always_mutating_sstables_level_0' of https://github.com/psarna/scylla:
distributed_loader: restore indentation
distributed_loader: restore always mutating to level 0
Fix runtime error: signed integer overflow
introduced by 2dc3776407
Delta-encoded values may wrap around if the encoded value is
less than the base value. This could happen in two places:
In the mc-format serialization header itself, where the base values are implicit
Cassandra epoch time, and in the sstables data files, where the base values
are taken from the encoding_stats (later written to the serialization_header).
In these cases, when the calculation is done using signed integer/long we may see
"runtime error: signed integer overflow" messages in debug mode
(with -fsanitize=undefined / -fsanitize=signed-integer-overflow).
Overflow here is expected and harmless since we do not gurantee that
neither the base values in the serialization header are greater than
or equal to Cassandra's epoch now that the delta-encoded values are
always greater than or equal to the respective base values in
the serialization header.
To prevent these warnings, the subtraction/addition should be done with unsigned
(two's complement) arithmetic and the result converted to the signed type.
Note that to keep the code simple where possible, when also rely on implicit
conversion of signed integers to unsigned when either one of added value is unsigned
and the other is signed.
Fixes: #4098
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190120142950.15776-1-bhalevy@scylladb.com>
The internal test_propagation template is instantiated many times. Replace
with an oridinary function to reduce bloat. Call sites adjusted to have a
uniform signature.
distributed_loader has several large templates that can be converted to normal
function with the help of noncopyable_function<>, reducing code bloat.
One of the lambdas used as an actual argument was adjusted, because the de-templated
callee only accepts functions returning a future, while the original accepted both
functions returning a future and functions returning void (similar to future::then).
This long slow-path function is called four times, so de-templating it is an
easy win. We use std::function instead of noncopyable_function because the
function is copied within the parallel_for_each callback. The original code
uses a move, which is incorrect, but did not fail because moving the lambdas
that were used as the actual arguments is equivalent to a copy.
Otherwise read test results for subsequent datasets will override each other.
Also, rename population test case to not include dataset name, which
is now redundant.
Message-Id: <1547822942-9690-1-git-send-email-tgrabiec@scylladb.com>
When entering a new ck range (of the partition-slice), the partition
snapshot reader will apply to its range tombstones stream all the
tombstones that are relevant to the new ck range. When the partition has
range tombstones that overlap with multiple ck ranges, these will be
applied to the range tombstone stream when entering any of the ck ranges
they overlap with. This will result in the violation of the monotonicity
of the mutation fragments emitted by the reader, as these range
tombstones will be re-emitted on each ck range, if the ck range has at
least one clustering row they apply to.
For example, given the following partition:
rt{[1,10]}, cr{1}, cr{2}, cr{3}...
And a partition-slice with the following ck ranges:
[1,2], [3, 4]
The reader will emit the following fragment stream:
rt{[1,10]}, cr{1}, cr{2}, rt{[1,10]}, cr{3}, ...
Note how the range tombstone is emitted twice. In addition to violating
the monotonicity guarantee, this can also result in an explosion of the
number of emitted range tombstones.
Fix by applying only those range tombstones to the range tombstone
stream, that have a position strictly greater than that of the last
emitted clustering row (or range tombstone), when entering a new ck
range.
Fixes: #4104
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <e047af76df75972acb3c32c7ef9bb5d65d804c82.1547916701.git.bdenes@scylladb.com>
At the moment are inefficiencies in how
collection_type_impl::mutation::compact_and_expire( handles tombstones.
If there is a higher-level tombstone that covers the collection one
(including cases where there is no collection tombstone) it will be
applied to the collection tombstone and present in the compaction
output. This also means that the collection tombstone is never dropped
if fully covered by a higher-level one.
This patch fixes both those problems. After the compaction the
collection tombstone is either unchanged or removed if covered by a
higher-level one.
Fixes#4092.
Message-Id: <20190118174244.15880-1-pdziepak@scylladb.com>
Use std::function instead of a template parameter. Likely doesn't gain
anyting, because the template was always instantiated with the same type
(the result of std::bind() with the same signatures), but still good practice.
std::function was used instead of noncopyable_function because
sharded::map_reduce0() copies the input function.
"
Use input sstables stats metadata to re-calculate encoding_stats.
Fixes#3971.
"
* 'projects/compaction-encoding-stats/v3' of https://github.com/bhalevy/scylla:
compaction: mc: re-calculate encoding_stats based on column stats
memtable: extract encoding_stats_collector base class to encoding_stats header file
The compilation fails on -Warray-bounds, even though the branch is never taken:
inlined from ‘managed_bytes::managed_bytes(bytes_view)’ at ./utils/managed_bytes.hh:195:22,
inlined from ‘managed_bytes::managed_bytes(const bytes&)’ at ./utils/managed_bytes.hh:162:77,
inlined from ‘dht::token dht::bytes_to_token(bytes)’ at dht/random_partitioner.cc:68:57,
inlined from ‘dht::token dht::random_partitioner::get_token(bytes)’ at dht/random_partitioner.cc:85:39:
/usr/include/c++/8/bits/stl_algobase.h:368:23: error: ‘void* __builtin_memmove(void*, const void*, long unsigned int)’ offset 16 from the object at ‘<anonymous>’ is out of the bounds of referenced subobject ‘managed_bytes::small_blob::data’ with type ‘signed char [15]’ at offset 0 [-Werror=array-bounds]
__builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work around by disabling the diagnostic locally.
Message-Id: <1547205350-30225-1-git-send-email-tgrabiec@scylladb.com>
Many area of the code are splattered with unneeded templates. This patchset replaces
some of them, where the template parameter is a function object, with an std::function
or noncopyable_function (with a preference towards the latter; but it is not always
possible). As the template is compiled for each instantiation (if the function
object is a lambda) while a function is compiled only once, there are significant
savings in compile time and bloat.
text data bss dec hex filename
85160690 42120 284910 85487720 5187068 scylla.before
84824762 42120 284910 85151792 5135030 scylla.after
* https://github.com/avikivity/scylla detemplate/v1:
api/commitlog: de-template acquire_cl_metric()
database: de-template do_parse_schema_tables
database: merge for_all_partitions and for_all_partitions_slow
hints: de-template scan_for_hints_dirs()
schema_tables: partially de-template make_map_mutation()
distributed_loader: de-template
tests: commitlog_test: de-template
tests: cql_auth_query_test: de-template
test: de-template eventually() and eventually_true()
tests: flush_queue_test: de-template
hint_test: de-template
tests: mutation_fragment_test: de-template
test: mutation_test: de-template
When introducing view update generation path for sstables
in /upload directory, mutating these sstables was moved
to regular path only. It was wrong, because sstables that
need view updates generated from them may still need
to be downgraded to LCS level 0, so they won't disrupt
LCS assumptions after being loaded.
Reported-by: Nadav Har'El <nyh@scylladb.com>
The internal test_propagation template is instantiated many times. Replace
with an oridinary function to reduce bloat. Call sites adjusted to have a
uniform signature.
Use noncopyable_function instead of a template parameter. Likely doesn't gain
anyting, because the template was always instantiated with the same type
(the result of std::bind() with the same signatures), but still good practice.
The multishard mutation query used the semaphore obtained from
`database::user_read_concurrency_sem()` to pause-resume shard readers.
This presented a problem when `multishard_mutation_query()` was reading
from system tables. In this case the readers themselves would obtain
their permits from the system read concurrency semaphore. Since the
pausing of shard readers used the user read semaphore, pausing failed to
fulfill its objective of alleviating pressure on the semaphore the reads
obtained their permits from. In some cases this lead to a deadlock
during system reads.
To ensure the correct semaphore is used for pausing-resuming readers,
obtain the semaphore from the `table` object. To avoid looking up the
table on every pause or resume call, cache the semaphores when readers
are created.
Fixes: #4096
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <c784a3cd525ce29642d7216fbe92638fa7884e88.1547729119.git.bdenes@scylladb.com>
d2dbbba139 converted scyllatop's interperter to Python 3, but neglected to do
the actual conversion. This patch does so, by running 2to3 over allfiles and adding
an additional bytes->string decode step in prometheus.py. Superfluous 2to3 changes
to print() calls were removed.
Message-Id: <20190117124121.7409-1-avi@scylladb.com>
"
Before this series the limit was applied per page instead
of globally, which might have resulted in returning too many
rows.
To fix that:
1. restrictions filter now has a 'remaining' parameter
in order to stop accepting rows after enough of them
have already been accepted
2. pager passes its row limit to restrictions filter,
so no more rows than necessary will be served to the client
3. results no longer need to be trimmed on select_statement
level
Tests: unit (release)
"
* 'fix_filtering_limit_with_paging_3' of https://github.com/psarna/scylla:
tests: add filtering+limit+paging test case
tests: allow null paging state in filtering tests
cql3: fix filtering with LIMIT with regard to paging
Previously the utility to extract paging state asserted
that the state exists, but in future tests it would be useful
to be able to call this function even if it would return null.
Previously the limit was erroneously applied per page
instead of being accumulated, which might have caused returning
too many rows. As of now, LIMIT is handled properly inside
restrictions filter.
Fixes#4100
When compacting several sstables, get and merge their encoding_stats
for encoding the result.
Introduce sstable::get_encoding_stats_for_compaction to return encoding_stats
based on the sstable's column stats.
Use encoding_stats_collector to keep track of the minimum encoding_stats
values of all input sstables.
Fixes#3971
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>