Separating the initial value (and accumulator) from the reducer function
can result in simpler invocations.
Unfortunately, the name conflicts with another variant, so we have to name
the method map_reduce0.
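
A minimal single-threaded sketch of the idea, assuming a hypothetical free
function over a std::vector (the real method's signature is not shown here and
may differ). It only illustrates how passing the initial value separately lets
the reducer be a plain binary function such as std::plus:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real method: map each element, then fold the
// mapped values into `initial` using a plain binary reducer.
template <typename T, typename Mapper, typename Initial, typename Reducer>
Initial map_reduce0(const std::vector<T>& items, Mapper map, Initial initial, Reducer reduce) {
    for (const auto& item : items) {
        initial = reduce(std::move(initial), map(item));
    }
    return initial;
}

// Example use: total length of all strings. The reducer carries no state of
// its own because the accumulator is supplied as a separate argument.
inline std::size_t total_length(const std::vector<std::string>& words) {
    return map_reduce0(words,
                       [](const std::string& s) { return s.size(); },
                       std::size_t(0),
                       std::plus<std::size_t>());
}
```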
Shuffle code in sstables.hh so that public members are defined first.
Makes the code easier to digest.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
There's no benefit to using C include guards so switch to pragma once
everywhere for consistency.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Should fix use-after-free when a frozen_mutation is applied to the
local shard.
Includes two adjustments to urchin collectd usage from Calle:
- Updated thrift collectd registration to use proper move semantics
- Commitlog: Fix collectd registration to use move semantics + test
Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
When submit_to() calls to a different core, and when the function to
be executed is a temporary (the usual case), we copy it to the heap for
the duration of execution. However when the function happens to execute
locally, we don't copy it, which can lead to a use-after-free if the function
defers.
Fix by detecting the case of local execution of a temporary function, and
copying it to the heap in that case.
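
The fix can be modelled roughly as below. This is a single-threaded sketch
under stated assumptions, not the actual submit_to() code: pending(),
run_pending(), submit_local() and counter_task are hypothetical names, and
"deferring" is modelled as pushing a continuation onto a queue.

```cpp
#include <deque>
#include <functional>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical model of deferred work: continuations are queued and run later.
inline std::deque<std::function<void()>>& pending() {
    static std::deque<std::function<void()>> q;
    return q;
}

inline void run_pending() {
    while (!pending().empty()) {
        auto task = std::move(pending().front());
        pending().pop_front();
        task();
    }
}

// Example function object that defers: its continuation refers back to the
// object's own members, so the object must outlive the continuation.
struct counter_task {
    int* out;
    int value;
    void operator()() {
        pending().push_back([this] { *out = value; });
    }
};

// Local execution. A temporary is moved to the heap and kept alive until the
// work it queued has run; an lvalue stays owned by the caller.
template <typename Func>
void submit_local(Func&& func) {
    if (std::is_rvalue_reference<Func&&>::value) {
        auto owned = std::make_shared<std::decay_t<Func>>(std::move(func));
        (*owned)();
        // Queued after the continuation, so the copy is released last.
        pending().push_back([owned] {});
    } else {
        func();
    }
}
```

Without the heap copy, the temporary passed to submit_local() would be
destroyed at the end of the calling expression, before run_pending() executes
the continuation that still points at it.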
From Pekka:
"This series implements a Maps.difference() function in C++, changes
storage_proxy::query_local() to not return foreign_ptr<>, and finally
changes the keyspace merging code to follow Origin."
Don't return foreign_ptr<> which is not copyable so that we can use
map_difference for maps with result_set in them.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Obviously, I was sleeping or something when I wrote the reg/unreg code, since
using copy semantics for anchors is equivalent to double unregistrations.
Luckily, unregister was broken as well, so counters did stay active. Which
however broke things once actual non-persistent counters were added. Doh.
* Anchors must be non-copyable
* Above makes creating std::vector<registration> from initializer list
tricky, so added helper type "registrations" which inherits vector<reg>
but constructs from initializer_list<type_instance_id>, avoiding illegal
copying.
* Both register and unregister were broken (map insert does not overwrite an
existing entry; only [] or an iterator operation does).
* Modified the various registration callsites to use registrations and move
semantics.
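
A rough sketch of the shape described in the list above. The types here are
simplified stand-ins (type_instance_id is a plain string and active_counters()
is a hypothetical registry), so this shows only the ownership rules, not the
real collectd code:

```cpp
#include <initializer_list>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real types carry more information.
using type_instance_id = std::string;

inline std::set<type_instance_id>& active_counters() {
    static std::set<type_instance_id> s;
    return s;
}

// Move-only anchor: unregisters exactly once, when its current owner dies.
class registration {
    type_instance_id _id;
    bool _engaged;
public:
    explicit registration(type_instance_id id) : _id(std::move(id)), _engaged(true) {
        active_counters().insert(_id);
    }
    registration(const registration&) = delete;
    registration& operator=(const registration&) = delete;
    registration(registration&& o) noexcept : _id(std::move(o._id)), _engaged(o._engaged) {
        o._engaged = false;  // the moved-from anchor must not unregister
    }
    ~registration() {
        if (_engaged) {
            active_counters().erase(_id);
        }
    }
};

// Builds the anchors in place from ids, so no registration is ever copied.
class registrations : public std::vector<registration> {
public:
    registrations(std::initializer_list<type_instance_id> ids) {
        reserve(ids.size());
        for (const auto& id : ids) {
            emplace_back(id);
        }
    }
};
```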
This implements Maps.difference() helper function in C++. We need it to
translate various Origin call-sites that use it. The implementation is
written from scratch with Guava code used as reference to preserve
semantics.
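
For illustration, a simplified sketch of the semantics, assuming std::map and
a result that records only keys. The names map_difference and difference() and
the field names are modelled on Guava's MapDifference and are assumptions
here; the in-tree helper may look different:

```cpp
#include <map>
#include <set>

// Hypothetical result type: keys are grouped the way Guava groups entries.
template <typename Key>
struct map_difference {
    std::set<Key> entries_only_on_left;
    std::set<Key> entries_only_on_right;
    std::set<Key> entries_in_common;   // same key, equal values
    std::set<Key> entries_differing;   // same key, different values
};

template <typename Key, typename Value>
map_difference<Key> difference(const std::map<Key, Value>& left,
                               const std::map<Key, Value>& right) {
    map_difference<Key> diff;
    for (const auto& kv : left) {
        auto it = right.find(kv.first);
        if (it == right.end()) {
            diff.entries_only_on_left.insert(kv.first);
        } else if (it->second == kv.second) {
            diff.entries_in_common.insert(kv.first);
        } else {
            diff.entries_differing.insert(kv.first);
        }
    }
    for (const auto& kv : right) {
        if (left.find(kv.first) == left.end()) {
            diff.entries_only_on_right.insert(kv.first);
        }
    }
    return diff;
}
```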
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
I didn't want to create another executable for it because doing so
would take a lot more disk space. The motivation behind this
change is to debloat the sstables test file, which is growing very
quickly. Suggested by Glauber Costa.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Collectd uses an exception to signal that the buffer space in the packet
is exhausted and a new packet needs to be started. This violates the
"exceptions are for exceptional conditions" guideline, but more practically,
makes it hard to use the gdb "catch throw" command to trap exceptions.
Fix by using a data member to store the overflow condition instead.
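
A minimal sketch of the flag-based approach, assuming a hypothetical
fixed-capacity packet_writer (not the collectd code itself). Running out of
space sets a member instead of throwing, and the caller checks it before
deciding to start a new packet:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-capacity writer used only to illustrate the pattern.
class packet_writer {
    std::vector<std::uint8_t> _buf;
    std::size_t _capacity;
    bool _overflow = false;
public:
    explicit packet_writer(std::size_t capacity) : _capacity(capacity) {}

    // Appends one byte, or records the overflow and drops the byte.
    // Once overflowed, all further writes are ignored.
    packet_writer& put(std::uint8_t byte) {
        if (_overflow || _buf.size() == _capacity) {
            _overflow = true;
            return *this;
        }
        _buf.push_back(byte);
        return *this;
    }

    bool overflowed() const { return _overflow; }
    std::size_t size() const { return _buf.size(); }
};
```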
- # segments
- # allocating segments
- # unused segments
- # allocations
- # cycles (disk writes)
- # flush
- # total bytes allocated
- # total bytes disk slack (due to dma blocks)
Counters are per-commitlog (shard). Can be extended to be per-segment also,
but would be transient and probably not much more useful.
Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
Deleted cells store deletion time not expiry time. This change makes
expiry() valid only for live cells with TTL and adds deletion_time(),
which is intended to be used with deleted cells.
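
The intended contract can be sketched as follows. cell_sketch, its factory
functions and the use of a plain long for timestamps are assumptions made for
the example; only expiry() and deletion_time() are names from this change:

```cpp
#include <cassert>

// Hypothetical simplified cell. One stored time value is interpreted as an
// expiry time for live cells with TTL and as a deletion time for dead cells.
class cell_sketch {
    bool _live;
    bool _has_ttl;
    long _time;
    cell_sketch(bool live, bool has_ttl, long time)
        : _live(live), _has_ttl(has_ttl), _time(time) {}
public:
    static cell_sketch make_live() { return cell_sketch(true, false, 0); }
    static cell_sketch make_live_with_ttl(long expiry) { return cell_sketch(true, true, expiry); }
    static cell_sketch make_dead(long deletion_time) { return cell_sketch(false, false, deletion_time); }

    bool is_live() const { return _live; }
    bool has_expiry() const { return _live && _has_ttl; }

    // Valid only for live cells with TTL.
    long expiry() const {
        assert(has_expiry());
        return _time;
    }

    // Valid only for deleted cells.
    long deletion_time() const {
        assert(!_live);
        return _time;
    }
};
```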
Introduce frozen_mutation, from Tomasz:
"The immediate motivation for introducing frozen_mutation is inability to
deserialize current "mutation" object, which needs schema reference at the
time it's constructed. It needs schema to initialize its internal maps with
proper key comparators, which depend on schema. We can't lookup schema before
we deserialize column family ID. Another problem is that even if we had the ID
somehow, low level RPC layer doesn't know how to lookup the schema.
This form is primarily destined to be sent over the network channel. Data can
be wrapped in frozen_mutation without schema information, the schema is only
needed to access some of the fields.
frozen_mutation supports reading via visiting, without having to explode it
into an object graph. Because of that, application should be more efficient
because of fewer cache misses.
We also don't have to serialize it back for the commit log, it's serialized
only once on the coordinator node.
Different serialization formats are not supported yet."
Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
When the temporary buffer has enough data for a uint64 to be
consumed, we readily consume it.
The problem is that we were wrongly storing the uint64 into
a uint32 variable.
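
The shape of the bug, reduced to a standalone sketch. read_u64(),
consume_buggy() and consume_fixed() are hypothetical names and the buffer is
read in native byte order; the real parsing code is not reproduced here:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 64-bit value from the front of a byte buffer.
inline std::uint64_t read_u64(const char* data) {
    std::uint64_t v;
    std::memcpy(&v, data, sizeof(v));
    return v;
}

// Buggy shape: the 64-bit result is stored into a 32-bit variable, so the
// high 32 bits are silently discarded.
inline std::uint64_t consume_buggy(const char* data) {
    std::uint32_t v = read_u64(data);
    return v;
}

// Fixed shape: the variable is as wide as the value being consumed.
inline std::uint64_t consume_fixed(const char* data) {
    std::uint64_t v = read_u64(data);
    return v;
}
```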
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
For fluent asserting mutations:
assert_that(what_we_got).is_equal_to(what_we_expect);
We could extend that in future with more specific checks, like
has_cell() etc.
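
One possible shape for such an assertion helper, as a sketch only.
mutation_sketch is a hypothetical stand-in for the real mutation type, and
failure is reported by throwing, which the real test helper may not do:

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a mutation; only equality matters here.
struct mutation_sketch {
    std::string key;
    int value;
    bool operator==(const mutation_sketch& o) const {
        return key == o.key && value == o.value;
    }
};

class mutation_assertion {
    mutation_sketch _m;
public:
    explicit mutation_assertion(mutation_sketch m) : _m(std::move(m)) {}

    // Returns *this so that more specific checks could be chained later.
    mutation_assertion& is_equal_to(const mutation_sketch& other) {
        if (!(_m == other)) {
            throw std::runtime_error("mutations differ");
        }
        return *this;
    }
};

inline mutation_assertion assert_that(mutation_sketch m) {
    return mutation_assertion(std::move(m));
}
```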
The immediate motivation for introducing frozen_mutation is inability
to deserialize current "mutation" object, which needs schema reference
at the time it's constructed. It needs schema to initialize its
internal maps with proper key comparators, which depend on schema.
frozen_mutation is an immutable, compact form of a mutation. It
doesn't use complex in-memory structures; data is stored in a linear
buffer. With frozen_mutation, the schema needs to be supplied only at
the time the mutation partition is visited. Therefore it can be trivially
deserialized without a schema.
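
To make the "schema only at visit time" point concrete, here is a toy model.
Everything in it is an assumption for illustration: frozen_sketch,
schema_sketch, the record layout (a uint32 column id followed by an int64
value, native byte order) and the visitor signature are not the real
frozen_mutation format or API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical schema: maps column ids to column names.
struct schema_sketch {
    std::map<std::uint32_t, std::string> column_names;
};

class frozen_sketch {
    std::vector<char> _bytes;  // repeated (uint32 column id, int64 value) records
public:
    frozen_sketch() = default;

    // "Deserialization" needs no schema: it only adopts the received bytes.
    explicit frozen_sketch(std::vector<char> bytes) : _bytes(std::move(bytes)) {}

    void append(std::uint32_t column_id, std::int64_t value) {
        const char* p = reinterpret_cast<const char*>(&column_id);
        _bytes.insert(_bytes.end(), p, p + sizeof(column_id));
        p = reinterpret_cast<const char*>(&value);
        _bytes.insert(_bytes.end(), p, p + sizeof(value));
    }

    const std::vector<char>& bytes() const { return _bytes; }

    // The schema is consulted only here, while walking the linear buffer.
    template <typename Visitor>
    void visit(const schema_sketch& s, Visitor&& visitor) const {
        std::size_t pos = 0;
        while (pos + sizeof(std::uint32_t) + sizeof(std::int64_t) <= _bytes.size()) {
            std::uint32_t id;
            std::int64_t value;
            std::memcpy(&id, &_bytes[pos], sizeof(id));
            pos += sizeof(id);
            std::memcpy(&value, &_bytes[pos], sizeof(value));
            pos += sizeof(value);
            visitor(s.column_names.at(id), value);
        }
    }
};
```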