Range tombstones for clustered rows weren't supported, so an assert
left as a reminder of that was being triggered.
A test case was added.
Fixes #158.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
"This series cleans the streaming_histogram and the estimated histogram that
were importad from origin, it then uses it to get the estimated min and max row
estimation in the API."
The solution was proposed by Nadav. When writing a new sstable,
write all the usual files, write the TOC to a temporary file, and
then rename it, which is atomic.
Files not belonging to any TOC are invalid, so we ensure that
partially written sstables aren't reused.
Avi also proposed using fsync on the sstable directory to guarantee
that the files reached the disk before sealing the sstable.
Subsequently, we should add code to avoid loading sstables whose
TOC is either temporary or missing. Temporary TOC files
should also be deleted.
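The sealing protocol described above can be sketched as follows; the function and file names are illustrative, not the actual implementation:

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical sketch of the sealing protocol: write every other
// component first, write the TOC under a temporary name, then rename
// it into place. rename() replaces the target atomically, so a crash
// leaves either a valid TOC or none at all -- and files not listed in
// any TOC mark the sstable as partially written.
void seal_sstable(const std::string& dir, const std::string& toc_contents) {
    std::string tmp = dir + "/TOC.txt.tmp";
    std::string toc = dir + "/TOC.txt";
    {
        std::ofstream out(tmp);
        out << toc_contents;
        // In a real implementation the file and the directory would be
        // fsync'ed here, so the data and the rename are durable before
        // the sstable is considered sealed.
    }
    std::rename(tmp.c_str(), toc.c_str());
}
```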
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
This changes the constructor initialization of the metadata_collector:
it now calls the constructor directly instead of using java-like assignment.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This makes the following changes in estimated_histogram: it uses
int64_t instead of unsigned to be compatible with Origin and the API.
It adds a getter for the buckets and makes the bucket_offset getter
const.
It adds min and max getters similar to Origin's, and it adds a merge
function to merge estimated histograms.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The default constructor needs to set the max_bin size, so it was
combined with the non-default one by giving that parameter a default value.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
What we implement is ka, not la. Since the summary is the one element that
actually changed in the 2.2 implementation, it is particularly important that
we get this one right. I have previously missed this.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
(endsize / (1024*1024)) is an integer calculation, so if endsize is
less than 1024^2, the result will be 0.
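The truncation can be seen in a minimal sketch; the function names are illustrative, and the floating-point version is one possible fix, not necessarily the one the patch uses:

```cpp
#include <cassert>
#include <cstdint>

// Integer division truncates: any size below 1 MB reports as 0 MB.
static int64_t size_in_mb_truncating(int64_t endsize) {
    return endsize / (1024 * 1024);
}

// One way to avoid reporting 0 for small sizes: divide in floating
// point (illustrative fix).
static double size_in_mb(int64_t endsize) {
    return static_cast<double>(endsize) / (1024 * 1024);
}
```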
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Don't let the current name fool you: Having this listed as "la" here
was just lack of discipline on my part. I meant by it "the format from
which we are importing" - which was named la for Origin. I wasn't
really thinking at the time that it would be dangerous to stop between
versions.
This should read ka, not la.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
When a schema is available, we use it. However, we have, by now, way too many
tests. Some of them use tables for which we don't even know the schema. It would
have been a massive amount of work to require a schema for all of them - so I am
keeping both constructors around.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
We have currently two versions of filename: one static, where the caller has to
pass all parameters, and an internal one where those parameters are derived
from the sstable attributes. Implement the latter in terms of the former so
that making changes becomes easier.
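The delegation pattern looks roughly like this; the signatures and the name format are illustrative, not the actual sstable code:

```cpp
#include <cassert>
#include <string>

// Sketch: the static version builds the name from explicit parameters;
// the member version forwards the sstable's own attributes to it, so
// format changes happen in a single place.
struct sstable {
    std::string _dir, _ks, _cf, _version;
    long _generation;

    static std::string filename(const std::string& dir, const std::string& ks,
                                const std::string& cf, const std::string& version,
                                long generation, const std::string& component) {
        return dir + "/" + ks + "-" + cf + "-" + version + "-" +
               std::to_string(generation) + "-" + component;
    }

    // Member version implemented in terms of the static one.
    std::string filename(const std::string& component) const {
        return filename(_dir, _ks, _cf, _version, _generation, component);
    }
};
```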
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
It was initially created to be the function to write all sstable
components, but later on, its purpose was only to write a few
components for testing. A similar function was created in the
tests, so now it can be removed.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
We are currently failing the sstable test. The reason is that we use the store()
function for test purposes, and that function does not store the TOC component.
It was removed by accident in 3a5e3c88.
Because that function is only used for testing purposes, it doesn't need to write
the Index and Data components: we can then remove them from the list.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
When forcing a compaction on a column family with no sstables, an
assert will fail because there are no sstables to be compacted.
This problem is fixed by ignoring a compaction request when no
sstables are provided.
Fixes #61.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
The reason is that a reader may think that these fields store
some statistics about an sstable that was just loaded, but
they are only used when writing a new sstable.
Now I'm starting to see the value of having one sstable class for
a loaded sstable and another for an sstable being created
(that's what Origin does).
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
The sstables write path has been partially de-futurized, but now creates a
ton of threads, and yet does not exploit this as everything is serialized.
Remove those extra threads and futures and use a single thread to write
everything. If needed, we'll employ write-behind in output_stream to
increase parallelism.
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
range::is_wrap_around() and range::contains() rely on total ordering
on values to work properly. Current ring_position_comparator was only
imposing a weak ordering (token positions equal to all key positions
with that token).
range::before() and range::after() can't work for weak ordering. If
the bound is exclusive, we don't know if user-provided token position
is inside or outside.
Also, is_wrap_around() can't properly detect wrap around in all
cases. Consider this case:
(1) ]A; B]
(2) [A; B]
For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is
not. Without total ordering between A and B, range::is_wrap_around() can't
tell that.
I think the simplest solution is to define a total ordering on
ring_position by making token positions ordered either before or
after all keys with that token.
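The idea can be sketched with a simplified model; all names and types here are illustrative, not the actual ring_position implementation:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Illustrative model: a position is a token plus an optional key. A
// key-less position also records whether it sits before or after all
// keys sharing its token, which makes the ordering total -- so
// exclusive bounds and wrap-around detection become well-defined.
struct ring_position {
    int token;
    std::optional<std::string> key;
    int bound = 0;  // -1: before all keys of token, +1: after (key-less only)
};

// Total-order three-way comparison (negative / zero / positive).
int tri_compare(const ring_position& a, const ring_position& b) {
    if (a.token != b.token) return a.token < b.token ? -1 : 1;
    int wa = a.key ? 0 : a.bound;
    int wb = b.key ? 0 : b.bound;
    if (wa != wb) return wa < wb ? -1 : 1;
    if (a.key && b.key && *a.key != *b.key) return *a.key < *b.key ? -1 : 1;
    return 0;
}
```

With this, A = (tok1) positioned before all keys compares strictly less than B = (tok1, key1), so ]A; B] and [A; B] can be distinguished.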
Support for compaction strategy options was recently added.
Previously, we were using default values in compaction strategy for
options, but now we can use the options defined in the schema.
Currently, we only support size-tiered strategy, so let's start
with it.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
All sharded services "should" define a stop method. Calling them is also
a good practice. For this one specifically, though, we will not call stop.
We lack a good way to add a deleter to a shared_ptr class, and that would
be the only reliable way to tie into its lifetime.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
We still want to wrap it instead of writing the column name directly, so we are
able to update the statistics.
It is better to have a separate function for this, because write_column_name
doesn't have enough information to decide when to do what. Augmenting it so
that it could would require passing the schema, or an extra parameter, which
would then spread to all callers.
Keep in mind that testing for an empty clustering key is not enough, since
composite types will serialize the empty clustering key in this case.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Get values from cf->schema instead of using hardcoded threshold
values. In addition, move DEFAULT_MIN_COMPACTION_THRESHOLD and
DEFAULT_MAX_COMPACTION_THRESHOLD to schema.hh so as not to have
knowledge duplicated.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Since parsing involves a unique_ptr<metadata> holding a pointer to a
subclass of metadata, it must define a virtual destructor, or it can
cause memory leaks when deleted, or, with C++14 sized deallocators, it
can cause the wrong memory pool to be used for deleting the object.
Seen on EC2.
Define a virtual destructor to tell the compiler how to destroy
and free the object.
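A minimal illustration of the problem and the fix; the class names are made up, only the shape matches:

```cpp
#include <cassert>
#include <memory>

// Deleting a subclass through unique_ptr<metadata> invokes ~metadata().
// Without "virtual" the derived destructor is skipped, leaking whatever
// the subclass owns; with C++14 sized deallocation, the wrong size can
// also be passed to operator delete. A virtual destructor fixes both.
static int live_objects = 0;

struct metadata {
    virtual ~metadata() = default;  // the fix: a virtual destructor
};

struct parsed_metadata : metadata {
    parsed_metadata() { ++live_objects; }
    ~parsed_metadata() override { --live_objects; }
};

std::unique_ptr<metadata> parse() {
    // Parsing returns the subclass through a base-class pointer.
    return std::make_unique<parsed_metadata>();
}
```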