"I.e. implement storage_proxy::mutate_atomically, which in turn means
roughly the same as mutate, with write/remove from the batchlog table
intermixed.
This patch restructures some stuff in storage_proxy to avoid too much code
duplication, with the assumption (among others) that dead nodes will be few
etc."
Currently, each column family creates a fiber to handle compaction requests
in parallel to the system. If there are N column families, N compactions
could be running in parallel, which is definitely horrible.
To solve that problem, a per-database compaction manager is introduced here.
Compaction manager is a feature used to service compaction requests from N
column families. Parallelism is made available by creating more than one
fiber to service the requests. That being said, N compaction requests will
be served by M fibers.
A submitted compaction request goes to a job queue shared between
all fibers, and the fiber with the fewest pending jobs will be
signalled.
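
The dispatch rule described above can be sketched roughly as follows. This is
a minimal standalone model, not the code in this patch: the class name
compaction_manager_sketch and its members are hypothetical, real fibers and
signalling are replaced by plain per-fiber counters, and ties are assumed to
go to the lowest fiber index.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: M "fibers" serve compaction requests from any number
// of column families through one shared job queue.
class compaction_manager_sketch {
    std::deque<std::string> _jobs;      // shared queue of column family names
    std::vector<std::size_t> _pending;  // pending-job count per fiber
public:
    explicit compaction_manager_sketch(std::size_t fibers) : _pending(fibers, 0) {
        assert(fibers > 0);
    }

    // Index of the fiber with the fewest pending jobs (ties: lowest index).
    std::size_t least_loaded() const {
        std::size_t best = 0;
        for (std::size_t i = 1; i < _pending.size(); ++i) {
            if (_pending[i] < _pending[best]) {
                best = i;
            }
        }
        return best;
    }

    // Queue a request and return the fiber that would be signalled for it.
    std::size_t submit(std::string cf_name) {
        _jobs.push_back(std::move(cf_name));
        std::size_t idx = least_loaded();
        ++_pending[idx];
        return idx;
    }

    // Fiber idx picks up the oldest job from the shared queue.
    std::string take(std::size_t idx) {
        assert(_pending.at(idx) > 0 && !_jobs.empty());
        std::string job = std::move(_jobs.front());
        _jobs.pop_front();
        --_pending[idx];
        return job;
    }

    std::size_t pending(std::size_t idx) const { return _pending.at(idx); }
};
```

With two fibers, three submissions in a row would be assigned to fibers 0, 1
and 0 under this model, so at most M jobs make progress at once regardless
of N.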
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
storage_service is a singleton, and wants a database for initialization.
On the other hand, database is a proper object that is created and
destroyed for each test. As a result storage_service ends up using
a destroyed object.
Work around this by:
- leaking the database object so that storage_service has something
to play with
- doing the second phase of storage_service initialization only once
We will invoke the schema builder from schema_tables.cc, and at that point, the
information about compact storage no longer exists anywhere. If we just call it
like this, it will be the same as calling it with compact_storage::no, which
will trigger a (wrong) recomputation for compact_storage::yes CFs.
The best way to solve that is to make the compact_storage parameter mandatory
every time we create a new table - instead of defaulting to no. This will
ensure that the correct dense and compound calculations are always done when
calling the builder with a parameter, and not done at all when we call it
without a parameter.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
When a schema is available, we use it. However, we have, by now, way too many
tests. Some of them use tables for which we don't even know the schema. It would
have been a massive amount of work to require a schema for all of them - so I am
keeping both constructors around.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Add a test case that triggers the heap overflow fixed in previous commit
("bytes_ostream: Fix current_space_left()").
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
This reverts commit 6b2be41df00bc42331eccd423b7031b345cf979d; tests should
work without a data directory, so let's find why they don't and fix it
instead.
It is needed for db.get_version(). I really hated to pass &db everywhere.
If we had a global helper function like get_local_db(), life would be much
easier.
From Pawel:
This series fixes SELECT DISTINCT statements. Previously, we relied on the
existence of a static row to get proper results. That obviously doesn't work
when there is no static row in the partition. The solution for that is
to introduce a new option to partition_slice: distinct, which informs that
the only important information is static row and whether the partition
exists.
In preparation for adding listener state to migration manager, use
sharded<> for migration manager.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
This patch introduces an init.cc file which hosts all the initialization
code. The benefits are: 1) we can share initialization code with test
code; 2) all the service startup dependency and ordering code is in one
single place instead of everywhere.
"range::is_wrap_around() will not work with current ring_position, because it
relies on total ordering. Same for range::contains(). Currently ring_position
is weakly ordered. This series fixes this problem by making ring_position
totally ordered.
Another problem fixed by this series is handling of wrap-around ranges. In
Origin, ]x; x] is treated as a wrap around range covering whole ring."
If we reverse the type of the clustering key, which is what would happen if we
defined the table with clustering order by (cl desc), we expect the
clustering keys to be in descending order on disk.
There is no work needed for sstables for that to happen. But we should still
verify that this is indeed the case.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
range::is_wrap_around() and range::contains() rely on total ordering
on values to work properly. Current ring_position_comparator was only
imposing a weak ordering (token positions equal to all key positions
with that token).
range::before() and range::after() can't work for weak ordering. If
the bound is exclusive, we don't know if user-provided token position
is inside or outside.
Also, is_wrap_around() can't properly detect wrap around in all
cases. Consider this case:
(1) ]A; B]
(2) [A; B]
For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is
not. Without total ordering between A and B, range::is_wrap_around() can't
tell that.
I think the simplest solution is to define a total ordering on
ring_position by making token positions positioned either before or
after all keys with that token.
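
A minimal sketch of such an ordering, for illustration only. The type
ring_position_sketch, the token_bound enum and the integer token are
hypothetical simplifications, and keys are compared as plain strings here,
which is an assumption rather than what the real comparator does.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical: where a token-only position sits relative to the keys
// that share its token.
enum class token_bound : int { start = -1, end = 1 };

struct ring_position_sketch {
    std::int64_t token;
    std::optional<std::string> key;          // empty => token-only position
    token_bound bound = token_bound::start;  // only meaningful without a key
};

// -1: before all keys with this token, 0: a key, +1: after all keys.
inline int weight(const ring_position_sketch& p) {
    return p.key ? 0 : static_cast<int>(p.bound);
}

// Three-way comparison defining a total order: token first, then the
// before/key/after weight, then the key itself.
inline int compare(const ring_position_sketch& a, const ring_position_sketch& b) {
    if (a.token != b.token) {
        return a.token < b.token ? -1 : 1;
    }
    const int wa = weight(a);
    const int wb = weight(b);
    if (wa != wb) {
        return wa < wb ? -1 : 1;
    }
    if (a.key && b.key) {
        const int c = a.key->compare(*b.key);
        return c < 0 ? -1 : (c > 0 ? 1 : 0);
    }
    return 0;
}
```

Under this sketch a token-only position never compares equal to a key with
the same token. For A = (tok1) and B = (tok1, key1), placing A after all keys
gives compare(A, B) > 0 and placing it before gives compare(A, B) < 0, which
is presumably what lets the two cases above be told apart.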
"This is my current proposal for Compact Storage tables - plus
the needed infrastructure.
Getting rid of the CellName abstraction allows us to simplify
things by quite a lot: now all we need is to mark whether or
not a table is composite, and provide functions to play the
role of the comparator when dealing with the strings."
After commit 67f4b55b16 "gms/gossiper: Fix is_gossip_only_member() logic",
storage_service is needed by gossip.
To fix, start storage_service in the test. Also, improve the
indentation.
"These patches fix more problems related to ORDER BY clauses.
Firstly, mutation_partition::query() can now return rows in reversed
order which makes it easy for select statements to ask for data from
single partition with clustering keys in reversed order even if limit
of rows is set.
That alone is not sufficient, though, if the request contains IN clause
on partition keys and number of returned rows is limited. The information
needed to determine which rows need to be in the reply isn't available
before post-query sort is done, so select statement asks for more rows
than the limit and trims the output later."
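
The "ask for more, sort, then trim" step can be sketched as below. This is an
illustrative model only: row_sketch and sort_and_trim are hypothetical names,
rows carry a single integer clustering value, and a descending order is
assumed for the post-query sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical row: partition it came from and its clustering value.
struct row_sketch {
    int partition;
    int clustering;
};

// Each partition named in the IN clause contributes up to `limit` rows,
// so the combined result may exceed the limit. The rows are merged,
// sorted by clustering value (descending), and only then trimmed.
std::vector<row_sketch> sort_and_trim(
        const std::vector<std::vector<row_sketch>>& per_partition,
        std::size_t limit) {
    std::vector<row_sketch> all;
    for (const auto& rows : per_partition) {
        all.insert(all.end(), rows.begin(), rows.end());
    }
    std::stable_sort(all.begin(), all.end(),
        [](const row_sketch& a, const row_sketch& b) {
            return a.clustering > b.clustering;
        });
    if (all.size() > limit) {
        all.resize(limit);
    }
    return all;
}
```

Trimming before the sort would not work in this model, because which rows
survive depends on the global order across partitions.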
Since the read tests are validated using Origin-generated tables, our
write test will just write to the tables and make sure we can read them
back ok.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
I would like to use them from sstable_datafile_test.cc to make sure that
the tables we write are really correct.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Test the read of a table that is neither compound nor dense.
Dense tables will be handled later.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Technically, this is not needed for the existing schemas: the builder is only
necessary when we are setting properties. For consistency, however, let's
convert them all.
Soon we will have some schemas that will set properties. In particular, compact
storage.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>