when_all() tests generate two million calls to BOOST_REQUIRE(), which
overwhelms the test result parser. Replace with two calls and use all_of()
to process the result array.
Includes adaptation by Nadav for the removal of file_input_stream:
sstables.cc used file_input_stream, which we replaced by the new
make_file_input_stream. We also couldn't make sstables.cc read either
a file_input_stream or the planned compressed_file_input_stream.
So in this patch we implement an API similar to the old "file_input_stream"
based on the new make_file_input_stream. file_input_stream now has a parent
class "random_access_reader", preparing for a future patch to support both
file_input_stream and compressed_file_input_stream in the same code - by making
all the parsers take a random_access_reader reference instead of file_input_stream.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
The file_input_stream interface was messy: it was not fiber safe (e.g., code
doing seek() in the middle of an ongoing read_exactly()), and went against
the PIMPL philosophy.
So this patch removes the file_input_stream class, and replaces it with a
completely different design:
We now have in fstream.hh a global function:
input_stream<char>
make_file_input_stream(
lw_shared_ptr<file> file, uint64_t offset = 0,
uint64_t buffer_size = 8192);
In other words, instead of "seeking" in an input stream, we just open a new
input stream object at a particular offset of the given file. Multiple input
streams might be concurrently active on the same file.
Note how make_file_input_stream now returns a regular "input_stream", not a
subtype, and it can be used just like any normal input_stream to read the stream
starting at the given position.
This patch makes "input_stream" a "final" type: we no longer subclass it in our
code, and we shouldn't in the future because it goes against the PIMPL design
(the subclass should be of the inner workings, like the data_source_impl, not
of input_stream).
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
This patch adds some of the common functionalities from std:string to
sstring.
It adds length (implement by size() )
It define the constant npos to indicate no possition.
It adds the at (reference and const reference)
It define the find char and find sstring methods
and the substr method
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
need merge sstring
engine() cannot be called before the local_engine was constructed because it
dereferences the pointer local_engine to create a reference.
Consequently, the following error can be seen while running fileiotest:
./core/reactor.hh:854:13: runtime error: reference binding to null pointer of
type 'struct reactor' ASAN:SIGSEGV
Let's switch the test to use app_template.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Implements a cassandra-file-compatible segmented log
of "mutations". Handles "batch" and "periodic" mode like
stock version. Also includes "dirty" management for the
interaction log/memtable.
Supports:
* add
* sync/flush
* clear dirty bits (thus discarding segments)
Many more estoric stock functions not yet implemented.
Missing: Storage management. Does not deal with total
size on disk of segments yet. Nor does it have any provisions
for dealing with active buffer bloat should async writes stall.
[avi: adjust for future<>::rescue() removal]
Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
Basic explanation of the reclaimer algorithm:
- Each slab page has a descriptor containing information about it, such as
refcnt, vector of free objects, link into LRU, etc.
- The LRU list will only contain slab pages which items are unused, so as
to make the reclaiming process faster and easier. Maintaining the LRU of slab
pages has a performance penalty of ~1.3%. Shlomi suggested an approach where
LRU would no longer exist and timestamp would be used instead to keep track of
recency. Reclaimer would then iterate through all slab pages checking for an
unused slab page with the lowest timestamp.
- Reclaimer will get the least-recently-used slab page from the LRU list,
do all the management stuff required, and iterate through the page erasing any
of the items there contained. Once reclaimer was called, it's likely that slab
memory usage is calibrated, thus slab pages shouldn't be allocated anymore.
- Reclaimer is enabled by default but can be disabled by specifying the slab
size using the application parameter --max-slab-size.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
We use bytes for many different things, and it is easy to get confused as
to what format the data is actually in.
Fix that for atomic_cell by proving wrappers. atomic_cell::one corresponds
to a bytes object holding exactly one atomic cell, and atomic_cell::view is
a bytes_view to an atomic_cell. The static functions of atomic_cell itself
are privatized to prevent the unwashed masses from using them on the wrong
objects.
Since a row entry can hold either a an atomic cell, or a collection,
depending on the schema, also introduce a variant type
atomic_cell_or_collection and allow the user to pick the type explicitly.
Internally both are stored as bytes object.
Tomek points out that:
Origin calls org.apache.cassandra.utils.Hex#hexToBytes here, which is
not what to_bytes() does. BytesType.getSerializer().toString() calls
ByteBufferUtil.bytesToHex(value), so you should call to_hex() here.
Fix that up.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Storing cells as boost::any objects makes us use expensive
boost::any_cast to access the data. This change replaces boost::any
with bytes object which holds the value in serialized form (the same
as will be used for on-wire format).
If the cell type is atomic, you use fields accessors defined in
atomic_cell class, eg like this:
if (column.type.is_atomic()) {
if (atomic_cell::is_live(c) {
auto timestamp = atomic_cell::timestamp(c);
...
}
}
Eventually we could switch to a more officient semi-serialized form
with native byte order but I don't want to introduce it just yet for
simplicity.
This tests the sstable parsing code. The tables provided as exemple are real
tables, that I was using to test my code manually.
The corrupt tables, however, are corrupted by hand.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
It is sometimes frustrating to use open_file_dma, because it has the hardcoded
behavior of always assuming O_CREAT. Sometimes this is not desirable, and it
would be nice to have the option not to do so.
Note that, by design, I am only including in the open_flags enum things that we
want the user of the API to control. Stuff like O_DIRECT should not be
optional, and therefore is not included in the visible interface.
Because of that I am changing the function signature to include a paramater
that specifies whether or not we should create the file if it does not exist.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Add a performance test case for CQL statement parsing to better
understand its performance impact. We also include ANTLR tokenizer and
parser setup as that's what we do in query_processor for each request.
Running the test on my Haswell machine yields the following results:
[penberg@nero urchin]$ build/release/tests/perf/perf_cql_parser
Timing CQL statement parsing...
108090.10 tps
125366.11 tps
124400.64 tps
124274.75 tps
124850.85 tps
That means that CQL parsing alone sets an upper limit of 120k requests
per second for Urchin for a single core.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Add database::shard_of() to compute the shard hosting the partition
(with a simplistic algorithm, but perhaps not too bad).
Convert non-metadata invoke_on_all() and local calls on the database
to use shard_of().