engine() cannot be called before the local_engine was constructed because it
dereferences the pointer local_engine to create a reference.
Consequently, the following error can be seen while running fileiotest:
./core/reactor.hh:854:13: runtime error: reference binding to null pointer of
type 'struct reactor' ASAN:SIGSEGV
Let's switch the test to use app_template.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Implements a cassandra-file-compatible segmented log
of "mutations". Handles "batch" and "periodic" mode like
stock version. Also includes "dirty" management for the
interaction log/memtable.
Supports:
* add
* sync/flush
* clear dirty bits (thus discarding segments)
Many more estoric stock functions not yet implemented.
Missing: Storage management. Does not deal with total
size on disk of segments yet. Nor does it have any provisions
for dealing with active buffer bloat should async writes stall.
[avi: adjust for future<>::rescue() removal]
Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
Basic explanation of the reclaimer algorithm:
- Each slab page has a descriptor containing information about it, such as
refcnt, vector of free objects, link into LRU, etc.
- The LRU list will only contain slab pages which items are unused, so as
to make the reclaiming process faster and easier. Maintaining the LRU of slab
pages has a performance penalty of ~1.3%. Shlomi suggested an approach where
LRU would no longer exist and timestamp would be used instead to keep track of
recency. Reclaimer would then iterate through all slab pages checking for an
unused slab page with the lowest timestamp.
- Reclaimer will get the least-recently-used slab page from the LRU list,
do all the management stuff required, and iterate through the page erasing any
of the items there contained. Once reclaimer was called, it's likely that slab
memory usage is calibrated, thus slab pages shouldn't be allocated anymore.
- Reclaimer is enabled by default but can be disabled by specifying the slab
size using the application parameter --max-slab-size.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
We use bytes for many different things, and it is easy to get confused as
to what format the data is actually in.
Fix that for atomic_cell by proving wrappers. atomic_cell::one corresponds
to a bytes object holding exactly one atomic cell, and atomic_cell::view is
a bytes_view to an atomic_cell. The static functions of atomic_cell itself
are privatized to prevent the unwashed masses from using them on the wrong
objects.
Since a row entry can hold either a an atomic cell, or a collection,
depending on the schema, also introduce a variant type
atomic_cell_or_collection and allow the user to pick the type explicitly.
Internally both are stored as bytes object.
Tomek points out that:
Origin calls org.apache.cassandra.utils.Hex#hexToBytes here, which is
not what to_bytes() does. BytesType.getSerializer().toString() calls
ByteBufferUtil.bytesToHex(value), so you should call to_hex() here.
Fix that up.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Storing cells as boost::any objects makes us use expensive
boost::any_cast to access the data. This change replaces boost::any
with bytes object which holds the value in serialized form (the same
as will be used for on-wire format).
If the cell type is atomic, you use fields accessors defined in
atomic_cell class, eg like this:
if (column.type.is_atomic()) {
if (atomic_cell::is_live(c) {
auto timestamp = atomic_cell::timestamp(c);
...
}
}
Eventually we could switch to a more officient semi-serialized form
with native byte order but I don't want to introduce it just yet for
simplicity.
This tests the sstable parsing code. The tables provided as exemple are real
tables, that I was using to test my code manually.
The corrupt tables, however, are corrupted by hand.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
It is sometimes frustrating to use open_file_dma, because it has the hardcoded
behavior of always assuming O_CREAT. Sometimes this is not desirable, and it
would be nice to have the option not to do so.
Note that, by design, I am only including in the open_flags enum things that we
want the user of the API to control. Stuff like O_DIRECT should not be
optional, and therefore is not included in the visible interface.
Because of that I am changing the function signature to include a paramater
that specifies whether or not we should create the file if it does not exist.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Add a performance test case for CQL statement parsing to better
understand its performance impact. We also include ANTLR tokenizer and
parser setup as that's what we do in query_processor for each request.
Running the test on my Haswell machine yields the following results:
[penberg@nero urchin]$ build/release/tests/perf/perf_cql_parser
Timing CQL statement parsing...
108090.10 tps
125366.11 tps
124400.64 tps
124274.75 tps
124850.85 tps
That means that CQL parsing alone sets an upper limit of 120k requests
per second for Urchin for a single core.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Add database::shard_of() to compute the shard hosting the partition
(with a simplistic algorithm, but perhaps not too bad).
Convert non-metadata invoke_on_all() and local calls on the database
to use shard_of().
s/database/distributed<database>/ everywhere.
Use simple distribution rules: writes are broadcast, reads are local.
This causes tremendous data duplication, but will change soon.
It is based on tcp_client and works with our httpd server.
1) timer based, to run the test for 10 seconds
$ http_client --server 192.168.66.100:10000 --conn 100 --duration 10 --smp 2
========== http_client ============
Server: 192.168.66.100:10000
Connections: 100
Requests/connection: dynamic (timer based)
Requests on cpu 0: 33400
Requests on cpu 1: 33368
Total cpus: 2
Total requests: 66768
Total time: 10.011478
Requests/sec: 6669.145442
========== done ============
2) nr of reqs per connection based, to run the test with 100 connections
each has to run 1000 reqs
$ http_client --server 192.168.66.100:10000 --conn 100 --reqs 1000 --smp 2
========== http_client ============
Server: 192.168.66.100:10000
Connections: 100
Requests/connection: 1000
Requests on cpu 0: 50000
Requests on cpu 1: 50000
Total cpus: 2
Total requests: 100000
Total time: 15.002731
Requests/sec: 6665.453192
========== done ============
This patch is based on Shlomi's initial version.
Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
Signed-off-by: Asias He <asias@cloudius-systems.com>