Allocate exactly the available fragment size in order to catch buffer
overflows.
We get similar behaviour in dpdk, since without huge pages, it must copy
the packet into a newly allocated buffer.
Two bugs:
1. get_header<type>(offset) was called with the size passed as the offset.
2. opt_end failed to account for the mandatory TCP header, and thus was
20 bytes too large, resulting in an overflow.
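A minimal sketch of the arithmetic behind the second bug, assuming the
header length comes from the 4-bit data-offset field (counted in 32-bit
words). The names tcp_option_range and tcp_options are hypothetical and
are not taken from the patch; the point is only that the options area
starts after the 20 mandatory bytes and ends at the header length, not
20 bytes beyond it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size of the mandatory (option-free) TCP header, in bytes.
constexpr std::size_t tcp_min_header_len = 20;

// Hypothetical result type: byte offsets [begin, end) of the options area.
struct tcp_option_range {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: compute where TCP options start and end, given the
// offset of the TCP header and its data-offset field (in 32-bit words).
inline tcp_option_range tcp_options(std::size_t header_start,
                                    std::uint8_t data_offset_words) {
    std::size_t header_len = std::size_t(data_offset_words) * 4;
    if (header_len < tcp_min_header_len) {
        header_len = tcp_min_header_len;  // malformed offset: treat as "no options"
    }
    // The end is header_start + header_len. Adding header_len to the *start
    // of the options* instead would overshoot by the 20 mandatory bytes.
    return { header_start + tcp_min_header_len, header_start + header_len };
}
```

With a data offset of 8 words (32-byte header), the options occupy 12
bytes, not 32.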
SFINAE only applies to failures in template parameter substitution, not to
any compilation error (or it would be called CEINAE); therefore hash<T> for
enums will fail to compile when given a non-enum, rather than being ignored.
It's not possible to specialize hash<> for enums, since the primary template
does not have an extra Enable template argument for use with enable_if. We
therefore rename it to enum_hash<> and require users to explicitly define
hash<MyEnum> as inheriting from it.
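The resulting usage pattern might look roughly like the sketch below. It
assumes hash<> here means std::hash<>; the enum color and the body of
enum_hash<> are illustrative and not copied from the patch. Note that
standard libraries which already hash enums out of the box make the
explicit specialization redundant.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <unordered_map>

// Generic hasher for enums: hashes the underlying integer value.
// Not a specialization of std::hash, so no enable_if trickery is needed.
template <typename T>
struct enum_hash {
    static_assert(std::is_enum<T>::value, "enum_hash<T> requires an enum type");
    std::size_t operator()(T e) const noexcept {
        using utype = typename std::underlying_type<T>::type;
        return std::hash<utype>()(static_cast<utype>(e));
    }
};

// Hypothetical user enum.
enum class color { red, green, blue };

// The user opts in explicitly, one enum at a time.
namespace std {
template <>
struct hash<color> : enum_hash<color> {};
}

// Usage: the enum can now key an unordered container.
inline int lookup_demo() {
    std::unordered_map<color, int> m;
    m[color::green] = 7;
    return m[color::green];
}
```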
We currently only signal eof for consume() users. If one is calling
read_exactly, eof will never be signalled.
This might make some sense for continuous streams, but especially when
dealing with files, eof is a natural part of life that can and will
happen all the time. Every "read-until-finish" file-loop will at some
point rely on eof.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
boost::split() expects either a NUL terminated string or a proper container.
We give it neither.
Fix by wrapping the buffer in a string_view, which tells split() what size
the string is.
Storing cells as boost::any objects forces us to use the expensive
boost::any_cast to access the data. This change replaces boost::any
with a bytes object which holds the value in serialized form (the same
form that will be used for the on-wire format).
If the cell type is atomic, you use the field accessors defined in the
atomic_cell class, e.g. like this:
    if (column.type.is_atomic()) {
        if (atomic_cell::is_live(c)) {
            auto timestamp = atomic_cell::timestamp(c);
            ...
        }
    }
Eventually we could switch to a more efficient semi-serialized form
with native byte order, but I don't want to introduce it just yet, for
simplicity.
Since types can be embedded at any position in memory we cannot assume
alignment.
Side note: It seems that on x86 access to the variable via packed<>
does not result in any extra instructions.
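For illustration, a wrapper of this kind can be sketched as below. This
is a memcpy-based variant written for portability, not the project's
actual packed<> definition; the wire_header struct is hypothetical.
Compilers typically turn the fixed-size memcpy into a plain load or
store on x86, which is consistent with the side note above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Holds a T as raw bytes, so the wrapper has alignment 1 and can sit at
// any byte offset. Reads and writes go through memcpy.
template <typename T>
class packed {
    static_assert(std::is_trivially_copyable<T>::value,
                  "packed<T> requires a trivially copyable type");
    unsigned char bytes[sizeof(T)];
public:
    packed() = default;
    packed(T v) { std::memcpy(bytes, &v, sizeof(T)); }
    operator T() const {
        T v;
        std::memcpy(&v, bytes, sizeof(T));
        return v;
    }
};

// Hypothetical on-wire layout: a 32-bit field at offset 1.
struct wire_header {
    std::uint8_t kind;
    packed<std::uint32_t> length;
};

static_assert(alignof(packed<std::uint64_t>) == 1, "no alignment requirement");
static_assert(sizeof(wire_header) == 5, "no padding inserted");
static_assert(offsetof(wire_header, length) == 1, "field is unaligned");
```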
Sstable support from Glauber:
"These are the new sstable parsers, based on recursive templates.
They currently read the following files successfully:
- Filter
- Statistics
- Compression
- Summary (sans the indices, which are schema dependent)
The TOC file is also parsed, but slightly different because it contains
a newline separated list of strings. The CRC and Digest files are not
yet read, but they are only used for consistency checking, which can wait.
Aside from that, only missing the Data and Index files themselves."
This tests the sstable parsing code. The tables provided as examples are real
tables that I was using to test my code manually.
The corrupt tables, however, are corrupted by hand.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
This is a parser for the sstable files based on template recursion.
With this, one can easily extend it to parse any complex data structure
by writing
parse(in, struct.a, struct.b ... );
This patch contains the most basic types used during parsing as building
blocks.
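A rough sketch of the idea, not the code in this patch: byte_reader and
demo_header are hypothetical, the input is an in-memory buffer rather
than a file stream, and integers are assumed to be stored big-endian.
The variadic overload peels off one field at a time and recurses on the
rest, so a struct parser reduces to a single parse() call listing its
members.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <type_traits>

// Hypothetical input cursor over an in-memory buffer.
struct byte_reader {
    const unsigned char* pos;
    const unsigned char* end;
};

// Building block: read one big-endian unsigned integer.
template <typename T>
typename std::enable_if<std::is_unsigned<T>::value>::type
parse(byte_reader& in, T& out) {
    if (static_cast<std::size_t>(in.end - in.pos) < sizeof(T)) {
        throw std::runtime_error("truncated input");
    }
    T v = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        v = static_cast<T>((v << 8) | *in.pos++);
    }
    out = v;
}

// Recursive case: parse the first field, then the remaining ones.
template <typename First, typename Second, typename... Rest>
void parse(byte_reader& in, First& first, Second& second, Rest&... rest) {
    parse(in, first);
    parse(in, second, rest...);
}

// Hypothetical composite structure and its parser.
struct demo_header {
    std::uint32_t a;
    std::uint32_t b;
    std::uint64_t c;
};

inline void parse(byte_reader& in, demo_header& h) {
    parse(in, h.a, h.b, h.c);
}
```

Nested structures compose the same way: once parse() exists for a type,
that type can appear as a field in another parse(in, ...) call.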
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
This represents an sstable with its multiple on-disk files. We don't yet
read any of them, though.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
boost::join() provided by boost/algorithm/string.hpp conflicts with
boost::join() from boost/range/join.hpp. It looks like a boost issue
but let's not pollute the namespace unnecessarily.
Regarding the change in configure.py, it looks like scollectd.cc is
part of the 'core' package, but it needs 'net/api.hh', so I added
'net/net.cc' to core.
It is sometimes frustrating to use open_file_dma, because it has the hardcoded
behavior of always assuming O_CREAT. Sometimes this is not desirable, and it
would be nice to have the option not to do so.
Note that, by design, I am only including in the open_flags enum things that we
want the user of the API to control. Stuff like O_DIRECT should not be
optional, and therefore is not included in the visible interface.
Because of that I am changing the function signature to include a parameter
that specifies whether or not we should create the file if it does not exist.
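As a sketch of the intended split between caller-controlled and
internal flags (the enumerator names and the to_posix_flags helper are
assumptions for illustration, not the interface added by this patch):

```cpp
#include <cassert>
#include <fcntl.h>

// Only the flags the API user is meant to control.
enum class open_flags : int {
    rw = O_RDWR,
    ro = O_RDONLY,
    wo = O_WRONLY,
    create = O_CREAT,
};

constexpr open_flags operator|(open_flags a, open_flags b) {
    return static_cast<open_flags>(static_cast<int>(a) | static_cast<int>(b));
}

// Hypothetical helper: translate to the flags passed to open(2).
// O_DIRECT is added unconditionally where the platform defines it; the
// caller cannot turn it off.
inline int to_posix_flags(open_flags f) {
    int flags = static_cast<int>(f);
#ifdef O_DIRECT
    flags |= O_DIRECT;
#endif
    return flags;
}
```

A caller that wants the old behaviour passes open_flags::rw |
open_flags::create; omitting create makes opening a missing file fail.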
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Only partially translated. I had to comment out some "static"
specifications to avoid compiler warnings because these are not used yet.
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Add a performance test case for CQL statement parsing to better
understand its performance impact. We also include ANTLR tokenizer and
parser setup as that's what we do in query_processor for each request.
Running the test on my Haswell machine yields the following results:
[penberg@nero urchin]$ build/release/tests/perf/perf_cql_parser
Timing CQL statement parsing...
108090.10 tps
125366.11 tps
124400.64 tps
124274.75 tps
124850.85 tps
That means that CQL parsing alone sets an upper limit of roughly 125k
requests per second per core for Urchin.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>