This patchset adds a simple metadata and data store, and wires the
system_add_keyspace RPC to set it up.
With this, cassandra-stress is able to initialize the database (but not
store anything in it).
Simplistic database using std::map<> to hold rows, and boost::any to
hold values.
Supports:
- multiple key spaces
- multiple column families
- a few data types
Does not support:
- container data types
- secondary indexes
- composites
- validators
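A minimal sketch of the layout described above, for illustration only. All
type and member names are hypothetical, and std::any (C++17) stands in for
boost::any so the example builds without Boost:

```cpp
#include <any>
#include <cassert>
#include <map>
#include <string>

// Hypothetical names throughout; std::any stands in for boost::any.
using row = std::map<std::string, std::any>;   // column name -> value

struct column_family {
    std::map<std::string, row> rows;           // row key -> row
};

struct keyspace {
    std::map<std::string, column_family> column_families;
};

struct database {
    std::map<std::string, keyspace> keyspaces;

    // Creates the keyspace if absent; returns false if it already existed.
    bool add_keyspace(const std::string& name) {
        return keyspaces.emplace(name, keyspace{}).second;
    }
};
```

Values of different types can then live in the same row and are read back
with std::any_cast<T>().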
The byte-order functions were changed not to do in-place conversions,
but they still accept non-const inputs, although they do not modify them.
This can make them harder to use in some cases.
Fix by marking the inputs const.
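A small sketch of why the const matters. hton32() and ntoh32() are
hypothetical stand-ins, not the actual functions touched by this patch; the
point is only that a const reference parameter accepts const objects and
temporaries:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns x converted to network (big-endian) order.
// The input is const, so const objects and temporaries can be passed.
inline uint32_t hton32(const uint32_t& x) {
    const unsigned char bytes[4] = {
        static_cast<unsigned char>(x >> 24),
        static_cast<unsigned char>(x >> 16),
        static_cast<unsigned char>(x >> 8),
        static_cast<unsigned char>(x),
    };
    uint32_t out;
    std::memcpy(&out, bytes, sizeof(out));
    return out;
}

// Hypothetical inverse: interprets x as big-endian, returns host order.
inline uint32_t ntoh32(const uint32_t& x) {
    unsigned char bytes[4];
    std::memcpy(bytes, &x, sizeof(x));
    return (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
           (uint32_t(bytes[2]) << 8) | uint32_t(bytes[3]);
}
```

With a non-const reference parameter, a call such as hton32(0x12345678u) or
a call on a const variable would not compile.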
Provide a function that maps a packet's RSS hash to the CPU that should
handle it. This function is needed to find an appropriate source port for
an outgoing TCP/UDP connection. Also use it when forwarding a de-fragmented
IP packet, which avoids one extra hop.
Add a space after the "Checking link status" message to prevent it from
merging with "done" if the link is up immediately.
For instance this is going to be the case for a VF
of a PF with already established link (e.g. on AWS).
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
We assume that the Rx IPv4, TCP and UDP checksum offload features are either
all supported or all unsupported. The same applies to the Tx UDP and TCP
checksum offload features.
Add an assert that checks this assumption.
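The check could look roughly like the sketch below. The capability bits and
function names are hypothetical placeholders, not the DPDK offload flags:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capability bits, standing in for the device's offload flags.
constexpr uint32_t rx_ipv4_csum = 1u << 0;
constexpr uint32_t rx_udp_csum  = 1u << 1;
constexpr uint32_t rx_tcp_csum  = 1u << 2;

// True if the bits in `group` are either all set or all clear in `caps`.
constexpr bool all_or_none(uint32_t caps, uint32_t group) {
    return (caps & group) == 0 || (caps & group) == group;
}

// Asserts the "all together" assumption for the Rx checksum features.
inline void check_rx_csum_caps(uint32_t caps) {
    assert(all_or_none(caps, rx_ipv4_csum | rx_udp_csum | rx_tcp_csum));
}
```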
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Even if a port has a single queue, we still want the RSS feature to be
available so that the HW calculates the RSS hash for us.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
DPDK 1.8 provides per-device default Tx and Rx queue configurations in the output
of rte_eth_dev_info_get(). Use them instead of the hardcoded values tuned for ixgbe.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Rename: init_port() -> init_port_start().
- Add a function init_port_fini() that holds the code originally inlined in
init_local_queue().
- Move the link state check to init_port_fini(), since the link state should
be checked after the port has been started.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
1) Make the --dpdk-pmd parameter a flag instead of a (key, value) pair.
2) Fall back to the default DPDK hugetlbfs settings when --hugepages is not
given and --dpdk-pmd is set.
This makes for a friendlier user experience in general, and in particular when one
doesn't want to provide a --hugepages parameter.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Move the smp::dpdk_eal_init() code into dpdk::eal::init(), where it belongs.
- Remove the unused "opts" parameter of the dpdk::dpdk_device constructor; all its usage
has been moved to dpdk::eal::init().
- Clean up reactor.cc: #if HAVE_DPDK -> #ifdef HAVE_DPDK, since we pass the -DHAVE_DPDK
option to the compiler.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
This patchset adds support for an asynchronous thrift service serving the
Cassandra interface.
The implementation uses the "continuation object style" thrift code generation
option, which can be readily adapted to the promise/future interface.
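To illustrate the adaptation, here is a toy sketch. ready_future is a
made-up, always-ready stand-in (not Seastar's future<>),
describe_version_impl() is a hypothetical helper, and the handler signature
only approximates what the thrift generator emits in this style:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Toy future that is always ready; a stand-in, not Seastar's future<>.
template <typename T>
class ready_future {
    T _value;
public:
    explicit ready_future(T value) : _value(std::move(value)) {}
    // Runs the continuation immediately with the stored value.
    template <typename Func>
    void then(Func&& func) {
        std::forward<Func>(func)(std::move(_value));
    }
};

// Hypothetical future-returning implementation of one RPC.
ready_future<std::string> describe_version_impl() {
    return ready_future<std::string>("placeholder-version");
}

// Continuation-object-style entry point: the result is delivered by
// invoking `cob` once the future resolves.
void describe_version(std::function<void(const std::string&)> cob) {
    describe_version_impl().then([cob = std::move(cob)](std::string v) {
        cob(v);
    });
}
```

Because the handler only has to call the continuation object at some later
point, chaining it onto a future's continuation is a natural fit.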
With this, one can connect with a client and see this:
$ ./cassandra-stress insert -node localhost
Exception in thread "main" java.lang.RuntimeException: org.apache.thrift.TApplicationException: sorry, not implemented
at org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:139)
at org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:109)
at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpacesThrift(SettingsSchema.java:112)
at org.apache.cassandra.stress.settings.SettingsSchema.createKeySpaces(SettingsSchema.java:60)
at org.apache.cassandra.stress.settings.StressSettings.maybeCreateKeyspaces(StressSettings.java:200)
at org.apache.cassandra.stress.StressAction.run(StressAction.java:57)
at org.apache.cassandra.stress.Stress.main(Stress.java:109)
Caused by: org.apache.thrift.TApplicationException: sorry, not implemented
at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
at org.apache.cassandra.thrift.Cassandra$Client.recv_set_cql_version(Cassandra.java:1896)
at org.apache.cassandra.thrift.Cassandra$Client.set_cql_version(Cassandra.java:1883)
at org.apache.cassandra.stress.settings.StressSettings.getRawThriftClient(StressSettings.java:128)
... 6 more
As thrift does not support pipelining, the server is very simple. It
implements the thrift framed transport, where each message is preceded
by a four-byte message size header.
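A sketch of the framing logic, assuming the size header is a big-endian
32-bit integer, as in the thrift framed transport. extract_frame() is a
hypothetical helper, not the server's actual code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Tries to extract one complete frame from the front of `buf`.
// Returns the payload and consumes it from `buf`, or returns nullopt
// (leaving `buf` untouched) if the frame has not fully arrived yet.
std::optional<std::string> extract_frame(std::string& buf) {
    constexpr std::size_t header_size = 4;
    if (buf.size() < header_size) {
        return std::nullopt;
    }
    // Decode the big-endian four-byte message size.
    uint32_t size = 0;
    for (std::size_t i = 0; i < header_size; ++i) {
        size = (size << 8) | static_cast<unsigned char>(buf[i]);
    }
    if (buf.size() - header_size < size) {
        return std::nullopt;
    }
    std::string payload = buf.substr(header_size, size);
    buf.erase(0, header_size + size);
    return payload;
}
```

Since there is no pipelining, the server can process one such frame, send
the reply, and only then look for the next header.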
Where possible, throw an exception instead of returning an uninitialized
value.
Where not possible (if the method does not throw), return a "dummy" string.
Support adding a thrift file as a source. Since thrift generates multiple
output files, whose names cannot be trivially derived from the source file
name, we have to specify it as an object containing the source file name
and any additional information needed to derive the generated file names
(in this case, the generated thrift services).
The generic thrift headers bring in a #define (yuch) named VERSION, while the
Cassandra interface also defines a symbol with the same name.
Rename the symbol to avoid a compile conflict.
DPDK initialization creates its own threads and assumes that the application
uses them; otherwise things do not work correctly (for instance,
rte_lcore_id() returns an incorrect value). This patch uses the DPDK threads
to run the seastar main loop, making DPDK APIs work as expected.
register_poller() (and unregister_poller()) adjusts _pollers, but it may be
called while iterating it, and since std::vector<> mutations invalidate
iterators, corruption occurs.
Fix by deferring manipulation of _pollers into a task, which is executed at
a time when _pollers is not being touched.
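The idea can be sketched with a toy reactor. mini_reactor and its members
are hypothetical and much simpler than the real reactor; the sketch only
shows mutations being queued and applied outside the iteration:

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy reactor with hypothetical names; not the Seastar implementation.
struct mini_reactor {
    using poll_fn = std::function<void()>;
    std::vector<poll_fn*> pollers;
    std::deque<std::function<void()>> pending_tasks;

    // Both calls only queue a task; `pollers` is not modified here.
    void register_poller(poll_fn* p) {
        pending_tasks.push_back([this, p] { pollers.push_back(p); });
    }
    void unregister_poller(poll_fn* p) {
        pending_tasks.push_back([this, p] {
            pollers.erase(std::remove(pollers.begin(), pollers.end(), p),
                          pollers.end());
        });
    }

    void run_once() {
        // Pollers may call register_poller()/unregister_poller() safely,
        // because the vector is not mutated during this loop.
        for (poll_fn* p : pollers) {
            (*p)();
        }
        // Apply the deferred mutations now that iteration is over.
        while (!pending_tasks.empty()) {
            auto task = std::move(pending_tasks.front());
            pending_tasks.pop_front();
            task();
        }
    }
};
```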
Currently, reactor::_pollers holds reactor::poller pointers; since these
are movable types, it's hard to maintain _pollers, as the pointers can keep
changing.
Refactor poller so that _pollers points at an internal type, which does not
move when a reactor::poller moves. This requires getting rid of
std::function, since it lacks a comparison operator.
When we have an object acting as resource guard for memory, we can convert
it into a deleter using
make_deleter([obj = std::move(obj)] {})
Introduce a simpler interface
make_object_deleter(std::move(obj))
for doing the same thing.
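A simplified model of the two interfaces, to show that they are equivalent.
The deleter type here is a hypothetical stand-in built on std::unique_ptr,
not Seastar's deleter class:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified stand-in for a type-erased deleter.
struct deleter_base {
    virtual ~deleter_base() = default;
};
using deleter = std::unique_ptr<deleter_base>;

// Runs `func` when the deleter is destroyed, then destroys its captures.
template <typename Func>
struct lambda_deleter final : deleter_base {
    Func func;
    explicit lambda_deleter(Func f) : func(std::move(f)) {}
    ~lambda_deleter() override { func(); }
};

// Just keeps `obj` alive until the deleter is destroyed.
template <typename T>
struct object_deleter final : deleter_base {
    T obj;
    explicit object_deleter(T o) : obj(std::move(o)) {}
};

template <typename Func>
deleter make_deleter(Func func) {
    return std::make_unique<lambda_deleter<Func>>(std::move(func));
}

template <typename T>
deleter make_object_deleter(T obj) {
    return std::make_unique<object_deleter<T>>(std::move(obj));
}
```

In both cases the guarded object is released when the deleter goes away; the
second form simply avoids the empty lambda.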
Some (all?) RSS-capable HW provides us with the hash that was used to
select the rx queue the packet was delivered to. If such a hash is available,
it is better to use it to forward the packet instead of calculating the hash
ourselves and suffering cache misses.
This patch introduces logic to divide cpus between the available hw queue
pairs. Each cpu with a hw qp gets a set of cpus to distribute traffic
to. The algorithm doesn't take any topology considerations into account yet.
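One topology-unaware way to do such a division is plain round-robin, sketched
below. This is an assumption made for illustration and not necessarily the
scheme used by the patch; divide_cpus() is a hypothetical name:

```cpp
#include <cassert>
#include <vector>

// Returns, for each hw queue pair, the cpus it distributes traffic to.
// Assumption: cpu q owns queue q, and all cpus are dealt out round-robin,
// so each owner's set also contains the owner itself.
std::vector<std::vector<unsigned>> divide_cpus(unsigned nr_cpus,
                                               unsigned nr_queues) {
    assert(nr_queues > 0 && nr_queues <= nr_cpus);
    std::vector<std::vector<unsigned>> sets(nr_queues);
    for (unsigned cpu = 0; cpu < nr_cpus; ++cpu) {
        sets[cpu % nr_queues].push_back(cpu);
    }
    return sets;
}
```

For example, with 5 cpus and 2 queue pairs, cpu 0 would distribute to cpus
0, 2 and 4, and cpu 1 to cpus 1 and 3.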
Instead of having forward() decide the packet's destination, make it collect
the input for the RSS hash function depending on the packet type. After the data
is collected, use the Toeplitz hash function to calculate the packet's destination.
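For reference, a straightforward bit-at-a-time sketch of the Toeplitz hash.
The function name and the vector-based interface are assumptions made for
this example; the collected input (for instance addresses and ports) would be
passed as `data`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toeplitz hash of `data` under `key`.
// Requires key.size() >= data.size() + 4.
uint32_t toeplitz_hash(const std::vector<uint8_t>& key,
                       const std::vector<uint8_t>& data) {
    assert(key.size() >= data.size() + 4);
    uint32_t hash = 0;
    // 32-bit window over the key, initially its first four bytes.
    uint32_t window = (uint32_t(key[0]) << 24) | (uint32_t(key[1]) << 16) |
                      (uint32_t(key[2]) << 8) | uint32_t(key[3]);
    for (std::size_t i = 0; i < data.size(); ++i) {
        for (int bit = 7; bit >= 0; --bit) {
            // Every set input bit XORs in the current key window.
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            // Slide the window one bit to the left along the key.
            window = (window << 1) | ((key[i + 4] >> bit) & 1u);
        }
    }
    return hash;
}
```

The resulting 32-bit value can then be mapped to a destination cpu, for
example through a lookup table indexed by its low bits.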