Currently, the code is using bytes_opt and bytes_view_opt to represent
CQL values, which can hold a value or null. In preparation for
supporting a third state, unset value introduced in CQL v4, introduce
new raw_value and raw_value_view types and use them instead.
The new types are based on boost::variant<> and are capable of holding
null, unset values, and blobs that represent a value.
Benoît Canet points out that CQL messages are not always compressed
although compression is enabled by the driver. Turns out our CQL
compression negotiation is broken. We need to negotiate compression upon
STARTUP message and not rely on the incoming request to have the
compression bit enabled.
Fixes#1680
Message-Id: <1474366693-3001-1-git-send-email-penberg@scylladb.com>
In names of functions and variables:
s/flush_/write_/
s/store_/write_/
In a i_tracing_backend_helper:
s/flush()/kick()/
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Store a trace state inside a client_state.
- Start tracing in a cql_server::connection::process_query().
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Add a tracing ID (UUID) optional field to cql_server::response.
- If _tracing_id is set make_frame() would insert a tracing ID
in the response message. According to CQL spec it should be the
first thing in the response "body" and the TRACING bit (0x02) should be
set in the "flags" field.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
The default CQL frame compression algorithm in Cassandra is LZ4. Add
support for decompressing incoming frames and compressing outgoing
frames with LZ4 if the CQL driver asks for that.
Fixes#416
Message-Id: <1464086807-11325-1-git-send-email-penberg@scylladb.com>
As explained in commit 0ff0c55 ("transport: server: 'short' should be
unsigned"), "short" type is always unsigned in the CQL binary protocol.
Therefore, drop the read_unsigned_short() variant altogether and just
use read_short() everywhere.
Message-Id: <1456133171-1433-1-git-send-email-penberg@scylladb.com>
Fixes#563.
Refs #584
CQLv2 encodes batch query_options in v1 format, not v2+.
CQLv1 otoh has no batch support at all.
Make read_options use explicit version format if needed.
v2: Ensure we preserve cql protocol version in query_opts
Message-Id: <1454514510-21706-1-git-send-email-calle@scylladb.com>
Replicates https://issues.apache.org/jira/browse/CASSANDRA-7910 :
"Prepare a statement with a wildcard in the select clause.
2. Alter the table - add a column
3. execute the prepared statement
Expected result - get all the columns including the new column
Actual result - get the columns except the new column"
If requests are delayed downstream from the cql server, and the client is
able to generate unrelated requests without limit, then the transient memory
consumed by the requests will overflow the shard's capacity.
Fix by adding a semaphore to cap the amount of transient memory occupied by
requests.
Fixes#674.
Optional credentials argument determine if SSL or normal
server socket is created.
Note: This does not follow the pattern of "socket as argument", simply
because this is a distributed object, so only trivial or immutable
objects should be passed to it.
This patch plus pekka's previous commit 3c72ea9f96
"gms: Fix gossiper::handle_major_state_change() restart logic"
fix CASSANDRA-7816.
Backported from:
def4835 Add missing follow on fix for 7816 only applied to
cassandra-2.1 branch in 763130bdbde2f4cec2e8973bcd5203caf51cc89f
763130b Followup commit for 7816
2199a87 Fix duplicate up/down messages sent to native clients
Tested by:
pushed_notifications_test.py:TestPushedNotifications.restart_node_test
In preparation for processing queries on shards other than where the
connection lives in, merge client state changes in process_request().
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
In preparation for spreading request processing to multiple cores, make
sure CQL response is written out on the connection shard.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Instead of flushing responses immediately, ask a reactor poller to flush
them for us. This lets several responses to be flushed out together in
one packet.
Store values as bytes view when possible. This improves the CQL protocol
option parsing path by avoiding allocating memory and copying individual
values as "bytes" objects.
Please note that we retain the non-view version for internal queries
where performance is not as important.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Move the connection class to server.hh so that we can move event
notifier implementation to a separate source file.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
All sharded services "should" define a stop method. Calling them is also
a good practice.
Blindly calling it at exit is wrong, but it is less wrong than not calling
it at all, and makes it now equally as wrong as any of the other services.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
This patch adds initial support for PREPARE and EXECUTE requests which
are used by the CQL binary protocol for prepared statements. The use of
prepared statement gives a nice 2.5x single core performance boost for
Urchin:
$ ./build/release/seastar --data data --smp 1
$ ./tools/bin/cassandra-stress write -mode cql3 simplenative -rate threads=32
Results:
op rate : 31728
partition rate : 31728
row rate : 31728
latency mean : 1.0
latency median : 0.9
latency 95th percentile : 1.8
latency 99th percentile : 1.8
latency 99.9th percentile : 5.6
latency max : 181.7
Total operation time : 00:00:30
END
$ ./tools/bin/cassandra-stress write -mode cql3 simplenative prepared -rate threads=32
Results:
op rate : 75033
partition rate : 75033
row rate : 75033
latency mean : 0.4
latency median : 0.4
latency 95th percentile : 0.7
latency 99th percentile : 0.8
latency 99.9th percentile : 3.4
latency max : 205.0
Total operation time : 00:00:30
END
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
s/database/distributed<database>/ everywhere.
Use simple distribution rules: writes are broadcast, reads are local.
This causes tremendous data duplication, but will change soon.