Commit Graph

5690 Commits

Author SHA1 Message Date
Paweł Dziepak
0afbbb9d44 cql3: fix empty IN () restriction
Values inside IN () restrictions may be either in a vector _in_values or
a marker (_in_marker or _value). To determine which one is appropriate
we check whether _in_values is empty, which is wrong because IN clause
can be empty (and there is no marker in such case). This is fixed by
using the presence of a marker to determine whether a vector of values
or a marker should be used.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-13 10:45:27 +02:00
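The fix described in 0afbbb9d44 can be illustrated with a minimal sketch. The `in_restriction` struct, the `value_source` enum, and both functions below are hypothetical stand-ins invented for illustration; they are not the actual cql3 classes, and only mirror the idea of choosing between a value vector and a marker.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an IN restriction (not the real cql3 type).
struct in_restriction {
    std::vector<int> in_values;         // literal values; legitimately empty for IN ()
    std::optional<std::string> marker;  // set only when the values come from a bind marker
};

enum class value_source { literal_values, marker };

// Old logic: an empty vector is taken to mean "use the marker",
// which is wrong for IN () because no marker exists in that case.
inline value_source pick_source_buggy(const in_restriction& r) {
    return r.in_values.empty() ? value_source::marker : value_source::literal_values;
}

// Fixed logic: the presence of a marker decides, so IN () stays a (empty) value list.
inline value_source pick_source(const in_restriction& r) {
    return r.marker.has_value() ? value_source::marker : value_source::literal_values;
}
```

With an empty `in_values` and no marker, `pick_source` selects the literal values while `pick_source_buggy` would reach for a marker that is not there.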
Calle Wilund
562fa1a726 Disable allocation failure test in debug/sanitizer build
Since sanitizer does not fail gracefully on over-alloc
2015-08-12 20:00:44 +03:00
Avi Kivity
95847f86c3 Merge "locator: introduce i_endpoint_snitch::reset_snitch()" from Vlad
"This series introduces the i_endpoint_snitch::reset_snitch() static method
that allows replacing the current (global) snitch instance with a new one.
This is done in a (per-shard) atomic way, transparent to anyone holding a reference
to snitch_ptr.

This series starts with some cleanups, adds the above method and the unit test
that verifies its functionality."
2015-08-12 19:29:08 +03:00
Avi Kivity
20e88a6f92 Merge seastar upstream
* seastar 4e35b8d...b56a6eb (1):
  > httpd: fix future exception handling
2015-08-12 18:34:09 +03:00
Avi Kivity
517ceed515 Merge "sstable index write benchmark"
"I am currently looking at the performance of our index_read, since it was in
the past pinpointed as the source of problems.

While the read side is the one that is mostly interesting, I would like to test
both - besides anything else, it is easier to test reads after writes so we
don't have to create synthetic data with outside tools.

This patch introduces the write side benchmark (read side will hopefully come
tomorrow).  While the write side is, as mentioned, not the most interesting
part, I did see something standing out in the flamegraph that allowed me to
optimize one particular function, yielding an 8.6% improvement."
2015-08-12 18:33:11 +03:00
Avi Kivity
bb83ba860a Merge seastar upstream
* seastar 7e7cef2...4e35b8d (3):
  > future: improve handle_exception
  > memory: attempt to catch underflows
  > memory: replace assert on too-large allocations with bad_alloc
2015-08-12 18:21:42 +03:00
Avi Kivity
11bf4efc72 Merge "Some changes to deal with allocation failures in CL" from Calle
"Related to 108
Does not fix the problem (fully at least), but at least:
* Throws exceptions instead of crashing
* Tries to back off slightly (allocate less) if possible
* Logs it

Also recycles segments to keep them from being fragmented by mem system"
2015-08-12 17:47:25 +03:00
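The "throw instead of crashing" and "back off" behaviour listed in 11bf4efc72 might look roughly like the sketch below. Everything here is an assumption made for illustration: `raw_alloc_fn`, `buffer`, `allocate_with_backoff`, and the halving policy are hypothetical and are not taken from the commitlog code.

```cpp
#include <cstddef>
#include <functional>
#include <new>

// Hypothetical allocator hook: returns nullptr on failure instead of crashing.
using raw_alloc_fn = std::function<void*(std::size_t)>;

struct buffer {
    void* ptr;
    std::size_t size;
};

// Try the wanted size first; on failure halve the request until it would
// drop below `minimum`. If nothing succeeds, report the failure with
// std::bad_alloc so the caller can handle it, rather than using a null pointer.
inline buffer allocate_with_backoff(const raw_alloc_fn& alloc,
                                    std::size_t wanted,
                                    std::size_t minimum) {
    for (std::size_t size = wanted; size >= minimum && size > 0; size /= 2) {
        if (void* p = alloc(size)) {
            return buffer{p, size};
        }
    }
    throw std::bad_alloc();
}
```

Passing the allocator in as a function keeps the sketch testable with a fake allocator that fails above a chosen size.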
Nadav Har'El
75384413f3 repair: fix use of handle_exception()
handle_exception() should not really discard the future's value automatically,
and in an upcoming version of Seastar, it won't. So instead of

	sp.execute().handle_exception(...)

(where execute() returns a future which is *not* future<>)
We need to write

	sp.execute().discard_result().handle_exception(...)

This already works in today's Seastar (the extra discard_result()
doesn't cause any harm), and will be necessary when handle_exception()
in Seastar is improved (I'll send a patch soon).

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-12 17:46:41 +03:00
Glauber Costa
e1968c389e dht: use tri_compare for token comparisons
Loading data from memory tends to be the most expensive part of the comparison
operations. Because we don't have a tri_compare function for tokens, we end up
having to do an equality test, which will load the token's data in memory, and
then, because all we know is that they are not equal, we need to do another
one.

Having two dereferences is harmful, and shows up in my simple benchmark. This
is because before writing to sstables, we must order the keys in decorated key
order, which is heavy on the comparisons.

The proposed change speeds up index write benchmark by 8.6%:

Before:
41458.14 +- 1.49 partitions / sec (30 runs)

After:
45020.81 +- 3.60 partitions / sec (30 runs)

Parameters:
--smp 6 --partitions 500000

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:23:42 -05:00
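The idea behind e1968c389e can be sketched as follows. The `token` and `decorated_key` structs here are simplified, hypothetical stand-ins, and the unsigned lexicographic byte ordering is an assumption chosen for illustration rather than the ordering the real partitioner uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using bytes = std::vector<std::uint8_t>;

// Three-way comparison of two byte strings: negative, zero or positive.
// Assumed order: unsigned lexicographic, with a shorter prefix sorting first.
inline int compare_bytes(const bytes& a, const bytes& b) {
    const std::size_t n = std::min(a.size(), b.size());
    if (n != 0) {
        const int c = std::memcmp(a.data(), b.data(), n);
        if (c != 0) {
            return c;
        }
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Simplified stand-ins for the real types.
struct token { bytes data; };
struct decorated_key { token tok; bytes key; };

inline int tri_compare(const token& a, const token& b) {
    return compare_bytes(a.data, b.data);
}

// Two-pass pattern: an equality test, then a second walk over the token data.
inline bool less_two_pass(const decorated_key& a, const decorated_key& b) {
    if (a.tok.data == b.tok.data) {
        return compare_bytes(a.key, b.key) < 0;
    }
    return a.tok.data < b.tok.data;
}

// Single-pass pattern: one three-way comparison answers both questions.
inline bool less_tri(const decorated_key& a, const decorated_key& b) {
    const int c = tri_compare(a.tok, b.tok);
    return c != 0 ? c < 0 : compare_bytes(a.key, b.key) < 0;
}
```

Both orderings agree on every input; the difference is that `less_tri` reads the token bytes once when the tokens differ, which is the common case while sorting keys.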
Calle Wilund
47b7314c78 Commitlog: add test for too large alloc 2015-08-12 16:20:12 +02:00
Calle Wilund
2db7791c6a Commitlog: Attempt to reduce allocation size for segment if alloc fails 2015-08-12 16:20:12 +02:00
Calle Wilund
4fe98d3acf Commitlog: Throw bad_alloc on memalign fail (avoid sigsegv later) 2015-08-12 16:20:11 +02:00
Calle Wilund
7191a130bb Commitlog: recycle buffers to reduce fragmentation. 2015-08-12 16:20:11 +02:00
Glauber Costa
4ddef06ba6 perf tests: test sstables index reads and writes
This is a test that allows us to query the performance of our sstable index
reads and writes (currently only writes implemented). A lot of potentially
common code is put into a header, which will make writing new tests easier if
needed.

We don't want to take shortcuts for this, so all reading and writing is done
through public sstable interfaces.

For writing, there is no way to write the index without writing the datafile.
But because we are only writing the primary key, the datafile will not contain
anything else. This is the closest we can get to index-only testing with the
public interfaces.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:18:37 -05:00
Glauber Costa
07eb98e799 tests: enhance _remove so it also removes directory structures
If a directory is found, recursively delete it. This will be useful for
allowing the creation of test structures like test/cpuX/sstable

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:17:37 -05:00
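A recursive removal of the kind 07eb98e799 describes could be sketched with the C++17 standard library as below. `remove_tree` is a hypothetical helper written for illustration; the actual test helper in the tree is not shown here and may be implemented differently.

```cpp
#include <filesystem>
#include <fstream>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: remove a file, or a directory with everything under it.
// Symlinks are removed as links and are not followed.
inline void remove_tree(const fs::path& p) {
    if (fs::is_directory(fs::symlink_status(p))) {
        // Collect the children first so the directory is not modified
        // while it is still being iterated.
        std::vector<fs::path> children;
        for (const auto& entry : fs::directory_iterator(p)) {
            children.push_back(entry.path());
        }
        for (const auto& child : children) {
            remove_tree(child);
        }
    }
    fs::remove(p);  // removes a file or a now-empty directory; no-op if missing
}
```

This makes it possible to clean up nested layouts such as test/cpuX/sstable in one call. In code that does not need the explicit recursion, `std::filesystem::remove_all` does the same job.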
Glauber Costa
fa4cbe4844 tests: allow one to specify the directory in which test sstables will be created
Our normal test directory may not be good enough for performance testing. The
reason is that, while our git tree with its relative path will usually be
sitting in a standard ext4 filesystem, we want the performance tests to be run
against XFS, which is our deployment target.

It is a lot easier to point the perf test to an already mounted xfs directory,
than to meddle with mounts into the codebase's relative path for this alone.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:17:09 -05:00
Glauber Costa
da3cd1dc6a tests: expose create directory function
In some situations, it is useful to have the test directory persistent. To do that,
expose the inner function that creates it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:12:59 -05:00
Gleb Natapov
987bf33865 storage_proxy: cleanup commented origin code
Remove code that was already reimplemented. Makes file navigation much
easier.
2015-08-12 16:50:57 +03:00
Vlad Zolotarov
806cc8c09a locator: snitch_reset_test
Checks that both successful and unsuccessful calls to reset_snitch()
function as expected.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:44:47 +03:00
Vlad Zolotarov
6bffb9232e locator: added i_endpoint_snitch::reset_snitch()
Resets the global snitch with the new value

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:44:43 +03:00
Vlad Zolotarov
ef1c7deff4 locator: introduce i_endpoint_snitch::pause_io() and resume_io() methods
resume_io() is different from start() in that it won't try to read the configuration
and will only restart the periodic I/O task (if any).

This also means that resume_io() may not fail while start() will return an
exceptional future if it fails to read the configuration.

pause_io() is the counterpart of resume_io(): it stops the periodic I/O task (if any).
After it returns a ready future, the snitch will not try to read any configuration until
either start() or resume_io() is called.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:38:04 +03:00
Vlad Zolotarov
b9389d4907 locator: Get rid of using the global distributed<snitch_ptr> from inside the snitch
- production_snitch_base: Store a distributed object pointer in the snitch.
- i_endpoint_snitch::init_snitch_obj(): Set the distributed<> mentioned above.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:37:57 +03:00
Vlad Zolotarov
08ccffc701 locator::i_endpoint_snitch: added init_snitch_obj()
Initializes the given distributed<snitch_ptr> object but
does not start() the local snitch instances.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:33:26 +03:00
Vlad Zolotarov
f9b67f60c2 locator::i_endpoint_snitch: Remove not used _snitch_is_ready promise
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:21:25 +03:00
Vlad Zolotarov
a3cda17bfb locator::gossiping_property_file_snitch: simplify the start()
- Call read_property_file() directly from start().
- Immediately return a ready future from start() for non-IO CPUs.
- Remove the unneeded invoke_on_all() invocations.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:21:18 +03:00
Vlad Zolotarov
4eeed09572 tests: gossiping_property_file_snitch_test: stop() the distributed in an error case
If the snitch was created when creation was expected to fail, we have to stop the
global (distributed) snitch in order to avoid the assert.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:19:06 +03:00
Vlad Zolotarov
0b17f2ad75 locator: simple and rack_inferring snitches: add "override" qualifier to get_name() method
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:19:06 +03:00
Vlad Zolotarov
c9f9d8164e locator::gossiping_property_file_snitch: make get_name() public as it should be
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:19:06 +03:00
Vlad Zolotarov
31dbca22d7 locator: snitch_base.hh: cleanups
- Fixed a typo.
- snitch_ptr: make operator= return a reference to the parent object.
- i_endpoint_snitch: set the _state in a default start() implementation.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:18:58 +03:00
Gleb Natapov
d8dd5a9c01 Remove unneeded namespace qualifiers. 2015-08-12 16:12:10 +03:00
Gleb Natapov
ea2632e15b Move overloaded_exception to exceptions.hh 2015-08-12 16:12:09 +03:00
Gleb Natapov
0b3d2de2f1 Fix mutation write timeout exception reporting
Make it compatible with CQL specification
2015-08-12 14:58:48 +03:00
Avi Kivity
149b08889a Merge "gossip cleanup" from Asias 2015-08-12 12:40:23 +03:00
Asias He
0f2ea6d7c0 gossip: Remove one TODO for SHUTDOWN handling 2015-08-12 17:35:46 +08:00
Asias He
c72c96f8aa gossip: Remove an outdated TODO
It is fixed already.
2015-08-12 17:35:46 +08:00
Asias He
5831dcba28 gossip: Add error handling for GOSSIP_SHUTDOWN and GOSSIP_DIGEST_ACK verb 2015-08-12 17:35:46 +08:00
Avi Kivity
b38d6b8132 Merge "storage_service update" from Asias
"I'm leaving the following functions

   get_saved_tokens()
   get_bootstrap_state()
   set_bootstrap_state()

for people more familiar with execute_cql."
2015-08-12 11:55:22 +03:00
Avi Kivity
c453a2075a Merge 2015-08-12 11:49:20 +03:00
Avi Kivity
c90e3c4bb2 Merge "CQL server cleanups" from Pekka
"Cleanups to the CQL server implementation. The biggest change is moving
event notifier to a separate source file in an attempt to make server.cc
smaller and more modularized."
2015-08-12 11:42:04 +03:00
Avi Kivity
ecc3ccc716 lsa: emergency segment reserve for compaction
To free memory, we need to allocate memory.  In lsa compaction, we convert
N segments with average occupancy of (N-1)/N into N-1 new segments.  However,
to do that, we need to allocate segments, which we may not be able to do
due to the low memory condition which caused us to compact anyway.

Fix by introducing a segment reserve, which we normally try to ensure is
full.  During low memory conditions, we temporarily allow allocating from
the emergency reserve.
2015-08-12 11:29:09 +03:00
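The reserve mechanism from ecc3ccc716 can be sketched with a counter-based pool. `segment_pool`, `emergency_guard`, and `compact` are hypothetical names invented for this illustration, and the pool only counts segments instead of managing real memory; the actual lsa implementation is not reproduced here.

```cpp
#include <cstddef>
#include <new>

// Hypothetical pool that only tracks how many segments are free.
class segment_pool {
    std::size_t _free;
    std::size_t _reserve;
    bool _emergency = false;
public:
    segment_pool(std::size_t free_segments, std::size_t reserve)
        : _free(free_segments), _reserve(reserve) {}

    // Normal allocations must leave the reserve untouched;
    // in emergency mode the reserve itself may be consumed.
    void allocate() {
        const std::size_t floor = _emergency ? 0 : _reserve;
        if (_free <= floor) {
            throw std::bad_alloc();
        }
        --_free;
    }
    void release() { ++_free; }
    void set_emergency(bool on) { _emergency = on; }
    std::size_t free_segments() const { return _free; }
};

// RAII guard: allow dipping into the reserve only for the guard's lifetime.
class emergency_guard {
    segment_pool& _pool;
public:
    explicit emergency_guard(segment_pool& pool) : _pool(pool) { _pool.set_emergency(true); }
    ~emergency_guard() { _pool.set_emergency(false); }
    emergency_guard(const emergency_guard&) = delete;
    emergency_guard& operator=(const emergency_guard&) = delete;
};

// Compacting n segments that are each (n-1)/n full: the live data fits in
// n-1 fresh segments, after which the n old segments can be released.
inline void compact(segment_pool& pool, std::size_t n) {
    emergency_guard guard(pool);
    for (std::size_t i = 0; i + 1 < n; ++i) {
        pool.allocate();
    }
    for (std::size_t i = 0; i < n; ++i) {
        pool.release();
    }
}
```

Each compaction of n such segments yields a net gain of one free segment, but only if the n-1 temporary allocations can be satisfied, which is what the reserve guarantees in this sketch.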
Asias He
74e2f0156a messaging_service: Ignore cpu id for shard_id hash
Since we do not support shard to shard connections at the moment, ip
address should fully decide if a connection to a remote node exists or
not. messaging_service maintains connections to remote node using

   std::unordered_map<shard_id, shard_info, shard_id::hash> _clients;

With this patch, we can possibly reduce number of tcp connections
between two nodes.
2015-08-12 10:25:06 +03:00
Pekka Enberg
5c99a58b27 transport: Move cql_server and event_notifier to transport namespace
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:36 +03:00
Pekka Enberg
3ab87c216e transport/server: Use pragma once as include guard
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:36 +03:00
Pekka Enberg
a3c194b050 transport/server: Move event_notifier class to separate file
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:35 +03:00
Pekka Enberg
3ad4e4e829 transport/server: Move connection class definition to server.hh
Move the connection class to server.hh so that we can move event
notifier implementation to a separate source file.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:09 +03:00
Pekka Enberg
42f865a3de transport/server: Clean up connection class definition
Move member function definitions outside of the class definition in
preparation for moving the latter to a header file.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:25:09 +03:00
Glauber Costa
480d2c6d3e tests: move directory creation code to header
So we can use it in tests other than the main sstable one

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-11 23:37:06 -05:00
Asias He
dd34f4b0a4 storage_service: Enable update_local_tokens 2015-08-12 08:02:07 +08:00
Asias He
ce927105d8 db/system_keyspace: Implement update_local_tokens 2015-08-12 07:50:26 +08:00
Asias He
b3f7507e0a storage_service: Enable gossiper.replacement_quarantine in handle_state_normal 2015-08-12 07:50:26 +08:00