Commit Graph

53948 Commits

Author SHA1 Message Date
Nadav Har'El
75384413f3 repair: fix use of handle_exception()
handle_exception() should not really discard the future's value automatically,
and in an upcoming version of Seastar, it won't. So instead of

	sp.execute().handle_exception(...)

(where execute() returns a future which is *not* future<>),
we need to write

	sp.execute().discard_result().handle_exception(...)

This already works in today's Seastar (the extra discard_result()
doesn't cause any harm), and will be necessary when handle_exception()
in Seastar is improved (I'll send a patch soon).

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-08-12 17:46:41 +03:00
Glauber Costa
e1968c389e dht: use tri_compare for token comparisons
Loading data from memory tends to be the most expensive part of the comparison
operations. Because we don't have a tri_compare function for tokens, we end up
having to do an equality test, which will load the token's data in memory, and
then, because all we know is that they are not equal, we need to do another
one.

Having two dereferences is harmful, and shows up in my simple benchmark. This
is because before writing to sstables, we must order the keys in decorated key
order, which is heavy on the comparisons.

The proposed change speeds up index write benchmark by 8.6%:

Before:
41458.14 +- 1.49 partitions / sec (30 runs)

After:
45020.81 +- 3.60 partitions / sec (30 runs)

Parameters:
--smp 6 --partitions 500000

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:23:42 -05:00
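The idea behind the tri_compare commit above can be sketched as follows. This is a minimal illustration, not the code from the patch: the `token` struct, its byte-vector layout, and the helper names `compare_two_pass` and `sort_tokens` are assumptions made for the example. It contrasts a comparison that walks the token data once with the equality-then-ordering pattern the message describes.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a token: opaque bytes stored out of line,
// so every comparison has to load them from memory.
struct token {
    std::vector<uint8_t> data;
};

// Three-way comparison: returns a negative value, zero, or a positive value.
// The bytes of both tokens are walked once.
int tri_compare(const token& a, const token& b) {
    auto p = std::mismatch(a.data.begin(), a.data.end(),
                           b.data.begin(), b.data.end());
    const bool a_done = p.first == a.data.end();
    const bool b_done = p.second == b.data.end();
    if (a_done && b_done) {
        return 0;            // identical contents
    }
    if (a_done) {
        return -1;           // a is a proper prefix of b
    }
    if (b_done) {
        return 1;            // b is a proper prefix of a
    }
    return *p.first < *p.second ? -1 : 1;
}

// The pattern being replaced (illustrative): an equality test, and when
// that fails, a second pass to find out which side is smaller.
int compare_two_pass(const token& a, const token& b) {
    if (a.data == b.data) {
        return 0;
    }
    return a.data < b.data ? -1 : 1;
}

// Sorting keys before a write is comparison-heavy; here each comparison
// made by std::sort costs a single tri_compare call.
void sort_tokens(std::vector<token>& tokens) {
    std::sort(tokens.begin(), tokens.end(),
              [](const token& x, const token& y) { return tri_compare(x, y) < 0; });
}
```

As a check on the reported numbers, 45020.81 / 41458.14 is about 1.086, which is consistent with the 8.6% speedup quoted in the message.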
Calle Wilund
47b7314c78 Commitlog: add test for too large alloc 2015-08-12 16:20:12 +02:00
Calle Wilund
2db7791c6a Commitlog: Attempt to reduce allocation size for segment if alloc fails 2015-08-12 16:20:12 +02:00
Calle Wilund
4fe98d3acf Commitlog: Throw bad_alloc on memalign fail (avoid sigsegv later) 2015-08-12 16:20:11 +02:00
Calle Wilund
7191a130bb Commitlog: recycle buffers to reduce fragmentation. 2015-08-12 16:20:11 +02:00
Glauber Costa
4ddef06ba6 perf tests: test sstables index reads and writes
This is a test that allows us to measure the performance of our sstable index
reads and writes (currently only writes implemented). A lot of potentially
common code is put into a header, which will make writing new tests easier if
needed.

We don't want to take shortcuts for this, so all reading and writing is done
through public sstable interfaces.

For writing, there is no way to write the index without writing the datafile.
But because we are only writing the primary key, the datafile will not contain
anything else. This is the closest we can get to index testing with the
public interfaces.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:18:37 -05:00
Glauber Costa
07eb98e799 tests: enhance _remove so it also removes directory structures
If a directory is found, recursively delete it. This will be useful for
allowing the creation of test structures like test/cpuX/sstable.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:17:37 -05:00
Glauber Costa
fa4cbe4844 tests: allow one to specify the directory in which test sstables will be created
Our normal test directory may not be good enough for performance testing. The
reason is that, while our git tree with its relative path will usually be
sitting in a standard ext4 filesystem, we want the performance tests to be run
against XFS, which is our deployment target.

It is a lot easier to point the perf test to an already mounted xfs directory,
than to meddle with mounts into the codebase's relative path for this alone.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:17:09 -05:00
Glauber Costa
da3cd1dc6a tests: expose create directory function
In some situations, it is useful to have the test directory persistent. To do that,
expose the inner function that creates it.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:12:59 -05:00
Gleb Natapov
987bf33865 storage_proxy: cleanup commented origin code
Remove code that was already reimplemented. Makes file navigation much
easier.
2015-08-12 16:50:57 +03:00
Vlad Zolotarov
806cc8c09a locator: snitch_reset_test
Checks that both successful and unsuccessful calls to reset_snitch()
function as expected.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:44:47 +03:00
Vlad Zolotarov
6bffb9232e locator: added i_endpoint_snitch::reset_snitch()
Resets the global snitch with the new value

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:44:43 +03:00
Vlad Zolotarov
ef1c7deff4 locator: introduce i_endpoint_snitch::pause_io() and resume_io() methods
resume_io() is different from start() in that it won't try to read the configuration
and will only restart the periodic I/O task (if any).

This also means that resume_io() may not fail while start() will return an
exceptional future if it fails to read the configuration.

pause_io() is a counterpart of resume_io() - it stops the periodic I/O task (if any).
After it returns a ready future, the snitch will not try to read any configuration until
either start() or resume_io() is called.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:38:04 +03:00
Vlad Zolotarov
b9389d4907 locator: Get rid of using the global distributed<snitch_ptr> from inside the snitch
- production_snitch_base: Store a distributed object pointer in the snitch.
- i_endpoint_snitch::init_snitch_obj(): Set the distributed<> mentioned above.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:37:57 +03:00
Vlad Zolotarov
08ccffc701 locator::i_endpoint_snitch: added init_snitch_obj()
Initializes the given distributed<snitch_ptr> object but
does not start() the local snitch instances.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:33:26 +03:00
Vlad Zolotarov
f9b67f60c2 locator::i_endpoint_snitch: Remove not used _snitch_is_ready promise
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:21:25 +03:00
Vlad Zolotarov
a3cda17bfb locator::gossiping_property_file_snitch: simplify the start()
- Call read_property_file() directly from start().
- Immediately return a ready future from start() for non-IO
  CPUs.
- Remove the unneeded invoke_on_all() invocations.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:21:18 +03:00
Vlad Zolotarov
4eeed09572 tests: gossiping_property_file_snitch_test: stop() the distributed in an error case
If the snitch was created when it should have failed, we have to stop the
global (distributed) snitch in order to avoid the assert.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:19:06 +03:00
Vlad Zolotarov
0b17f2ad75 locator: simple and rack_inferring snitches: add "override" qualifier to get_name() method
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:19:06 +03:00
Vlad Zolotarov
c9f9d8164e locator::gossiping_property_file_snitch: make get_name() public as it should be
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:19:06 +03:00
Vlad Zolotarov
31dbca22d7 locator: snitch_base.hh: cleanups
- Fixed a typo.
- snitch_ptr: make operator= return a reference to the parent object.
- i_endpoint_snitch: set the _state in a default start() implementation.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-12 16:18:58 +03:00
Gleb Natapov
d8dd5a9c01 Remove unneeded namespace qualifiers. 2015-08-12 16:12:10 +03:00
Gleb Natapov
ea2632e15b Move overloaded_exception to exceptions.hh 2015-08-12 16:12:09 +03:00
Gleb Natapov
0b3d2de2f1 Fix mutation write timeout exception reporting
Make it compatible with CQL specification
2015-08-12 14:58:48 +03:00
Amnon Heiman
773106b90e API: add get estimated row size histogram to column family
This adds the API implementation for the row size histogram.

It adds a map_cf method that performs a map operation over all column
families on the different shards.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
Amnon Heiman
c0a52a28bc Adding the read latency support to the storage proxy
This adds latency histogram support to the storage_proxy.

It uses the latency object to mark the operation latency. If there
is an impact on performance, it can be changed from all operations
to a sample of the operations.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
Amnon Heiman
0ca7189664 API: Adding the estimated_histogram to the utils definition file
This adds the estimated_histogram to the utils definition file.

The estimated_histogram holds a list of buckets and a list of bucket
offsets.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
Amnon Heiman
ae34ba32fa API: Adding min row and max row support to column_family
This adds the implementation for min and max row size in column family.

It uses the column family map reduce helper function with the additional
function to get the min and max row size.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
Amnon Heiman
ba5b1db618 API: Add a wrapper function for min and max
This helper function wraps the std min and max templates for int64_t, which
makes it easier to pass them as values when needed.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
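A short sketch of why such wrappers help, assuming nothing about the actual API code: `std::min` and `std::max` are overloaded function templates, so their bare names cannot be passed where a callable value is expected without spelling out a cast. A plain function with a fixed signature can. The names `min_int64`, `max_int64` and `reduce_values` below are hypothetical, and the vector stands in for per-column-family values gathered from the shards.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical wrappers: one fixed signature each, so the function name
// alone can be handed around as a value.
inline int64_t min_int64(int64_t a, int64_t b) { return std::min(a, b); }
inline int64_t max_int64(int64_t a, int64_t b) { return std::max(a, b); }

// Toy stand-in for the reduce step of a map-reduce helper: fold all
// values into one result using the supplied reducer.
int64_t reduce_values(const std::vector<int64_t>& values, int64_t init,
                      const std::function<int64_t(int64_t, int64_t)>& reducer) {
    return std::accumulate(values.begin(), values.end(), init, reducer);
}

// Smallest row size across all inputs; the initial value is the identity
// for min, so an empty input yields INT64 max.
int64_t min_row_size(const std::vector<int64_t>& row_sizes) {
    return reduce_values(row_sizes, std::numeric_limits<int64_t>::max(), min_int64);
}

// Largest row size across all inputs; an empty input yields 0.
int64_t max_row_size(const std::vector<int64_t>& row_sizes) {
    return reduce_values(row_sizes, 0, max_int64);
}
```

Passing `std::min<int64_t>` directly in place of `min_int64` would not compile as written, because the name still refers to several overloads.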
Amnon Heiman
d97f9ea4c9 sstable add a getter for the sstable stats
This adds a getter function for the statistics of the sstable.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
Amnon Heiman
13b6b0ce02 Cleaning the metadata_collector
This changes the constructor initialization of the metadata_collector; it
calls the constructor directly without the Java-like assignment.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:18 +03:00
Amnon Heiman
c2bb3f1c00 Cleaning the estimated_histogram
This makes the following changes to the estimated_histogram: it uses
int64_t instead of unsigned to be compatible with origin and the API.

It adds a getter for the buckets and changes the getter for the
bucket_offset to be const.

It adds get min and max functions similar to origin, and it adds a merge
function to merge estimated histograms.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 13:10:06 +03:00
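To make the shape of such a histogram concrete, here is a toy version with a bucket list, a bucket-offset list, a merge, and min/max getters. It is an illustration only: the type name `toy_histogram`, the overflow-bucket convention, and the exact min/max semantics are assumptions for this sketch and are not taken from the estimated_histogram source.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy histogram (assumed layout): bucket_offsets holds sorted upper bounds,
// and buckets has one extra counter at the end for values above every bound.
struct toy_histogram {
    std::vector<int64_t> bucket_offsets;
    std::vector<int64_t> buckets;

    explicit toy_histogram(std::vector<int64_t> offsets)
        : bucket_offsets(std::move(offsets))
        , buckets(bucket_offsets.size() + 1, 0) {}

    // Count a value in the first bucket whose upper bound is >= value,
    // or in the overflow bucket if there is none.
    void add(int64_t value) {
        auto it = std::lower_bound(bucket_offsets.begin(), bucket_offsets.end(), value);
        ++buckets[static_cast<std::size_t>(it - bucket_offsets.begin())];
    }

    // Merge another histogram with identical offsets by adding counters.
    void merge(const toy_histogram& other) {
        if (bucket_offsets != other.bucket_offsets) {
            throw std::invalid_argument("bucket offsets differ");
        }
        for (std::size_t i = 0; i < buckets.size(); ++i) {
            buckets[i] += other.buckets[i];
        }
    }

    // Lower bound of the first non-empty bucket (0 if everything is empty).
    int64_t min() const {
        for (std::size_t i = 0; i < buckets.size(); ++i) {
            if (buckets[i] > 0) {
                return i == 0 ? 0 : bucket_offsets[i - 1];
            }
        }
        return 0;
    }

    // Upper bound of the last non-empty bucket; the overflow bucket has no
    // upper bound, so the largest int64_t is reported for it.
    int64_t max() const {
        if (buckets.back() > 0) {
            return std::numeric_limits<int64_t>::max();
        }
        for (std::size_t i = bucket_offsets.size(); i-- > 0;) {
            if (buckets[i] > 0) {
                return bucket_offsets[i];
            }
        }
        return 0;
    }
};
```

Using a signed 64-bit counter type throughout, as the commit describes, means the counters and bounds share one type and no signed/unsigned conversions are needed when the values are exported.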
Avi Kivity
149b08889a Merge "gossip cleanup" from Asias 2015-08-12 12:40:23 +03:00
Asias He
0f2ea6d7c0 gossip: Remove one TODO for SHUTDOWN handling 2015-08-12 17:35:46 +08:00
Asias He
c72c96f8aa gossip: Remove an outdated TODO
It is fixed already.
2015-08-12 17:35:46 +08:00
Asias He
5831dcba28 gossip: Add error handling for GOSSIP_SHUTDOWN and GOSSIP_DIGEST_ACK verb 2015-08-12 17:35:46 +08:00
Avi Kivity
b38d6b8132 Merge "storage_service update" from Asias
"I'm leaving the following functions

   get_saved_tokens()
   get_bootstrap_state()
   set_bootstrap_state()

for people more familiar with execute_cql."
2015-08-12 11:55:22 +03:00
Avi Kivity
c453a2075a Merge 2015-08-12 11:49:20 +03:00
Avi Kivity
c90e3c4bb2 Merge "CQL server cleanups" from Pekka
"Cleanups to the CQL server implementation. The biggest change is moving
event notifier to a separate source file in an attempt to make server.cc
smaller and more modularized."
2015-08-12 11:42:04 +03:00
Avi Kivity
ecc3ccc716 lsa: emergency segment reserve for compaction
To free memory, we need to allocate memory.  In lsa compaction, we convert
N segments with average occupancy of (N-1)/N into N-1 new segments.  However,
to do that, we need to allocate segments, which we may not be able to do
due to the low memory condition which caused us to compact anyway.

Fix by introducing a segment reserve, which we normally try to ensure is
full.  During low memory conditions, we temporarily allow allocating from
the emergency reserve.
2015-08-12 11:29:09 +03:00
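The reserve idea in the commit above can be sketched with a toy segment pool. Everything here is hypothetical (`toy_segment_pool`, `compacted_segment_count`, and the counting-only model with no real memory); it shows only the policy: normal allocations stop at the reserve, while compaction may temporarily dip into it.

```cpp
#include <cstddef>

// Toy pool that only counts free segments. Normal allocations may not
// reduce the free count below the reserve; emergency mode lifts that floor.
class toy_segment_pool {
    std::size_t _free;
    std::size_t _reserve;
    bool _emergency = false;
public:
    toy_segment_pool(std::size_t free_segments, std::size_t reserve)
        : _free(free_segments), _reserve(reserve) {}

    // Returns true if a segment was handed out.
    bool try_allocate() {
        const std::size_t floor = _emergency ? 0 : _reserve;
        if (_free <= floor) {
            return false;
        }
        --_free;
        return true;
    }

    // Return a segment to the pool, e.g. a source segment emptied by compaction.
    void release() { ++_free; }

    void set_emergency(bool on) { _emergency = on; }
    std::size_t free_segments() const { return _free; }
};

// Number of segments needed to hold the live data of `count` segments,
// rounded up to whole segments.
std::size_t compacted_segment_count(std::size_t count,
                                    std::size_t live_bytes_per_segment,
                                    std::size_t segment_size) {
    const std::size_t total_live = count * live_bytes_per_segment;
    return (total_live + segment_size - 1) / segment_size;
}
```

With 4 segments of 1024 bytes that are each 3/4 full (768 live bytes), `compacted_segment_count(4, 768, 1024)` gives 3, matching the N to N-1 arithmetic in the message. The point of the reserve is that the target segments for that compaction can still be allocated when ordinary allocation already fails.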
Asias He
74e2f0156a messaging_service: Ignore cpu id for shard_id hash
Since we do not support shard to shard connections at the moment, ip
address should fully decide if a connection to a remote node exists or
not. messaging_service maintains connections to remote node using

   std::unordered_map<shard_id, shard_info, shard_id::hash> _clients;

With this patch, we can possibly reduce number of tcp connections
between two nodes.
2015-08-12 10:25:06 +03:00
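The effect of ignoring the cpu id can be illustrated with a simplified key type. The field names, the `shard_info` contents, and the decision to ignore `cpu_id` in equality as well are assumptions made for this sketch; the real `shard_id` is not shown in this log. Note that changing only the hash would put such keys in the same bucket but still keep them as separate map entries, so the sketch ignores `cpu_id` in both places.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Simplified, hypothetical key: an IPv4 address plus a cpu id.
struct shard_id {
    uint32_t addr;
    uint32_t cpu_id;

    // Assumption: equality ignores cpu_id, so all shards of one node
    // map to the same entry.
    friend bool operator==(const shard_id& a, const shard_id& b) {
        return a.addr == b.addr;
    }

    // The hash looks at the address only.
    struct hash {
        std::size_t operator()(const shard_id& id) const {
            return std::hash<uint32_t>()(id.addr);
        }
    };
};

// Hypothetical per-connection bookkeeping.
struct shard_info {
    int uses = 0;
};

using client_map = std::unordered_map<shard_id, shard_info, shard_id::hash>;

// Look up (or create) the single entry for the node that owns `id`.
shard_info& get_client(client_map& clients, const shard_id& id) {
    return clients[id];
}
```

With this key, requests addressed to cpu 0 and cpu 3 of the same address share one map entry, and therefore one connection, instead of creating two.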
Pekka Enberg
5c99a58b27 transport: Move cql_server and event_notifier to transport namespace
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:36 +03:00
Pekka Enberg
3ab87c216e transport/server: Use pragma once as include guard
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:36 +03:00
Pekka Enberg
a3c194b050 transport/server: Move event_notifier class to separate file
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:35 +03:00
Pekka Enberg
3ad4e4e829 transport/server: Move connection class definition to server.hh
Move the connection class to server.hh so that we can move event
notifier implementation to a separate source file.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:59:09 +03:00
Amnon Heiman
0240080527 streaming_histogram modify the default constructor
The default constructor needs to set the max_bin size, so it was
combined with the non-default one, using a default value.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-12 09:41:19 +03:00
Pekka Enberg
42f865a3de transport/server: Clean up connection class definition
Move member function definitions outside of the class definition in
preparation for moving the latter to a header file.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-12 09:25:09 +03:00
Glauber Costa
480d2c6d3e tests: move directory creation code to header
So we can use it in tests other than the main sstable one

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-11 23:37:06 -05:00
Asias He
dd34f4b0a4 storage_service: Enable update_local_tokens 2015-08-12 08:02:07 +08:00