Commit Graph

3759 Commits

Author SHA1 Message Date
Vlad Zolotarov
1e32bdf090 gms: added missing operator==() required for endpoint_state_map comparison.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-06-09 15:18:46 +03:00
Vlad Zolotarov
c1f0d285bb database: make the create_keyspace() function declaration match the definition.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-06-09 15:18:46 +03:00
Avi Kivity
e07a1e2924 Merge seastar upstream
2015-06-09 12:54:50 +03:00
Avi Kivity
4bbd90d14c reactor: workaround missing FALLOC_FL_ZERO_RANGE in kernel headers
Prehistoric kernels don't expose FALLOC_FL_ZERO_RANGE, humor them.
2015-06-09 12:53:58 +03:00
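A workaround of the kind described in 4bbd90d14c usually amounts to defining the flag when the headers do not. The following is a minimal sketch under that assumption, not the actual patch: the value 0x10 is the one used by the upstream <linux/falloc.h>, and zero_range() is a hypothetical helper added only for illustration (Linux and glibc assumed).

```cpp
#include <cassert>
#include <fcntl.h>   // fallocate(); newer headers also provide the FALLOC_FL_* flags

// Old kernel headers may not define the flag, so supply it ourselves.
// 0x10 matches the definition in the upstream <linux/falloc.h>.
#ifndef FALLOC_FL_ZERO_RANGE
#define FALLOC_FL_ZERO_RANGE 0x10
#endif

// Hypothetical helper (not from the commit): zero `len` bytes starting at
// `offset` in the file behind `fd`. Returns 0 on success, -1 on error.
inline int zero_range(int fd, off_t offset, off_t len) {
    assert(len > 0);
    return ::fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
}
```

Defining the flag only fixes the build; a kernel that predates the feature can still reject the call at run time, so callers should check the return value.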
Avi Kivity
7a464ddf99 reactor: batch aio
Instead of issuing a system call for every aio, wait for them to accumulate,
and issue them all at once.  This reduces syscall count, and allows the kernel
to batch requests (by plugging the I/O queues during the call).  A poller is
added so that requests are not delayed too much.

Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-09 12:52:40 +03:00
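The batching idea in 7a464ddf99 can be sketched without any kernel interface. In the toy model below every name is hypothetical, the "system call" is only a counter, and the batch size limit is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an aio control block.
struct io_request { int id; };

// Accumulates requests and submits them in batches instead of one by one.
class batching_submitter {
    std::vector<io_request> _pending;
    std::size_t _max_batch;
    std::size_t _submit_calls = 0;  // one increment per simulated system call
    std::size_t _submitted = 0;     // total requests handed over so far
public:
    explicit batching_submitter(std::size_t max_batch) : _max_batch(max_batch) {
        assert(max_batch > 0);
    }
    // Queue a request; submit only once enough requests have accumulated.
    void queue(io_request r) {
        _pending.push_back(r);
        if (_pending.size() >= _max_batch) {
            flush();
        }
    }
    // Called periodically by a poller so that a partially filled batch is
    // not delayed indefinitely. Returns true if anything was submitted.
    bool poll() {
        if (_pending.empty()) {
            return false;
        }
        flush();
        return true;
    }
    std::size_t submit_calls() const { return _submit_calls; }
    std::size_t submitted() const { return _submitted; }
private:
    // A real implementation would pass the whole batch to the kernel here
    // in a single call; this sketch only counts.
    void flush() {
        ++_submit_calls;
        _submitted += _pending.size();
        _pending.clear();
    }
};
```

With a batch size of 4, ten queued requests cost three simulated calls (two full batches plus one poll) instead of ten.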
Raphael S. Carvalho
d1ed0744f0 schema: add sstable compressor property
The compressor field specifies which compression algorithm
must be used to compress the sstable data file.
This is a small step towards compressed sstable data files.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-09 11:18:56 +03:00
Shlomi Livne
bd89fa4905 config: add string_list (vec of sstring) as config data type + use for datadir
To handle the fact that --data-file-directories is supposed to be 1+
folders.

Note that boost::program_options already "reserves" the use of std::vector
as receiver of values for multitoken options (i.e. those with more than
one value). Thus, values receiving a list of tokens via command line
should adhere to the multi-token rules, i.e. space separated values.

End result is that --data-file-directories now accepts multiple paths,
whitespace separated,
i.e. --data-file-directories <path1> <path2>
And as it turns out, this is really a nicer way of writing stuff than
using "," or ":" separation of paths etc, so...

Signed-off-by: Calle Wilund <calle@cloudius-systems.com>
2015-06-09 10:40:45 +03:00
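The multi-token rule described in bd89fa4905 can be illustrated with a small hand-rolled parser. This sketch deliberately does not use Boost; collect_multitoken() is a hypothetical helper that only mimics the behaviour the message describes, where every token after the option belongs to it until the next option starts.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not the Boost API): gather every value that follows
// `option` on the command line, stopping at the next "--" option.
inline std::vector<std::string> collect_multitoken(
        const std::vector<std::string>& args, const std::string& option) {
    assert(option.rfind("--", 0) == 0);  // option names start with "--"
    std::vector<std::string> values;
    bool collecting = false;
    for (const auto& arg : args) {
        if (arg.rfind("--", 0) == 0) {
            // A new option begins; collect only if it is the one we want.
            collecting = (arg == option);
            continue;
        }
        if (collecting) {
            values.push_back(arg);
        }
    }
    return values;
}
```

Given the arguments `--data-file-directories /a /b --other x`, the helper yields the two paths `/a` and `/b`.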
Avi Kivity
551114586d Merge "Initial table merging"
Pekka says:

"This series is the initial table merging code conversion. We now store
column family metadata in the database but without information about the
actual columns."
2015-06-09 10:39:54 +03:00
Avi Kivity
7f7381dc1e Merge seastar upstream
2015-06-09 08:45:25 +03:00
Avi Kivity
44e35ef545 fstream: preallocation support for file output stream
Preallocate disk blocks in advance of writing.

Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-09 08:44:59 +03:00
Avi Kivity
3ab32ae7a1 file: add allocate() method
Allocate disk blocks in advance of writing to them.

Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-09 08:44:53 +03:00
Avi Kivity
9ac5880e76 Merge "rework sstable write functions"
From Glauber:

"Up until now, we were still generating one future per element that we write.
Now that we have new infrastructure, we can avoid that, and generate only the
ones we really need to. This has the added advantage of lifting the need to do
lambda captures and allowing for a more straightforward forwarding of rest...
parameters"
2015-06-08 16:13:53 +03:00
Glauber Costa
a076ea563b rework write functions
Up until now, we were still generating one future per element that we write.
Now that we have new infrastructure, we can avoid that, and generate only the
ones we really need to. This has the added advantage of lifting the need to do
lambda captures and allowing for a more straightforward forwarding of rest...
parameters

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 15:25:35 +03:00
Glauber Costa
2dbd2b408a sstables: change describe_type's return type to auto
We always return a future, but with the threaded writer, we can get rid of
that. So while reads will still return a future, the writer will be able to
return void.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 15:25:35 +03:00
Paweł Dziepak
bfe6446a89 class_registrator: make no_such_class message more informative
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-06-08 15:24:14 +03:00
Pekka Enberg
d25bd89ee1 db/legacy_schema_tables: Convert table merging to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-08 14:59:06 +03:00
Pekka Enberg
87e525b6b5 database: Add update and drop column family stubs
They're needed by table merging in db/legacy_schema_tables.cc.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-08 14:42:36 +03:00
Pekka Enberg
ab6dbc0d83 query-result-set: Add rows() accessor function
Add rows() accessor function for iterating over the whole result set.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-08 14:42:36 +03:00
Pekka Enberg
5df4b51589 schema: Add comparison operators for column_definition and schema
Table merging code needs to compare schema_ptrs for equality so add
comparison operators for column_definition and schema classes.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-08 14:42:36 +03:00
Avi Kivity
b96f2d4cee Merge seastar upstream
2015-06-08 12:02:05 +03:00
Raphael S. Carvalho
d864da71fc core: avoid fsyncing output stream twice
For some reason, I added an fsync call when the file underlying the
stream gets truncated. That happens when flushing a file whose
size isn't aligned to the requested DMA buffer.
Instead, fsync should only be called when closing the stream, so this
patch changes the code to do that.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-06-08 11:59:22 +03:00
Avi Kivity
a0d98f89c4 Merge "sstable fixes, collections, and range tombstones"
Glauber says:

"The current series fixes a small conversion bug and brings some much needed
cleanups (like in key.cc).

With that in place, it implements and wires support for collections and range
tombstones."
2015-06-08 10:16:52 +03:00
Pekka Enberg
9bcb590efe cql3: Fix column identifier lookups
We're looking up shared_ptr<column_identifier> type so make sure we
lookup by value, not by pointer.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-08 09:23:33 +03:00
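The distinction that 9bcb590efe draws between looking up by value and by pointer can be shown with a standard container. This is a sketch under assumptions: column_identifier here is a hypothetical one-field stand-in, and ident_hash, ident_equal and contains_by_value() are illustrative names, not the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real column_identifier class.
struct column_identifier { std::string name; };

using ident_ptr = std::shared_ptr<column_identifier>;

// Hash and compare the pointed-to value instead of the pointer itself.
struct ident_hash {
    std::size_t operator()(const ident_ptr& p) const {
        assert(p);
        return std::hash<std::string>()(p->name);
    }
};
struct ident_equal {
    bool operator()(const ident_ptr& a, const ident_ptr& b) const {
        return a->name == b->name;
    }
};

using ident_map = std::unordered_map<ident_ptr, int, ident_hash, ident_equal>;

// Looks up with a freshly allocated key: found only because the map
// compares by value.
inline bool contains_by_value(const ident_map& m, const std::string& name) {
    return m.count(std::make_shared<column_identifier>(column_identifier{name})) > 0;
}
```

With the default std::hash and operator== for std::shared_ptr, the same lookup would compare addresses, so a separately allocated identifier with an equal name would not be found.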
Avi Kivity
e343295667 commitlog: don't pass a temporary string to std::regex_match
The match results will point nowhere, and libstdc++ 5 rightly rejects it.
2015-06-08 09:23:18 +03:00
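The problem fixed in e343295667 is that std::smatch stores iterators into the string it was matched against, so that string must outlive the match results. A minimal sketch follows; the file-name pattern, make_name() and segment_id_of() are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical function that builds a file name and returns it by value.
inline std::string make_name(int id) {
    return "CommitLog-" + std::to_string(id) + ".log";
}

// Extracts the numeric id from the generated name, or "" if it does not match.
inline std::string segment_id_of(int id) {
    static const std::regex pattern("CommitLog-(\\d+)\\.log");
    std::smatch m;
    // Wrong: std::regex_match(make_name(id), m, pattern);
    // The temporary would be destroyed at the end of that statement and
    // leave `m` pointing at freed memory. Standard libraries that declare
    // the rvalue-string overload as deleted refuse to compile it.
    std::string name = make_name(id);  // named object that outlives `m`
    if (!std::regex_match(name, m, pattern)) {
        return "";
    }
    assert(m.size() == 2);  // whole match plus one capture group
    return m[1].str();
}
```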
Glauber Costa
5401604fb4 do not pass schema_ptr to sstable write functions
We did so because passing shared pointers in the old code was so much easier.
But it is no longer, so we can avoid the reference bump.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:51:44 +03:00
Glauber Costa
d59f65eb91 sstables: test range tombstones
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:13 +03:00
Glauber Costa
b9b233071c sstables: wire up range tombstones
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:12 +03:00
Glauber Costa
26fd9fed14 sstable: test collections
Test writing collections, also includes static collections.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:12 +03:00
Glauber Costa
31548df661 sstables: write static collections
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:12 +03:00
Glauber Costa
2853fcd5c5 sstables: write collections
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:12 +03:00
Glauber Costa
08f761800a sstables: write a range tombstone
This is the code to write a range tombstone. This is not yet wired up to actually do it.
The use case for collections is a lot simpler, and will be handled first.

The actual code should be virtually identical, though.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:12 +03:00
Glauber Costa
f1bda42537 sstable: update marker in composite
We can insert markers in the end of composites, which can be used to identify
the presence of ranges in a column.

One option would be to change all methods in sstables/key.hh to take an
optional marker parameter, and append that as the last marker.

But because we are talking about a single byte, and always added to the end,
it's a lot easier to allow the composite to be created normally, and then
replace the last byte with the marker.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:11 +03:00
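The approach chosen in f1bda42537 (build the composite normally, then overwrite its final byte) is small enough to sketch. The byte layout and the update_marker() name below are assumptions made for illustration; only the "replace the last byte" step comes from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using bytes = std::vector<std::uint8_t>;

// Hypothetical helper: overwrite the trailing end-of-component byte of an
// already serialized composite with `marker`.
inline void update_marker(bytes& composite, std::uint8_t marker) {
    assert(!composite.empty());  // there must be a last byte to replace
    composite.back() = marker;
}
```

Because only the last byte changes, none of the functions that build the composite need an extra parameter.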
Glauber Costa
b38748032a sstables: clean up sstable components code
Tomek got confused with the fact that we had to pass bytes_type for this code
to work. And well, that's understandable: that code evolved quite a bit since
its first user, and now the interface is not quite the best for the job,
forcing us to employ weird tricks like that for the code to work.

In this cleanup, I am creating a serializer object, that will encode
information about how to serialize the component passed. In the majority of the
cases, a simple sstable_serializer will just serialize to itself - accepting as
parameters byte_views.

In the case we need to operate on a deeply exploded view - the only case for
which we truly needed types, the respective serializer will take a types vector
and use it accordingly.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:11 +03:00
Glauber Costa
f7b9977830 sstables: extend composite::from_clustering_key
Make sure it is also suitable to operate on a clustering_key_prefix

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:11 +03:00
Glauber Costa
df8b8823c2 correctly position end of row
In a testament to how confusing our old code was, while collapsing the futures
I ended up getting the end_of_row element inside the loop for clustered keys.
The end of row, as the name implies, should only be written at the end of the
(thrift) row (= CQL partition).

Move it to the right place.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-08 01:47:11 +03:00
Glauber Costa
0f0721af1f sstables: remove circular reference
writer.hh includes sstables.hh which includes writer.hh
We can remove the reference if we include core/fstream.hh into writer.hh instead

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 22:50:07 +03:00
Tomasz Grabiec
9bbd790641 Merge branch 'glommer/threads' from git@github.com:glommer/urchin.git
From Glauber:

"This is my attempt to convert the sstable write code to seastar threads.
The code does look a lot cleaner now, and the future path on how to improve
it, a bit more clear."
2015-06-07 17:01:12 +02:00
Avi Kivity
80fa0bb868 Merge "CQL: Fix create table statement"
From Pekka:

"This series fixes create table statement to actually specify columns.
Note that clustering keys are not yet supported."
2015-06-07 15:05:21 +03:00
Avi Kivity
89115e8da2 tests: whitelist thread_test
2015-06-07 14:31:26 +03:00
Glauber Costa
5503e140d5 do not write stack variables
This code is blatantly wrong, because it writes stack variables to the
underlying storage.

After this patch, the code is no longer wrong. Right is better than wrong,
so we should apply this.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:26 +03:00
Glauber Costa
90a82eeb85 dowithy non dowithed code
Technically speaking, the current code is not wrong. However, it was written
before we had do_with, and I ended up dowithing it while chasing our erratic
bug under the suspicion that this code could somehow be related to our bug.

Turns out it isn't, but now that I went through the trouble of dowithing it -
and since do_with is easier to reason about and guarantee liveness, let's go
with this option.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
fb4676cfd1 remove all the t functions
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
1b76d20563 write all the components
checksum and crc are written inside the main function so we don't need
to export the file stream. But since the functions are actually trivial
we can just .get() the whole thing instead of changing them.

The others are still kept as futures and called after async::thread completes,
for maximum parallelism.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
a22d0a9e5c convert to threads up to clustered rows
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
d0760fe5b2 convert to threads up to static columns writes
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
7073639201 sstables: get rid of write_static_column_name
After the last round of cleanups, this function turned out to be exactly the
same as write_column_name, except that the composite differs. Because the
composite is passed as a parameter, we can just use the same function for all
and pre-create the composite.

This will make the implementation of collections a lot easier, since for
collections we will prepend each element with the column name.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
9ee588bc49 convert the code that writes the partition key to file to threads
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:25 +03:00
Glauber Costa
a1f4eb9601 convert write_index_entry to thread
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:24 +03:00
Glauber Costa
3bbfd75e60 convert first part to thread
Those functions don't actually use any futures, so they are really just a copy.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:24 +03:00
Glauber Costa
e7305a4a58 convert sstable write path to seastar threads
This is my attempt to convert the sstable code to seastar threads, as
we have been extensively discussing. I haven't yet measured how much we
gain by this, but the final code looks *so* much better and less complicated,
that this alone should be enough reasoning.

Here's how I've done it, so you can easily follow:

Every function that we use that returns a future is copied to another function
with the same name but ending in _t. This is better than copying the whole thing,
because it can be done in logical pieces that are easier to follow. This is also
easier to verify.

function_t() will do the same as function(), but will return void.

I am not changing more than I need to, so in the final code, without all the
do_withs and other stuff, there are some parts that start to cry for a cleanup.
They are left as is for now, and I will return to them later once the patch is
merged.

In this initial patch, you get the main write_components converted, and this
nice explanation message.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-06-07 10:38:24 +03:00
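The function() / function_t() convention described in e7305a4a58 can be sketched with a toy future. This does not use Seastar: toy_future is a hypothetical, always-ready stand-in, and write_bytes() and write_pair() are invented names, so the example shows only the shape of the conversion, not its scheduling behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical, always-ready future used only to show the two styles.
template <typename T>
class toy_future {
    T _value;
public:
    explicit toy_future(T v) : _value(std::move(v)) {}
    T get() { return std::move(_value); }
    template <typename Func>
    auto then(Func&& f) { return f(std::move(_value)); }
};

// Original shape: does the work and returns a future.
inline toy_future<std::size_t> write_bytes(std::string& out, const std::string& data) {
    out += data;
    return toy_future<std::size_t>(data.size());
}

// "_t" shape: same work, but waits for the result and returns void.
inline void write_bytes_t(std::string& out, const std::string& data) {
    write_bytes(out, data).get();
}

// Continuation style: each step is nested in a lambda with captures.
inline toy_future<std::size_t> write_pair(std::string& out, const std::string& a,
                                          const std::string& b) {
    return write_bytes(out, a).then([&out, &b](std::size_t n) {
        return write_bytes(out, b).then([n](std::size_t m) {
            return toy_future<std::size_t>(n + m);
        });
    });
}

// Thread style: the same sequence reads as straight-line code.
inline void write_pair_t(std::string& out, const std::string& a, const std::string& b) {
    const std::size_t before = out.size();
    write_bytes_t(out, a);
    write_bytes_t(out, b);
    assert(out.size() == before + a.size() + b.size());
}
```

Both variants produce the same output; the thread-style version drops the nested lambdas and their captures.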