Commit Graph

4699 Commits

Author SHA1 Message Date
Avi Kivity
8a50b3d9ba main: be more explicit about make directory errors 2015-07-12 19:38:59 +03:00
Glauber Costa
322b1c30dd main: make sure data directory exists before proceeding
Right now when we initiate the database, we exit with just an exception if the
data directory does not exist. That does not tell much to the user about what
is going on.

It would be nice to at the very least catch the exception and turn it into a
user friendly message. But we can obviously do much better and create the
directory.

If we fail, then we can capture and tell the user why.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-12 19:14:06 +03:00
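
The behaviour described in this commit (create the data directory if it is missing, and tell the user why if that fails) could be sketched roughly as follows. This is a sketch only: `ensure_data_directory` is a hypothetical name, and `std::filesystem` is used as a stand-in for whatever file API the project actually uses.

```cpp
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the commit): returns std::nullopt on success,
// or a human-readable explanation when the directory cannot be provided.
std::optional<std::string> ensure_data_directory(const fs::path& dir) {
    std::error_code ec;
    // Creates the directory and any missing parents; an existing directory is not an error.
    fs::create_directories(dir, ec);
    if (ec) {
        return "could not create data directory '" + dir.string() + "': " + ec.message();
    }
    // Guard against the path existing as something other than a directory.
    if (!fs::is_directory(dir, ec)) {
        return "'" + dir.string() + "' exists but is not a directory";
    }
    return std::nullopt;
}
```

A caller would print the returned message and exit instead of letting an exception escape to the user.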
Avi Kivity
d9a4645e24 Merge seastar upstream 2015-07-12 19:13:52 +03:00
Avi Kivity
2a622304ed file: make files copyable
Often a single file is used in multiple fibers, and so it is wrapped in a
lw_shared_ptr.  Remove the need for this by making files internally reference
counted.
2015-07-12 18:11:46 +03:00
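
The idea of an internally reference-counted, copyable file can be illustrated with a small handle class. This is an assumption-laden sketch: `file_handle` and `file_impl` are hypothetical names, and `std::shared_ptr` stands in for whatever reference-counting mechanism the real implementation uses.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the state behind one open file.
struct file_impl {
    std::string name;
    std::size_t bytes_written = 0;
};

// A value-type handle: copying it is cheap, and every copy refers to the
// same underlying file_impl, so callers no longer wrap it in a shared pointer.
class file_handle {
    std::shared_ptr<file_impl> _impl;
public:
    explicit file_handle(std::string name)
        : _impl(std::make_shared<file_impl>(file_impl{std::move(name)})) {}

    void write(std::size_t n) { _impl->bytes_written += n; }
    std::size_t bytes_written() const { return _impl->bytes_written; }
    long owners() const { return _impl.use_count(); }
};
```

With this shape, each fiber can hold its own copy of the handle, and the underlying state is released when the last copy goes away.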
Glauber Costa
50e2f24dfa core: add recursive_touch directory
Often, when we create a directory, it is useful to make sure that the path
leading to it exists.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-12 18:05:08 +03:00
Avi Kivity
8c7bcf52e2 Merge seastar upstream 2015-07-12 17:06:14 +03:00
Avi Kivity
c84b6a948f Merge "fstream optimizations"
file::size() was excessively slow, and fstreams were needlessly calling
it.  Optimize file::size(), and also make fstream not call it.
2015-07-12 16:56:01 +03:00
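
Reading a stream to its end without first asking for the file size can be sketched as below. The sketch uses `std::istream` purely as a stand-in; `read_until_end` is a hypothetical name and does not reflect the fstream API touched by this series.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads everything that remains in the stream, chunk by chunk, and stops at
// the first short read instead of querying the total size up front.
std::string read_until_end(std::istream& in, std::size_t chunk_size = 4096) {
    if (chunk_size == 0) {
        chunk_size = 1;  // avoid looping forever on zero-length reads
    }
    std::string out;
    std::string buf(chunk_size, '\0');
    while (in) {
        in.read(&buf[0], static_cast<std::streamsize>(buf.size()));
        // gcount() is the number of bytes actually delivered by the last read.
        out.append(buf.data(), static_cast<std::size_t>(in.gcount()));
    }
    return out;
}
```

The same function works for any `std::istream`, for example a `std::istringstream` in a test.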
Avi Kivity
35f29dcdb4 fstream: add read-until-end tests without knowing the file size beforehand 2015-07-12 16:52:24 +03:00
Avi Kivity
7684f2f541 Merge "Add ByteOrderedPartitioner" from Paweł
This patch series adds partial implementation of ByteOrderedPartitioner
and allows choosing it in configuration file.
While ByteOrderedPartitioner is generally not recommended it is used by
some tests in DTEST due to its order guarantees.
2015-07-12 16:20:19 +03:00
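
The order guarantee mentioned above comes from using the key's own bytes as the token, so that token order equals key order. A minimal sketch, assuming tokens are compared as unsigned bytes in lexicographic order (`byte_ordered_token` and `token_less` are hypothetical names, not taken from the series):

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

using bytes = std::vector<std::uint8_t>;

// Under a byte-ordered scheme the token is simply the key's raw bytes.
bytes byte_ordered_token(const std::string& key) {
    return bytes(key.begin(), key.end());
}

// Unsigned, lexicographic comparison of two tokens.
struct token_less {
    bool operator()(const bytes& a, const bytes& b) const {
        return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
    }
};
```

Because keys are not hashed, adjacent keys land on adjacent tokens, which is convenient for ordered tests but can concentrate load on a few nodes.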
Gleb Natapov
83bbd87966 storage_proxy: simplify net::messaging_verb::READ_DATA handler 2015-07-12 16:19:20 +03:00
Gleb Natapov
67d494c4ee storage_proxy: fix waiting_for to check for local datacenter 2015-07-12 16:19:20 +03:00
Paweł Dziepak
351b113913 dht: allow configuration file to choose partitioner
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-07-12 15:14:53 +02:00
Avi Kivity
82b621b857 Merge "Fix insert statement to properly handle tables without clustering key" from Tomasz 2015-07-12 16:14:13 +03:00
Tomasz Grabiec
18c48dbb14 tests: Add test for inserting a row without setting any regular column 2015-07-12 15:04:38 +02:00
Tomasz Grabiec
28c1c3eaf1 tests: Use 'update' instead of 'insert' in test_ttl
We don't really handle expiration of entire rows yet, due to issue #17.

Fixing a bug in 'insert' made this test start to fail because the
expired row started to appear in the results. Until #17 is fixed,
let's use 'update' to create the row, so that the expectations are
met.
2015-07-12 15:04:38 +02:00
Tomasz Grabiec
05e2e9d5ea cql: Fix 'insert' statement not creating row marker for non-clustered tables
Fixes #18.

The problem was that the row entry was not getting created for tables
without clustering key. The empty prefix was mistakenly taken as
belonging to a static row.
2015-07-12 15:04:38 +02:00
Avi Kivity
ddc98ac45d Merge "Mutation query interface"
"Node interface is:

   storage_proxy::query_mutations_locally()

Shard interface is:

   database::query_mutations()

Query results are returned in a form of reconcilable_result object, which
contains a vector of frozen_mutation."
2015-07-12 15:17:37 +03:00
Tomasz Grabiec
674dfdcf25 tests: Test consistent ordering of partitions in range queries
In Origin, partitions in range query results are ordered using
decorated_key ordering. This allows the use of the token() function for
incremental iteration over results, as mentioned here:

http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
2015-07-12 12:54:39 +02:00
Tomasz Grabiec
e35854b33c storage_proxy: Preserve partition order in range queries
In Origin, partitions in range query results are ordered using
decorated_key ordering. This allows the use of the token() function for
incremental iteration over results, as mentioned here:

http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0

We may also need this to implement paging.

The old code didn't preserve ordering, because it didn't merge-sort
data coming from different shards. The fix relies on
query_mutations_locally(), which already preserves the ordering. We're
going to use mutation queries for range queries anyway.
2015-07-12 12:54:38 +02:00
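
The merge-sort step that the old code lacked can be sketched as a k-way merge over per-shard results that are each already sorted. This only illustrates the ordering idea, not the actual fix (which relies on query_mutations_locally()); `merge_sorted_shards` is a hypothetical name and `int` stands in for a decorated key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Merges per-shard results into one sequence ordered by key.
// Precondition: every per-shard vector is already sorted.
std::vector<int> merge_sorted_shards(const std::vector<std::vector<int>>& shards) {
    using entry = std::tuple<int, std::size_t, std::size_t>;  // (key, shard, index)
    std::priority_queue<entry, std::vector<entry>, std::greater<entry>> heap;
    for (std::size_t s = 0; s < shards.size(); ++s) {
        assert(std::is_sorted(shards[s].begin(), shards[s].end()));
        if (!shards[s].empty()) {
            heap.emplace(shards[s][0], s, std::size_t{0});
        }
    }
    std::vector<int> out;
    while (!heap.empty()) {
        auto [key, s, i] = heap.top();  // smallest key not yet emitted
        heap.pop();
        out.push_back(key);
        if (i + 1 < shards[s].size()) {
            heap.emplace(shards[s][i + 1], s, i + 1);
        }
    }
    return out;
}
```

Simply concatenating the per-shard vectors would return the same partitions but in shard order, which is the symptom this commit addresses.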
Tomasz Grabiec
ad99e84505 storage_proxy: Take schema_ptr in query()
It will be needed for reconciliation.
2015-07-12 12:54:38 +02:00
Tomasz Grabiec
5b332c6f7c messaging_service: Add RPC call for mutation queries. 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
4931d3fdcd storage_proxy: Implement query_mutations_locally() 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
82a61e92fa messaging_service: Use generic serializer adaptor for frozen_mutation 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
0573844deb messaging_service: Add adaptors for types having db::serializer<> 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
e865740abe messaging_service: Extract integral reading logic 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
9f789602e8 db/serializer: Implement read() with output parameter variant for frozen_mutation 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
2ae6e91956 keys: Make comparators work on views rather than const&
More generic.
2015-07-12 12:54:38 +02:00
Tomasz Grabiec
6f5c00c515 tests: Add test for mutation queries 2015-07-12 12:54:38 +02:00
Tomasz Grabiec
9bea6aa0a3 db: Introduce mutation query interface
A mutation query differs from a data query in that it returns the
information needed to reconcile the data slice with that returned by
other data sources.

There is a generic mutation_query() algorithm introduced, which can
work with any mutation_source.

database::query_mutations() is a shard-local interface for mutation
queries.

The reconcilable_result is introduced as a medium for mutation query
results. It piggybacks on frozen_mutation as a medium for
reconcilable data.
2015-07-12 12:51:38 +02:00
Tomasz Grabiec
8ba0d6729e frozen_mutation: Add copy constructor and assignment operator
Also add missing move assignment operator.
2015-07-12 12:51:38 +02:00
Tomasz Grabiec
7356024dde mutation_partition: Introduce compact_for_query()
Prepares the partition to be returned in a mutation query:
 - throws out data which doesn't belong to row_ranges
 - expires cells based on query_time
 - drops cells covered by higher-level tombstones (compaction)
 - leaves at most row_limit live rows

Until we have a fine-grained reader, it's best to perform these on an
existing object. Later we can do it on-the-fly. Based on Origin's
org.apache.cassandra.db.filter.SliceQueryFilter#collectReducedColumns
2015-07-12 12:51:38 +02:00
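
The four steps listed for compact_for_query() can be sketched on a heavily simplified model. Everything here is an assumption made for illustration: the `cell`, `row` and `partition` types, integer clustering keys, a single half-open row range, and a single partition-level tombstone. Unlike a real mutation query, this sketch simply discards dead rows instead of keeping the tombstone information needed for reconciliation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

struct cell {
    std::string value;
    std::int64_t timestamp;  // write time
    std::int64_t expiry;     // expiry time; 0 means "never expires"
};

using row = std::map<std::string, cell>;  // column name -> cell

struct partition {
    std::int64_t tombstone_timestamp = -1;  // partition tombstone; -1 means none
    std::map<int, row> rows;                // clustering key -> row
};

// Keeps at most row_limit live rows with keys in [range_begin, range_end).
void compact_for_query(partition& p, int range_begin, int range_end,
                       std::int64_t query_time, std::size_t row_limit) {
    assert(range_begin <= range_end);
    std::size_t live_rows = 0;
    for (auto it = p.rows.begin(); it != p.rows.end();) {
        row& r = it->second;
        const bool in_range = it->first >= range_begin && it->first < range_end;
        if (in_range) {
            for (auto c = r.begin(); c != r.end();) {
                // Expire cells based on query_time.
                const bool expired = c->second.expiry != 0 && c->second.expiry <= query_time;
                // Drop cells covered by the higher-level tombstone.
                const bool shadowed = c->second.timestamp <= p.tombstone_timestamp;
                c = (expired || shadowed) ? r.erase(c) : std::next(c);
            }
        }
        // Throw out rows outside the range, empty rows, and rows past the limit.
        if (in_range && !r.empty() && live_rows < row_limit) {
            ++live_rows;
            ++it;
        } else {
            it = p.rows.erase(it);
        }
    }
}
```

Working on an existing in-memory object, as here, matches the commit's note that on-the-fly filtering is left for later.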
Tomasz Grabiec
4a1bf56b48 types: Introduce collection_type_impl::mutation::compact_and_expire()
Will be needed by mutation query.
2015-07-12 12:51:38 +02:00
Avi Kivity
9c6dd8b724 Merge "Fix CQL index name case sensitivity" from Pekka
"Fix index name case sensitivity as per Origin commit 68be72f ("Fix
case-sensitivity of index name on CREATE/DROP INDEX") which addressed
CASSANDRA-8365.

Please note that the imported code is from "cassandra-2.2.0-rc2" tag
that points to the following commit:

  ebc50d783505854f04f183297ad3009b9095b07e"

Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-12 12:45:26 +03:00
Avi Kivity
98c38a7c4b Merge "Query path bug fixes and cleanups" from Tomasz 2015-07-12 11:41:54 +03:00
Raphael S. Carvalho
d3a83aa549 sstables: finish streaming_histogram::update
This method was incomplete, and thus would fail if map size were
greater than max_bin_size, bringing the application down.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-12 11:06:03 +03:00
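
What a complete update step has to do once the map grows past max_bin_size can be sketched as follows: find the two adjacent bins that are closest together and replace them with a single bin at their count-weighted mean. This is a generic sketch of that streaming-histogram technique under assumed names (`streaming_histogram_sketch`), not the code from this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <limits>
#include <map>

class streaming_histogram_sketch {
    std::map<double, std::uint64_t> _bins;  // bin position -> count
    std::size_t _max_bin_size;
public:
    explicit streaming_histogram_sketch(std::size_t max_bin_size)
        : _max_bin_size(max_bin_size) {
        assert(max_bin_size >= 1);
    }

    void update(double point, std::uint64_t count = 1) {
        _bins[point] += count;
        if (_bins.size() <= _max_bin_size || _bins.size() < 2) {
            return;
        }
        // Too many bins: locate the adjacent pair with the smallest gap.
        auto best = _bins.begin();
        double best_gap = std::numeric_limits<double>::max();
        for (auto it = _bins.begin(), next = std::next(it); next != _bins.end(); ++it, ++next) {
            const double gap = next->first - it->first;
            if (gap < best_gap) {
                best_gap = gap;
                best = it;
            }
        }
        // Replace the pair with one bin at the count-weighted mean position.
        auto second = std::next(best);
        const double p1 = best->first, p2 = second->first;
        const std::uint64_t k1 = best->second, k2 = second->second;
        _bins.erase(best);
        _bins.erase(second);
        _bins[(p1 * k1 + p2 * k2) / (k1 + k2)] += k1 + k2;
    }

    const std::map<double, std::uint64_t>& bins() const { return _bins; }
};
```

With `max_bin_size` set to 2, inserting 1, 2 and 10 leaves two bins: the first two points are merged into one bin at 1.5 with count 2.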
Tomasz Grabiec
7827d89443 Merge tag 'schema_properties-v2' from git@github.com:glommer/urchin.git schema_properties-v2
From Glauber:

"In order to make describe tables work, we need to wire up a lot of options that cqlsh will expect
to be present, and represent the status of the table.

The options that I am wiring up here are ones that are more or less "self-sufficient", in the sense
that we don't need anything else outside the schema itself - that we don't have - to decide.

Even is_super(), for instance: while marking a table as super has consequences, code in Origin is
expected to explicitly set that value: we don't have to derive it. So we can, for now, just expose it
(the same way we do for is_dense), and worry about setting it later.

After this patchset, describing a particular table still does not work, *BUT*
it does if we write synthetic values for the compaction string/options. I am not
going to worry about it now, because Raphael is about to merge some patches
that introduce the classes I need anyway.

But with those patches here + the compaction data, then it all works"
2015-07-10 19:40:08 +03:00
Glauber Costa
fe370ec848 schema: read_repair_chance
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:29:02 -04:00
Glauber Costa
ea17f6d76f schema: max and min index interval
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:29:02 -04:00
Glauber Costa
aa270a149f schema: compaction strategy
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:29:02 -04:00
Glauber Costa
b75bd9ef53 schema: add local_repair_chance parameter
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:29:02 -04:00
Glauber Costa
8218c819a5 schema: access gc_grace_seconds
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:29:02 -04:00
Glauber Costa
f84148c335 schema: support cf type, and is_super
All CFMetaData has a type, either Standard or Super. Right now, we do not
support Super, but we still would like to query for it, and use that information
to build our schemas.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:29:02 -04:00
Pekka Enberg
c820078916 cql3/Cql.g: Fix index name case sensitivity
Fix index name case sensitivity as per Origin commit 68be72f ("Fix
case-sensitivity of index name on CREATE/DROP INDEX") which addressed
CASSANDRA-8365.

The grammar rules were imported from the following Origin commit:

  ebc50d783505854f04f183297ad3009b9095b07e

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-10 11:16:42 +03:00
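
The case-sensitivity rule at stake follows CQL's usual identifier convention: an unquoted name is folded to lower case, while a double-quoted name keeps its case. A minimal sketch of that convention, assuming a hypothetical `normalize_identifier` helper and ignoring escaped quotes inside quoted names:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns the name under which an identifier is stored.
std::string normalize_identifier(const std::string& raw) {
    // Quoted identifier: strip the quotes and preserve case.
    if (raw.size() >= 2 && raw.front() == '"' && raw.back() == '"') {
        return raw.substr(1, raw.size() - 2);
    }
    // Unquoted identifier: case-insensitive, stored in lower case.
    std::string out = raw;
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}
```

If CREATE INDEX and DROP INDEX both resolve names this way, an index created as `MyIndex` can be dropped as `myindex`, while one created as `"MyIndex"` must be quoted again.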
Glauber Costa
c73d4cba16 system_keyspaces: update hints dropped
Again, not terribly complicated.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:09:00 +02:00
Glauber Costa
0bf09f6af8 system_keyspace: implement update_preferred_ip
This one is quite simple, we just need very basic translation

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:09:00 +02:00
Glauber Costa
fe154efffe system_keyspace: implement (remote) update_tokens
There are two versions of update_tokens: one for the tokens used by this node,
which goes to the local table, and another for the remote tokens, used by
remote nodes, which goes to the peers table.

The former was implemented, the latter was not. Implement it.

One note: Origin does not issue a flush here, at least in the version of the
code we imported. However, a flush is present in all other variants, and won't
hurt, aside from creating an extra, probably very small, sstable. So I'm
flushing.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:09:00 +02:00
Glauber Costa
31f4601329 system_keyspace: remove duplication for blocking flush
We ended up with two different implementations of force_blocking_flush,
neither of them ideal.

This patch merges both in one that makes more sense, getting rid of the
duplication.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-10 10:09:00 +02:00
Tomasz Grabiec
d96c2ef308 Merge branch 'pdziepak/fix-map-to-json/v1' from seastar-dev.git
From Pawel:

"These patches fix two issues with std::map to json. Firstly, an empty
map was serialized to 'null' instead of '{}'. Secondly, a newline was
needlessly added at the end of generated string."
2015-07-10 10:59:29 +03:00
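
The two fixed behaviours (an empty map serializes to '{}', and no trailing newline is added) can be illustrated with a minimal serializer. `map_to_json` is a hypothetical name, and the sketch assumes keys and values need no JSON escaping; it is not the formatter changed by these patches.

```cpp
#include <map>
#include <string>

// Serializes a string-to-string map as a JSON object on a single line.
std::string map_to_json(const std::map<std::string, std::string>& m) {
    std::string out = "{";
    bool first = true;
    for (const auto& kv : m) {
        if (!first) {
            out += ", ";
        }
        first = false;
        out += "\"" + kv.first + "\": \"" + kv.second + "\"";
    }
    out += "}";  // an empty map yields "{}", and nothing follows the brace
    return out;
}
```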
Pekka Enberg
7e31dfa31f cql3: Convert CFName to C++
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-10 10:49:30 +03:00
Pekka Enberg
006a188a05 cql3: Import CFName.java
Origin commit: ebc50d783505854f04f183297ad3009b9095b07e

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-07-10 10:42:35 +03:00