Commit Graph

130 Commits

Author SHA1 Message Date
Tomasz Grabiec
aec740f895 db: Make decorated_key have ordering compatible with Origin 2015-04-30 12:02:39 +02:00
Calle Wilund
2f4e7a00f6 Use db/config object in main, database etc
* Uses config object to augment/implement options parsing
* Database now holds config obj
* Commitlog can now be inited with global config obj.
2015-04-29 18:01:17 +02:00
Tomasz Grabiec
2693dd2c7b db: Extract bytes related stuff from database.cc to bytes.cc
Some tests (eg murmur_hash_test) need only byte manipulation
functions. By specifying dependencies precisely we can drastically
reduce recompilation times, which speeds up development cycle.

I managed to reduce recompilation time for murmur_hash_test from 5
minutes to 4 seconds by breaking dependency on whole urchin object
set.
2015-04-29 15:50:16 +03:00
Avi Kivity
6290dee438 db: const correctness for abstract_type and friends
Types are immutable.
2015-04-29 15:40:38 +03:00
Avi Kivity
3162873d7f Merge branch 'calle/commitlog' of github.com:cloudius-systems/seastar-dev into db
Use commit log in database, from Calle:

"Initial" usage of the commitlog in database mutation path.
A commitlog is created in "work" dirs when initing the db
from a datadir. However, since we have neither disk data storage,
nor replay capability yet (and no real db config), the settings
are basically to just write in-memory serialization, write them to
disk and then discard them. So in fact, pointless. But at least using
the log...
2015-04-29 11:28:05 +03:00
Calle Wilund
aeb83f2874 Add commitlog to db + use it in storage_proxy/handler
* A commitlog is created in "work" dirs when initing the db
  from a datadir. However, since we have neither disk data storage,
  nor replay capability yet (and no real db config), the settings 
  are basically to just write in-memory serialization, write them to 
  disk and then discard them. So in fact, pointless. But at least using
  the log...
* Moved the actual "apply" of mutation into database. If a commitlog
  is active, add an entry to it before applying mutation.
2015-04-29 10:10:21 +02:00
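The write path described in aeb83f2874 (log first, then apply) can be sketched roughly as below. This is only an illustration: `commitlog_sketch` and `logged_database_sketch` are hypothetical stand-ins, the real classes are not shown in this log, and the real code is presumably asynchronous rather than a plain function call.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real commitlog and database classes are not shown here.
struct commitlog_sketch {
    std::vector<std::string> entries;      // serialized mutations, in write order
};

struct logged_database_sketch {
    std::optional<commitlog_sketch> log;   // empty when no commitlog is active
    std::vector<std::string> applied;      // stands in for the in-memory tables

    // Record the mutation in the commitlog (if one is active) before applying it.
    void apply(const std::string& mutation) {
        if (log) {
            log->entries.push_back(mutation);
        }
        applied.push_back(mutation);
    }
};

// Small usage example: with an active log, one apply() yields one log entry.
inline void commitlog_example() {
    logged_database_sketch db;
    db.log.emplace();
    db.apply("m1");
    assert(db.log->entries.size() == 1 && db.applied.size() == 1);
}
```

Keeping the log optional mirrors the "if a commitlog is active" wording: without a log, the mutation is simply applied.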
Pekka Enberg
33ceac5643 database: add database::delete_keyspace() stub
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-04-28 15:49:33 +03:00
Pekka Enberg
cf1d6197d6 database: add database::update_keyspace() stub
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-04-27 11:39:57 +03:00
Tomasz Grabiec
5a7e3d3278 db: Order partitions by decorated_key
Partitions should be ordered using Origin's ordering, which is first
by token, then by Origin's representation of the key. That is the
natural ordering of decorated_key.

This also changes mutation class to hold decorated_key, to avoid
decoration overhead at different layers.
2015-04-24 18:01:01 +02:00
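The ordering described in 5a7e3d3278 (token first, then key representation) might look roughly like the sketch below. The type `decorated_key_sketch` and its fields are assumptions made for illustration; the real token type and the exact key comparison used by Origin are not shown in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical stand-in for decorated_key.
struct decorated_key_sketch {
    std::int64_t token;   // assumed: a 64-bit token derived from the key
    std::string key;      // assumed: the serialized partition key
};

// Order first by token, then by the key representation.
inline bool operator<(const decorated_key_sketch& a, const decorated_key_sketch& b) {
    return std::tie(a.token, a.key) < std::tie(b.token, b.key);
}

// Small usage example: an ordered map iterates partitions in token order.
inline void ordering_example() {
    std::map<decorated_key_sketch, int> partitions;
    partitions[decorated_key_sketch{2, "a"}] = 1;
    partitions[decorated_key_sketch{1, "z"}] = 2;
    assert(partitions.begin()->first.token == 1);
}
```

Storing the token next to the key is what lets every layer compare partitions without recomputing the token.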
Tomasz Grabiec
1c3275c950 mutation: Encapsulate fields 2015-04-24 18:01:01 +02:00
Tomasz Grabiec
4641bc6f95 database: Move implementation to source file 2015-04-24 18:01:01 +02:00
Tomasz Grabiec
731a63e371 schema: Embed raw_schema inside schema
Public fields got encapsulated.
2015-04-24 18:01:01 +02:00
Tomasz Grabiec
c963821e1d db: Extract schema-specific code to schema.cc 2015-04-23 20:54:12 +02:00
Avi Kivity
da8782b9e5 Merge branch 'tgrabiec/code-moves' of github.com:cloudius-systems/seastar-dev into db
Cleanups in preparation for memtables, from Tomasz.
2015-04-23 18:44:40 +03:00
Tomasz Grabiec
0d4821009c db: Move mutation and mutation_partition to separate headers and compilation units 2015-04-22 18:42:33 +02:00
Tomasz Grabiec
a5c201a685 db: Move column_family::get_partition_slice() to mutation_partition::query()
There's nothing column_family-specific there.
2015-04-22 17:40:02 +02:00
Tomasz Grabiec
de5bea90fe db: Add const qualifiers to mutation_partition methods 2015-04-22 17:37:40 +02:00
Tomasz Grabiec
631dad8a29 schema: Add const qualifiers to lookup methods 2015-04-22 17:36:27 +02:00
Gleb Natapov
57ac231cd2 convert some snitch related classes 2015-04-21 18:24:35 +03:00
Tomasz Grabiec
ef05c5b919 db: Lookup column family by UUID
It's a bit faster.
2015-04-20 12:12:55 +02:00
Tomasz Grabiec
5693f73b7a db: Implement generate_legacy_id() properly 2015-04-17 14:22:29 +02:00
Tomasz Grabiec
00f99cefd4 db: split query.hh to reduce header dependencies 2015-04-15 20:44:59 +02:00
Tomasz Grabiec
878a740b9d db: Write query results in serialized form
This gives about 30% increase in tps in:

  build/release/tests/perf/perf_simple_query -c1 --query-single-key

This patch switches query result format from a structured one to a
serialized one. The problems with structured format are:

  - high level of indirection (vector of vectors of vectors of blobs), which
    is not CPU cache friendly

  - high allocation rate due to fine-grained object structure

On replica side, the query results are probably going to be serialized
in the transport layer anyway, so this change only subtracts
work. There is no processing of the query results on replica other
than concatenation in case of range queries. If query results are
collected in serialized form from different cores, we can concatenate
them without copying by simply appending the fragments into the
packet. This optimization is not implemented yet.

On coordinator side, the query results would have to be parsed from
the transport layer buffers anyway, so this also doesn't add work, but
again saves allocations and copying. The CQL server doesn't need
complex data structures to process the results, it just goes over it
linearly consuming it. This patch provides views, iterators and
visitors for consuming query results in serialized form. Currently the
iterators assume that the buffer is contiguous but we could easily
relax this in future so that we can avoid linearization of data
received from seastar sockets.

The coordinator side could be optimized even further for CQL queries
which do not need processing (e.g. select * from cf where ...): we
could make the replica send the query results in the format which is
expected by the CQL binary protocol client. So in the typical case the
coordinator would just pass the data using zero-copy to the client,
prepending a header.

We do need structure for prefetched rows (needed by list
manipulations), and this change adds query result post-processing
which converts serialized query result into a structured one, tailored
particularly for prefetched rows needs.

This change also introduces partition_slice options. In some queries
(maybe even in typical ones), we don't need to send partition or
clustering keys back to the client, because they are already specified
in the query request, and not queried for. The query results now hold
keys as optional elements. Also, meta-data like cell timestamp and
ttl is now also optional. It is only needed if the query has
writetime() or ttl() functions in it, which it typically won't have.
2015-04-15 20:44:50 +02:00
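The idea in 878a740b9d of consuming a serialized result linearly through views and visitors can be illustrated with the sketch below. The length-prefixed layout and the names `write_cell` and `for_each_cell` are assumptions made for this example; the format the commit actually uses is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>

// Assumed format, for illustration only: each cell is a 4-byte length in host
// byte order followed by that many payload bytes.
inline void write_cell(std::string& buf, std::string_view cell) {
    const auto len = static_cast<std::uint32_t>(cell.size());
    char raw[sizeof(len)];
    std::memcpy(raw, &len, sizeof(len));
    buf.append(raw, sizeof(len));
    buf.append(cell.data(), cell.size());
}

// Visits every cell in order, handing the visitor a view into the buffer,
// so payloads are not copied. Stops quietly at a truncated entry.
template <typename Visitor>
void for_each_cell(std::string_view buf, Visitor&& visit) {
    std::size_t pos = 0;
    while (buf.size() - pos >= sizeof(std::uint32_t)) {
        std::uint32_t len;
        std::memcpy(&len, buf.data() + pos, sizeof(len));
        pos += sizeof(len);
        if (len > buf.size() - pos) {
            break;
        }
        visit(buf.substr(pos, len));
        pos += len;
    }
}

// Small usage example: two cells of 3 and 5 bytes are visited in order.
inline void serialized_result_example() {
    std::string buf;
    write_cell(buf, "key");
    write_cell(buf, "value");
    std::size_t total = 0;
    for_each_cell(buf, [&](std::string_view cell) { total += cell.size(); });
    assert(total == 8);
}
```

Because results are just bytes in one buffer, combining results from several sources amounts to appending buffers, which is the concatenation property the commit message points to.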
Tomasz Grabiec
ecc5d23456 db: Avoid copying of column_definition
Spotted in the perf profile.
2015-04-15 20:33:48 +02:00
Tomasz Grabiec
7ebc7830b7 db: Optimize column family lookup in query path 2015-04-15 20:33:48 +02:00
Tomasz Grabiec
06f198b10c schema: Add id field
It uniquely identifies column_family globally. Will be used for
column_family lookups.
2015-04-15 20:33:48 +02:00
Tomasz Grabiec
b34cdd76ae db: Make the whole database printable
For debugging purposes.
2015-04-15 20:33:48 +02:00
Tomasz Grabiec
0be6cec13f db: Add const qualifier to mutation_partition::range() 2015-04-15 20:33:48 +02:00
Avi Kivity
a190f2db79 db: drop compile-time dependency on sstables
Move #include "sstables.hh" to .cc file.  Need to explicitly define
destructor for this.
2015-04-11 11:27:48 +03:00
Calle Wilund
bfa9b860a8 db: make database lookup functions explicitly non-modifying
To be more precise, do not take schema_ptr by value.
Fixes crashes when running with smp > 1 where mutations applied across shards
(i.e. foreign memory) would cause schema_ptr:s to get out of sync (using
other shards ptr)
2015-04-08 12:25:05 +03:00
Avi Kivity
30b40bf7b1 db: make bytes even more distinct from sstring
bytes and sstring are distinct types, since their internal buffers are of
different length, but bytes_view is an alias of sstring_view, which makes
it possible for objects of different types to leak across the abstraction
boundary.

Fix this by making bytes a basic_sstring<int8_t, ...> instead of using char.
int8_t is a 'signed char', which is a distinct type from char, so now
bytes_view is a distinct type from sstring_view.

uint8_t would have been an even better choice, but that diverges from Origin
and would have required an audit.
2015-04-07 10:56:19 +03:00
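The type distinction that 30b40bf7b1 relies on can be checked at compile time. In the sketch below, `basic_view_sketch` is a hypothetical stand-in for the real view template, and the assertions assume a mainstream platform where `int8_t` is `signed char`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical minimal view template; stands in for the real string-view types.
template <typename CharT>
struct basic_view_sketch {
    const CharT* data;
    std::size_t size;
};

using sstring_view_sketch = basic_view_sketch<char>;
using bytes_view_sketch   = basic_view_sketch<std::int8_t>;

// char and int8_t are distinct types, so the two views are distinct as well.
static_assert(!std::is_same_v<char, std::int8_t>, "element types must differ");
static_assert(!std::is_same_v<sstring_view_sketch, bytes_view_sketch>,
              "view types must differ");

// Overloading on the two views only compiles because they are different types.
inline int view_kind(sstring_view_sketch) { return 0; }
inline int view_kind(bytes_view_sketch) { return 1; }

// Small usage example: each view selects its own overload.
inline void distinct_views_example() {
    assert(view_kind(sstring_view_sketch{nullptr, 0}) == 0);
    assert(view_kind(bytes_view_sketch{nullptr, 0}) == 1);
}
```

With an alias instead of a distinct type, the two `view_kind` overloads would collide, which is the kind of leak across the abstraction boundary the commit describes.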
Gleb Natapov
47ac784425 replication strategy
This patch converts (for very small value of 'converts') some
replication related classes. Only static topology is supported (it is
created in keyspace::create_replication_strategy()). During mutation
no replication is done, since messaging service is not ready yet,
only endpoints are calculated.
2015-04-02 16:16:39 +02:00
Tomasz Grabiec
66924090c6 Merge tag 'avi/functions/v1'
From Avi:

This patchset completes the conversion of scalar functions (TOKEN is still
missing, and maybe others, but the infrastructure is there).

Conflicts:
	database.cc
2015-04-02 12:48:21 +02:00
Avi Kivity
955f1ebf06 db: fix to_hex(bytes_opt)
Result was inverted.
2015-04-01 20:16:00 +03:00
Avi Kivity
a9ce81a2f8 db: add ostream operator for exploded_clustering_prefix 2015-04-01 20:12:39 +03:00
Avi Kivity
bb4b303bba db: add ostream operators for atomic_cell 2015-04-01 20:12:39 +03:00
Calle Wilund
d3fe0c5182 Refactor db/keyspace/column_family topology
* database now holds all keyspace + column family objects
* column families are mapped by uuid, either generated or explicit
* lookup by name tuples or uuid
* finder functions now return refs + throw on missing obj
2015-04-01 10:08:00 +02:00
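The layout listed in d3fe0c5182 (column families keyed by UUID, lookup by name tuple or UUID, finders that return references and throw when nothing matches) could be sketched as follows. All names here are hypothetical, and a plain string stands in for a real UUID type.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

using uuid_sketch = std::string;   // assumption: a string stands in for a UUID

struct column_family_sketch {
    std::string name;
};

class cf_registry_sketch {
    std::unordered_map<uuid_sketch, column_family_sketch> _by_id;
    std::map<std::pair<std::string, std::string>, uuid_sketch> _id_by_name;
public:
    void add(const std::string& ks, const std::string& cf, const uuid_sketch& id) {
        _by_id.emplace(id, column_family_sketch{cf});
        _id_by_name.emplace(std::make_pair(ks, cf), id);
    }

    // Lookup by UUID; std::unordered_map::at throws std::out_of_range when missing.
    column_family_sketch& find_column_family(const uuid_sketch& id) {
        return _by_id.at(id);
    }

    // Lookup by (keyspace, column family) name tuple, resolved through the UUID.
    column_family_sketch& find_column_family(const std::string& ks, const std::string& cf) {
        return find_column_family(_id_by_name.at({ks, cf}));
    }
};

// Small usage example: both lookups reach the same object.
inline void registry_example() {
    cf_registry_sketch db;
    db.add("ks", "cf", "id-1");
    assert(&db.find_column_family("ks", "cf") == &db.find_column_family("id-1"));
}
```

Returning references and throwing on a miss keeps callers free of null checks, at the cost of requiring exception handling where a missing object is expected.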
Tomasz Grabiec
9e5a02421a db: Fix static row not being populated when query limit kicks in
Spotted during code review.
2015-03-30 18:38:26 +02:00
Tomasz Grabiec
b52cd91281 db: Properly determine row liveness
In CQL a row is considered as present if its row marker is live or it
has any cells live. The 'insert' statement creates a row
marker. Internally Origin handles that by inserting a special cell
whose name shares the prefix with other cells in that row.

One consequence of this approach is that when we query a column
slice from sstables we will have to read the whole CQL row, even if
not all columns are queried. We won't have to include the data, but we
will need liveness information in order to commute it with other
mutations, so that we can finally determine if the row is live or not.
2015-03-30 09:07:01 +02:00
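The liveness rule stated in b52cd91281 (a row is present if its row marker is live or any of its cells is live) reduces to a short predicate. The sketch below uses hypothetical types and plain boolean flags; in practice liveness would depend on timestamps, tombstones and TTLs, which are left out here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins; liveness is reduced to a flag for illustration.
struct cell_sketch {
    bool live;
};

struct row_sketch {
    bool marker_live;                 // set by an 'insert' statement
    std::vector<cell_sketch> cells;
};

// A row is present if its marker is live or at least one cell is live.
inline bool is_row_live(const row_sketch& r) {
    return r.marker_live ||
           std::any_of(r.cells.begin(), r.cells.end(),
                       [](const cell_sketch& c) { return c.live; });
}

// Small usage example: a live marker keeps a row with no cells present.
inline void liveness_example() {
    row_sketch inserted_only{true, {}};
    assert(is_row_live(inserted_only));
}
```

The predicate needs every cell's liveness, which is why the commit notes that a column-slice query still has to read liveness information for the whole row.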
Tomasz Grabiec
f155da622f db: Move row limit check to the right place
Could have let in more rows than requested in range queries.
2015-03-30 09:07:01 +02:00
Tomasz Grabiec
70341ceb0a db: Return only live cells in query::result::row
The coordinator filters out dead data anyway.
2015-03-30 09:07:01 +02:00
Tomasz Grabiec
4aa74f1312 db: Make mutation_partition::clustered_row() return deletable_row reference 2015-03-30 09:07:00 +02:00
Tomasz Grabiec
2bcc368138 db: Move implementations to source file 2015-03-30 09:01:59 +02:00
Tomasz Grabiec
b8063cd76e cql3: Support for querying of static columns 2015-03-26 14:58:36 +01:00
Tomasz Grabiec
35b4199374 Merge remote-tracking branch 'dev/penberg/create-keyspace/v4'
From Pekka:

This series adds support for creating keyspaces. We already have the CQL
front-end implemented so all that remains is converging mutations in
legacy_schema_tables.cc as well as parts of migration_manager.hh and
wiring that up to the CQL execution path.
2015-03-26 14:25:54 +01:00
Avi Kivity
1c1c4f923a db: fix collection_type_impl::deserialize_mutation_form() types
It accepts a bytes_view instead of the type-safe wrapper.
2015-03-26 14:31:01 +02:00
Pekka Enberg
3150bb5b78 database: Initialize system keyspace in database constructor
System keyspace is used for things like keyspace and table metadata.
Initialize it in database constructor so that they're always available.
Needed for CQL create keyspace test case, for example.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-03-26 12:41:00 +02:00
Avi Kivity
bfa37eb2f8 db: implement get_partition_slice() for collections 2015-03-26 12:14:01 +02:00
Avi Kivity
30c3348702 db: add ostream support to consistency_level 2015-03-26 09:34:49 +02:00
Tomasz Grabiec
b26b39504a db: Add find_or_create_keyspace()
Needed for tests.
2015-03-25 10:36:19 +01:00