This function is likely to be duplicated over time, so let's make
it available as a static method of the sstable class.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
file_writer wraps output_stream<char>.
file_writer also adds the method offset(), intended to return the current
offset of the stream.
It's also a small step towards compression support where a specialized output
stream is required.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Glauber Costa <glommer@cloudius-systems.com>
A long running query may use memtables and sstables that are being removed
by memtable flush and sstable compaction paths. Use copy-on-write to
prevent use-after-free.
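
A minimal sketch of the copy-on-write pattern, assuming a single-threaded
owner (as with one shard) and using strings as stand-ins for memtables and
sstables. The names table_list and table_set are hypothetical:

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins: each entry represents one memtable or sstable.
using table_list = std::vector<std::shared_ptr<const std::string>>;

class table_set {
    std::shared_ptr<const table_list> _current = std::make_shared<table_list>();
public:
    // Readers take a snapshot; it stays valid for as long as they hold it,
    // even if the set is modified in the meantime.
    std::shared_ptr<const table_list> snapshot() const { return _current; }

    // Writers never mutate the published list: they copy, modify, and swap.
    void add(std::string name) {
        auto next = std::make_shared<table_list>(*_current);
        next->push_back(std::make_shared<const std::string>(std::move(name)));
        _current = std::move(next);
    }
    void remove(const std::string& name) {
        auto next = std::make_shared<table_list>();
        for (const auto& t : *_current) {
            if (*t != name) {
                next->push_back(t);
            }
        }
        _current = std::move(next);
    }
};
```

A query that called snapshot() before a flush or compaction keeps the old list
and its entries alive through the shared pointers until it finishes.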
If the keyspace already exists, throw an already_exists_exception like Origin
does. Spotted while reading the code.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
When thrift sees an exception that was not declared as part of the interface,
it wraps it using std::exception::what() for the exception text. This is
often cryptic, so add an "Internal server error" prefix.
The test tests/memcached/test_ascii_parser hung after the change to
consume(). The problem was that consume() notified the consumer of an
EOF by sending it an empty buffer, and then it expected to get back a
message that it shouldn't read more (by setting the unconsumed buffer).
If it didn't, it continued to send empty buffers in a never-ending loop.
So this patch changes consume() to send one empty buffer to the consumer
on the end-of-file, and then stop (regardless of what the consumer returns).
It would have probably made sense to *not* send an empty buffer to the
consumer after the data is done - not even once - but if we change this
behavior, it will break the existing tests.
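
The end-of-file rule can be sketched synchronously as follows. This assumes
std::string in place of temporary_buffer<char> and a plain return value in
place of a future; consume_until_eof is a hypothetical name:

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

using buffer = std::string;
// A set optional means "stop"; an unset one means "send more".
using consumer_fn = std::function<std::optional<buffer>(buffer)>;

// Feeds every chunk to the consumer and returns how many times it was called.
inline int consume_until_eof(const std::vector<buffer>& chunks,
                             const consumer_fn& consumer) {
    int calls = 0;
    for (const auto& chunk : chunks) {
        ++calls;
        if (consumer(chunk)) {
            return calls;  // the consumer asked to stop before end-of-file
        }
    }
    // End-of-file: signal it with exactly one empty buffer, then stop,
    // whatever the consumer returns. Repeating this call until the consumer
    // sets the unconsumed buffer is what caused the hang.
    ++calls;
    consumer(buffer());
    return calls;
}
```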
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Tested-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
We cannot capture keyspace_metadata by reference because it can be
allocated on the stack. Fixes SIGSEGV while running cassandra-stress.
The bug was introduced in commit cd35617 ("database: Use
keyspace_metadata for creation functions").
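
A reduced sketch of this class of bug and its fix. keyspace_info and
make_create_task are hypothetical stand-ins, not the actual code:

```cpp
#include <functional>
#include <string>

// Hypothetical, reduced stand-in for keyspace_metadata.
struct keyspace_info {
    std::string name;
};

// Returns a deferred task that outlives this function's stack frame.
inline std::function<std::string()> make_create_task() {
    keyspace_info ksm{"ks1"};  // lives on the stack
    // Capturing [&ksm] would leave the lambda holding a dangling reference
    // once this function returns. Capturing by value copies the object into
    // the closure, so it is still valid when the task finally runs.
    return [ksm] { return "create " + ksm.name; };
}
```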
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
The CQL server is really noisy during cassandra-stress run because it
prints out STARTUP message options. Fix that up.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Glauber says:
"This is the basic bloom filter implementation. We can read an existing
filter file, but we won't construct one (pending on Index creation)
TODO:
- add the tracker bloom filter (to track its statistics)
- write the filter file"
From Avi:
"This patchset prepares for adding sstables to the read path. Because sstables
involve I/O, their APIs return futures, which means that APIs that may call
those sstable APIs also need to return futures.
This patchset uses the two-space indent + do_with + reference aliases trick
to make patches more readable. Cleanup patches will follow once it is merged."
Pekka says:
"Clean up keyspace creation functions by using keyspace_metadata. While
at it, simplify creation by moving replication strategy initialization
to database.cc. This paves the way for adding keyspace metadata lookup
functionality to database that's needed by migration manager."
Our input_stream::consume() mechanism allows feeding data from an input
stream into a consumer, piece by piece, until the consumer doesn't want
any more. It currently assumes the input can block (when reading from disk),
but the consumption is assumed to be immediate. This patch adds support for
blocking in the consumption function: The consumer now returns a future
which it promises to fulfill after consuming the given buffer.
This patch goes further by somewhat simplifying (?) the interface of the
consumer. Instead of receiving a mysterious "done" lambda the consumer
is supposed to call when done (doesn't want any more input), the consumer
now returns a future<optional<temporary_buffer<char>>>, which means:
1. The future is fulfilled when the consumer is done with this buffer
and either wants more - or wants to stop.
2. If the consumer wants to stop, it returns the *remaining* part of the
buffer it didn't want to process (this will be pushed back into the
input stream).
3. If the consumer is not done, and wants to consume more, it returns an
unset optional.
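
The three cases above can be sketched synchronously. This assumes std::string
in place of temporary_buffer<char>, a plain return value in place of the
future, and a queue of chunks in place of the input stream; consume_chunks and
line_reader are hypothetical names:

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <optional>
#include <string>
#include <utility>

using buffer = std::string;
using consumer_fn = std::function<std::optional<buffer>(buffer)>;

// Feeds queued chunks to the consumer until it asks to stop.
// Returns true if the consumer stopped, false if the input ran out first.
inline bool consume_chunks(std::deque<buffer>& input, const consumer_fn& consumer) {
    while (!input.empty()) {
        buffer chunk = std::move(input.front());
        input.pop_front();
        // Case 1: in the real interface this result arrives through a future.
        std::optional<buffer> unconsumed = consumer(std::move(chunk));
        if (unconsumed) {
            // Case 2: stop, and push the unprocessed remainder back.
            if (!unconsumed->empty()) {
                input.push_front(std::move(*unconsumed));
            }
            return true;
        }
        // Case 3: unset optional, so the consumer wants more input.
    }
    return false;
}

// Example consumer: collects characters up to and including the first '\n'.
struct line_reader {
    std::string line;
    std::optional<buffer> operator()(buffer chunk) {
        std::size_t pos = chunk.find('\n');
        if (pos == std::string::npos) {
            line += chunk;
            return std::nullopt;  // want more
        }
        line += chunk.substr(0, pos + 1);
        return chunk.substr(pos + 1);  // done; hand back the rest
    }
};
```

Passing the consumer as std::ref(reader) keeps its state in the caller's
object instead of in a copy held by std::function.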
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Use the provided filter instead of always returning true. For existing tables,
this arrives from the bloom filter file. We don't yet fully write a bloom
filter file.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
This comes from Origin, but the changes I had to make are quite large.
These files also represent many files, but I found it inconvenient
to keep all the originals, simply because we would end up with way too many
files: one .cc and one .hh per filter + an enveloping .hh so users could include
without knowing which filter to use.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Passing all arguments as template parameters was a nice trick, made possible by
the fact that all fields we were dealing with were part of the sstable
structure.
However, as we progress, it would make sense for some fields to be short-lived.
In that case, that convenience turns into an inconvenience.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Origin calculates bloom filter based on the ByteBuffer value of the key,
not the token. As much as they themselves would like this to be different,
that filter ends up on disk, so we must follow it.
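
A minimal bloom filter sketch that is fed the key's raw bytes. It is an
illustration only: the seeded FNV-1a-style hash and the double-hashing scheme
are assumptions chosen for brevity, so the bits it sets are not compatible
with Origin's on-disk filters:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Seeded FNV-1a-style hash over raw bytes (a stand-in for the real hash).
inline uint64_t fnv1a(const std::string& bytes, uint64_t seed) {
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

class bloom_filter {
    std::vector<bool> _bits;
    unsigned _hashes;

    // Double hashing: index_i = (h1 + i * h2) mod number of bits.
    size_t bit_index(const std::string& key_bytes, unsigned i) const {
        uint64_t h1 = fnv1a(key_bytes, 0);
        uint64_t h2 = fnv1a(key_bytes, 0x9e3779b97f4a7c15ULL) | 1;
        return static_cast<size_t>((h1 + i * h2) % _bits.size());
    }
public:
    bloom_filter(size_t bits, unsigned hashes) : _bits(bits), _hashes(hashes) {}

    // The input is the serialized key itself, not the token derived from it.
    void add(const std::string& key_bytes) {
        for (unsigned i = 0; i < _hashes; ++i) {
            _bits[bit_index(key_bytes, i)] = true;
        }
    }
    // false means "definitely absent"; true means "possibly present".
    bool is_present(const std::string& key_bytes) const {
        for (unsigned i = 0; i < _hashes; ++i) {
            if (!_bits[bit_index(key_bytes, i)]) {
                return false;
            }
        }
        return true;
    }
};
```

Because the bit positions are derived from whatever bytes are hashed, a reader
must hash the same input the writer did, or lookups for existing keys can
wrongly report "absent".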
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
The cql3.cc file was used to make sure CQL header files were being
compiled in early stages of the translation. It's obsolete now and only
slows down compilation, so let's get rid of it.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Amnon says:
"This series adds the commitlog API. Current functionality is the
list of active files. After this patch a call to:
http://localhost:10000/commitlog/segments/active
Will return a list of the active commitlog files"
This adds the implementation of the commitlog API.
Current implementation contains:
/commitlog/segments/active
which returns a list of the active file names, and
/commitlog/segments/archiving
which always returns an empty list, as archiving is not supported at
the moment.
The doc file is under:
/api-doc/commitlog
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds a method to return a vector with full-path to the active
segment names. It will be used by the API.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Initialize replication strategy when keyspace is being created now that
we have access to keyspace_metadata.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Use the keyspace_metadata type for keyspace creation functions. This is
needed to be able to have a mapping from keyspace name to keyspace
metadata for various call-sites.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
The API needs the distributed database object to get information from it.
This adds a database reference to the API context so that it is
available in the API.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
Pekka says:
"We are going to keep the ks_meta_data class around and use it in core
code like the migration manager. Therefore, clean up the class and move
it to database.hh, where user_types_metadata is also defined. As a
bonus, this also fixes the circular dependency between ks_meta_data.hh
and database.hh."