After this change, user can query compression ratio on a per column
family basis with 'nodetool cfstats'.
look at 'nodetool cfstats' output:
./bin/nodetool cfstats ks.test5
Keyspace: ks
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: test5
SSTable count: 1
Space used (live): 4774
Space used (total): 4774
Space used by snapshots (total): 0
Off heap memory used (total): 131384
SSTable Compression Ratio: 0.833333
...
Fixes#636.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <a1bee5a23fe63787df3e387a88f2d216ba4a4134.1459802771.git.raphaelsc@scylladb.com>
"batchlog_manager is modified to allow the storage_service to initate a bachlog
replay operation.
Refs #1085.
Tested with tests/batchlog_manager_test and batch_test.py"
With big rows I see contention in XFS allocations which cause reactor
thread to sleep. Commitlog is a main offender, so enlarge extent to
commitlog segment size for big files (commitlog and sstable Data files).
Message-Id: <20160404110952.GP20957@scylladb.com>
This is a left over from the re ordering of the API init. The api_doc
should be set first, so later API registration will enable their
relevent swagger doc.
Currently, the swagger documentation of the system API is not available.
Fixes#1160
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1459750490-15996-1-git-send-email-amnon@scylladb.com>
Both build_rpm.sh and build_deb.sh will fail with "cannot stat 'xxx': No such file or directory" when scylla-server package is not installed, need to prevent it by --no-dereference option of cp.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1459523585-9108-1-git-send-email-syuu@scylladb.com>
This is another unit test for range tombstone merging, introduced in commit
0fc9a5ee4d and rewritten in commit
99ecda3c96.
In this test, a single large deletion was broken up into several smaller
ranges, all with the same time stamps, so we should recombine them into
one row tombstone, instead of failing the read.
The sstable in this test case was artificially created using json2sstable.
We don't know how yet to produce such a case using Cassandra 2, but we
have seen a similar occurance in the wild, in a real SSTable.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459429243-15821-1-git-send-email-nyh@scylladb.com>
Logging is used in many places including those that shouldn't really
throw any exceptions (destructors, noexcept functions).
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
abstract_read_executor::reconcile() is supposed to make sure that
_result_promise is eventually set to either a result or an exception.
That may not happen however if reconciliation throws any exception
since only read timeouts are being caught. When that happends the
continuation chain becomes stuck.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
HyperLogLog constructor promises that it only throws instances of
std::invalid_argument. That's a lie since it also adds elements to a
vector (and doesn't catch potential bad_allocs).
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Cassandra-derived tools (such as sstable2json) may write commitlog segments,
that Scylla cannot recognize. Since we now write them with a distinct name,
we can recognize the name and ignore these segments, as we know the data they
contain is not interesting.
Fixes#1112.
Message-Id: <1459356904-20699-1-git-send-email-avi@scylladb.com>
The check_marker() function is use as a sanity-check of data we read
from sstable, so instead of the header file key.hh, let's move it to
the sstable-parsing source file partition.cc.
In addition to having less code in header files, another benefit is
that the function can now throw a more specific exception (malformed
sstable exception).
Also fixed the exception's message (which had a second "%d" but only
one parameter).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459420430-5968-1-git-send-email-nyh@scylladb.com>
Until recently, we believed that range tombstones we read from sstables will
always be for entire rows (or more generalized clustering-key prefixes),
not for arbitrary ranges. But as we found out, because Cassandra insists
that range tombstones do not overlap, it may take two overlapping row
tombstones and convert them into three range tombstones which look like
general ranges (see the patch for a more detailed example).
Not only do we need to accept such "split" range tombstones, we also need
to convert them back to our internal representation which, in the above
example, involves two overlapping tombstones. This is what this patch does.
This patch also contains a test for this case: We created in Cassandra
an sstable with two overlapping deletions, and verify that when we read
it to Scylla, we get these two overlapping deletions - despite the
sstable file actually having contained three non-overlapping tombstones.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <b7c07466074bf0db6457323af8622bb5210bb86a.1459399004.git.glauber@scylladb.com>
The default shell in Ubuntu is "dash" which causes the following error
when "scylla-start" script is executed:
/start-scylla: 8: /start-scylla: source: not found
Message-Id: <1459406561-20141-1-git-send-email-penberg@scylladb.com>
This patch overrides the antlr3 function that allocates the missing
tokens that would eventually leak. The override stores these tokens in
a vector, ensuring memory is freed whenever the parser is destroyed.
Fixes#1147
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1459355146-17402-1-git-send-email-duarte@scylladb.com>
antlr3 leaks the token itself creates when recovering from a mismatch in
the case the missing token can be determined. Until this bug is fixed
or circumvented, the test should remain disabled.
Ref #1147
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1459345403-8243-1-git-send-email-duarte@scylladb.com>
During decommission, the storage_service::unbootstrap() needs to
initiate a batchlog replay operation. To sync the replay operation
initiated by the timer in batchlog_manager and storage_service, a
semaphore is introduced. To simplify the semaphore locking, the
management code now always runs on shard zero, but the real work is
distruted to all shards.
This is a rewrite of Glauber's earlier patch to do the same thing, taking
into account Avi's comments (do not use a class, do not throw from the
constructor, etc.). I also verified that the actual use case which was
broken in #1136 was fixed by this patch.
Currently, we have no support for range tombstones because CQL will not
generate them as of version 2.x. Thrift will, but we can safely leave this for
the future.
However, we have seen cases during a real migration in which a pure-CQL
Cassandra would generate range tombstones in its SSTables.
Although we are not sure how and why, those range tombstones were of a special
kind: their end and next's start range were adjacent, which means that in
reality, they could very well have been written as a single range tombstone for
an entire clustering key - which we support just fine.
This code will attempt to fix this problem temporarily by merging such ranges
if possible. Care must be taken so that we don't end up accepting a true
generic range tombstone by accident.
Fixes#1136
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459333972-20345-1-git-send-email-nyh@scylladb.com>
As Nadav noticed in his bug report, check_marker is creating its error messages
using characters instead of numbers - which is what we intended here in the
first place.
That happens because sprint(), when faced with an 8-byte type, interprets this
as a character. To avoid that we'll use uint16_t types, taking care not to
sign-extend them.
The bug also noted that one of the error messages is missing a parameter, and
that is also fixed.
Fixes#1122
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <74f825bbff8488ffeb1911e626db51eed88629b1.1459266115.git.glauber@scylladb.com>
When we, for some reason, fail to compact an SSTable, we do not log the file
name leaving us with cryptic messages that tell us what happened, but not where
it happened.
This patch adds logging in compaction so that we'll know what's going on.
Please note that readers are more of a concern, because the SSTable being
written technically do not exist yet. Still, better safe than sorry: if
open_data fails, or we leave an unfinished SSTable, it is still good to know
which one was the culprit.
Some argument can be made about whether we should log this at the lower SSTable
level, or at the compaction level.
The reason I am logging this at the compaction level, is that we don't really
know which exception will trigger, and where: it may be the case that we're
seeing exceptions that are not SSTable specific, and may not have the chance to
log it properly.
In particular, if the exception happens inside the reader: read_rows() and
friends only return a mutation reader, which doesn't really do anything until
we call read(). But at that time, we don't hold any pointers to the SSTable
anymore.
In Summary, logging at the compaction level guarantees that we always do it no
matter what. Exceptions that are part of the main SSTable path can log the file
name as well if they want: if that's the case, we'll be left with the name
appearing twice. That's totally harmless, and better than none.
Fixes#1123
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <c5c969fb6aeb788a037bd7a4ea69979c1042cb34.1459263847.git.glauber@scylladb.com>
commitlog's sync period is initialized as the batch period, and not as the
sync period itself as it should be.
I've found this by code inspection, but unless I am missing something
really fundamental, this seems to be completely wrong. It's been working
fine because in our defaults, I have checked that both variables default to
the same value. But it seems to me that as long as anyone would change one
of them, the behavior wouldn't be as expected.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <2e7c565242fe5d4481a3ee8b0ba425ef14f5e42a.1459252783.git.glauber@scylladb.com>
Sysv init script was added just for prevent warning message on lintian,
never really used by Ubuntu users. Result of that, we often break this
script since upstart/systemd unit file frequently changed. It may
confuse users, it's better to use Upstart only, just like Fedora/CentOS.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1459177601-20269-2-git-send-email-syuu@scylladb.com>
On some Fedora environments such as Fedora official AMI, dnf-yum package
is not installed by default, causes command not found error when we run
our setup scripts.
To prevent this, we need to add dnf-yum to scylla-server package
dependency.
Fixes#1106
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1459099744-23068-1-git-send-email-syuu@scylladb.com>
After 4e52b41a4, remove_by_toc_name() became aware of temporary TOC
files, however, it doesn't consider that some components may be
missing if temporary TOC is present.
When creating a new sstable, the first thing we do is to write all
components into temporary TOC, so content of a temporary TOC isn't
reliable until it is renamed.
Solution is about implementing the following flow (described by Avi):
"Flow should be:
- remove all components in parallel
- forgive ENOENT, since the compoent may not have been written;
otherwise deletion error should be raised
- fsync the directory
- delete the temporary TOC
"
This problem can be reproduced by running compaction without disk
space, so compaction would fail and leave a partial sstable that would
be marked for deletion. Afterwards, remove_by_toc_name() would try to
delete a component that doesn't exist because it looked at the content
of temporary TOC.
Fixes#1095.
Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com>
Message-Id: <0cfcaacb43cc5bad3a8a7ea6c1fa6f325c5de97d.1459194263.git.raphaelsc@scylladb.com>