"Kubernetes is unhappy with our Docker image because we start systemd
under the hood. Fix that by switching to use "supervisord" to manage the
two processes -- "scylla" and "scylla-jmx":
http://blog.kunicki.org/blog/2016/02/12/multiple-entrypoints-in-docker/
While at it, fix up "docker logs" and "docker exec cqlsh" to work
out-of-the-box, and update our documentation to match what we have.
Further work is needed to ensure Scylla production configuration works
as expected and is documented accordingly."
Calls like later() and with_gate() may allocate memory, although that is not
very common. This can create a problem in the sense that it will potentially
recurse and bring us back to the allocator during free - which is the very thing
we are trying to avoid with the call to later().
This patch wraps the relevant calls in the reclaimer lock. This does mean that the
allocation may fail if we are under severe pressure - which includes having
exhausted all reserved space - but at least we won't recurse back to the
allocator.
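The trade-off described above can be illustrated with a minimal toy model. This is only a sketch, not the Scylla allocator: `ToyAllocator`, `reclaimer_lock()` and `ReclaimLockedError` are hypothetical names made up for this illustration. While the lock is held, an allocation that would need to reclaim fails instead of re-entering reclaim.

```python
class ReclaimLockedError(MemoryError):
    """Raised when an allocation would need reclaim while reclaim is locked."""


class ToyAllocator:
    """Toy stand-in for an allocator that reclaims memory on demand."""

    def __init__(self, free_segments):
        self.free_segments = free_segments
        self.reclaim_locked = 0   # nesting depth of held reclaimer locks
        self.reclaim_calls = 0

    def reclaimer_lock(self):
        return _ReclaimerLock(self)

    def allocate(self):
        if self.free_segments == 0:
            if self.reclaim_locked:
                # Fail the allocation rather than recurse into reclaim.
                raise ReclaimLockedError("no free segments and reclaim is locked")
            self._reclaim()
        self.free_segments -= 1

    def _reclaim(self):
        # Pretend that reclaiming always frees exactly one segment.
        self.reclaim_calls += 1
        self.free_segments += 1


class _ReclaimerLock:
    """Context manager that disables reclaim for the duration of a block."""

    def __init__(self, allocator):
        self._allocator = allocator

    def __enter__(self):
        self._allocator.reclaim_locked += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._allocator.reclaim_locked -= 1
        return False  # never swallow exceptions
```

Code that must not re-enter the allocator would run inside `with allocator.reclaimer_lock():` and be prepared to handle the failure.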
To make sure we do this as early as possible, we just fold both release_requests
and do_release_requests into a single function.
Thanks Tomek for the suggestion.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <980245ccc17960cf4fcbbfedb29d1878a98d85d8.1470254846.git.glauber@scylladb.com>
Fix Docker Hub documentation to match what we have right now.
More work is needed in the following areas:
* How to make a cluster
* How to configure Docker image for production use
We configure the hostname in the "CQLSH_HOST" environment variable but
that is only picked up if we first start the shell. Set up the hostname
in the $HOME/.cqlshrc file instead so that we can start "cqlsh" directly:
docker exec -it scylla cqlsh
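As a rough illustration of what such a file could contain, the sketch below renders a `[connection]` section with `hostname` and `port` keys. The helper name `render_cqlshrc`, the example address and the port value are assumptions for this example, not the values used by the image.

```python
import configparser
import io


def render_cqlshrc(hostname, port=9042):
    """Return the text of a minimal cqlshrc pointing cqlsh at `hostname`."""
    cfg = configparser.ConfigParser()
    cfg["connection"] = {"hostname": hostname, "port": str(port)}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Hypothetical container address, for illustration only.
cqlshrc_text = render_cqlshrc("172.17.0.2")
```

The resulting text would be written to `$HOME/.cqlshrc` when the container starts.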
We don't have systemd running on the image so "journalctl" is useless.
Log to stdout instead which has the nice benefit of making "docker logs"
produce meaningful output on the host.
Issue 1510 describes a scenario in which, under load, we allocate memory within
release_requests() leading to a reentry into an invalid state in our
blocked requests' shared_promise.
This is not easy to trigger since not all allocations will actually get to the
point in which they need a new segment, let alone have that happening during
another allocator call.
This kind of reentry is something we have always sought to avoid with
release_requests(): this is the reason why most of the actual routine is
deferred after a call to later().
However, that is a trick we cannot use for updating the state of the blocked
requests' shared_promise: we can't guarantee when that is going to run, and we
always need a valid shared_promise, in a valid state, waiting for new requests
to hook into.
The solution employed by this patch is to make sure that no allocation
operations whatsoever happen during the initial part of release_requests on
behalf of the shared promise. Allocation is now deferred to first use, which
relieves release_requests() from all allocation duties. All it needs to do is
free the old object and signal to its user that an allocation is needed (by
storing {} into the shared_promise).
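The deferred-allocation idea can be sketched in a few lines. This is a simplified model, not the actual code: `BlockedRequests` is a hypothetical class, a plain list of callbacks stands in for the shared_promise, and `None` plays the role of the empty `{}` value.

```python
class BlockedRequests:
    """Toy model: the waiter list is allocated on first use, not on release."""

    def __init__(self):
        self._promise = None   # None means "allocate on first use"
        self.allocations = 0   # counts how many times we had to allocate

    def _ensure_promise(self):
        if self._promise is None:
            self._promise = []
            self.allocations += 1
        return self._promise

    def wait(self, callback):
        """Register a blocked request; this is where allocation may happen."""
        self._ensure_promise().append(callback)

    def release_requests(self):
        """Detach the old waiter list without allocating a new one."""
        old, self._promise = self._promise, None
        return old or []
```

In this model `release_requests()` never allocates; the cost moves to the first `wait()` call after a release.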
Fixes #1510
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <49771e51426f972ddbd4f3eeea3cdeef9cc3b3c6.1470238168.git.glauber@scylladb.com>
Useful for triggering a core dump on allocation failure inside LSA,
which makes it easier to debug allocation failures. They normally
don't cause aborts, just fail the current operation, which makes it
hard to figure out what caused the allocation failure.
Message-Id: <1470233631-18508-1-git-send-email-tgrabiec@scylladb.com>
This patch adds the Prometheus API: it adds the proto library to the
compilation, adds an optional configuration parameter to change the
Prometheus listening port, and starts the Prometheus API in main.
To disable the prometheus API, set its listening port to 0.
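The "port 0 disables the API" convention can be sketched as follows. The option name `prometheus_port` and the default value 9180 are assumptions made for this illustration; the commit message does not name either.

```python
def prometheus_port_or_none(config, default_port=9180):
    """Return the port to listen on, or None if the API is disabled.

    `config` is a plain dict standing in for the parsed configuration.
    The key name and the default port are assumptions for this sketch.
    """
    port = int(config.get("prometheus_port", default_port))
    if port == 0:
        return None  # a listening port of 0 means "do not start the API"
    return port
```

A caller in the startup path would skip starting the server when the helper returns `None`.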
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1470231628-22831-2-git-send-email-amnon@scylladb.com>
data_consume_rows_context needs to have close() called and the returned
future waited for before it can be destroyed. data_consume_context::impl
does that in the background upon its destruction.
However, it is possible that the sstable is removed before
data_consume_rows_context::close() completes in which case EBADF may
happen. The solution is to make data_consume_context::impl keep a
reference to the sstable and extend its lifetime until closing of
data_consume_rows_context (which is performed in the background)
completes.
A side effect of this change is that data_consume_context no longer
requires its user to make sure that the sstable exists as long as it is
in use since it owns its own reference to it.
Fixes #1537.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1470222225-19948-1-git-send-email-pdziepak@scylladb.com>
Switch to supervisord to manage the two processes we have: Scylla server
and Scylla JMX proxy. We need this to make the Docker image run under
Kubernetes, which now fails as follows as we try to start the systemd
init process:
Couldn't find an alternative telinit implementation to spawn.
I have not seen other people hitting the issue, except for GitLab Docker
image:
https://gitlab.com/gitlab-org/gitlab-ce/issues/18612
which "solved" the problem by not running init...
https://gitlab.com/gitlab-org/omnibus-gitlab/merge_requests/838/diffs
Furthermore, the "supervisord" approach seems to be what people actually
use in Docker land:
http://blog.kunicki.org/blog/2016/02/12/multiple-entrypoints-in-docker/
The only downside is that we now sort of duplicate functionality that's
already in the systemd configuration files. However, we should work
towards Scylla figuring out its configuration rather than compose a long
list of command line arguments. Once we do that, the duplication in
Docker supervisord scripts disappears.
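For illustration, the sketch below renders a supervisord configuration with one program section per process and logs routed to stdout. The command paths are hypothetical placeholders, and the exact set of options used by the image may differ.

```python
import configparser
import io


def render_supervisord_conf(programs):
    """Render a supervisord config that runs `programs` in the foreground.

    `programs` maps a program name to its command line.
    """
    cfg = configparser.ConfigParser()
    # Keep supervisord in the foreground so it can act as the container's
    # main process.
    cfg["supervisord"] = {"nodaemon": "true"}
    for name, command in programs.items():
        cfg["program:" + name] = {
            "command": command,
            # Send output to the container's stdout, without log rotation.
            "stdout_logfile": "/dev/stdout",
            "stdout_logfile_maxbytes": "0",
            "redirect_stderr": "true",
        }
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()


conf = render_supervisord_conf({
    "scylla": "/usr/bin/scylla",                      # hypothetical path
    "scylla-jmx": "/usr/lib/scylla/jmx/scylla-jmx",   # hypothetical path
})
```

Any command line arguments would be appended to the `command` values, which is the duplication with the systemd unit files mentioned above.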
"This patch series ensures we don't count dead partitions (i.e.,
partitions with no live rows) towards the partition_limit. We also
enforce the partition limit at the storage_proxy level, so that
limits with smp > 1 work correctly."
With this patch we stop counting dead partitions (i.e., partitions
containing only tombstones) towards the partition limit, which
should apply only to partitions with live rows.
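The counting rule can be sketched as below. This is a simplified model: partitions are `(key, live_row_count)` pairs and dead partitions are dropped from the result, which is an assumption of this example rather than a description of the real query path.

```python
def limit_live_partitions(partitions, partition_limit):
    """Return keys of at most `partition_limit` live partitions.

    Partitions with no live rows (only tombstones) are skipped and do not
    consume any of the limit.
    """
    result = []
    for key, live_rows in partitions:
        if len(result) == partition_limit:
            break
        if live_rows == 0:
            continue  # dead partition: not counted towards the limit
        result.append(key)
    return result
```

With the old interpretation, the two dead partitions in an input such as `[("a", 0), ("b", 2), ("c", 0), ("d", 1)]` would have used up a limit of 2 before any live data was returned.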
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This reverts commit 141ea49e05.
There was confusion around the meaning of "partition limit".
Parts of our code interpreted it just as "maximum number of partitions".
This is also how Cassandra behaves.
However, the other parts of the code, including data query, interpreted
it as "maximum number of live partitions" or otherwise skipped dead
partitions resulting in #1447.
A decision has been made to stick to the "maximum number of live
partitions" interpretation everywhere. The consequences are, among
others, that the patch reverted by this one is no longer correct.
While the actual series fixing the interpretations of partition limit
and getting rid of the confusion is yet to come, the purpose of this
revert is to make backporting easier (as the patch being reverted
hasn't made it to branch-1.3 yet).
This patch ensures that when the schema is dense, regardless of
compact_storage being set, the single regular column is translated
into a compact column.
This fixes an issue where Thrift dynamic column families are
translated to a dense schema with a regular column, instead of a
compact one.
Since a compact column is also a regular column (e.g., for purposes of
querying), no further changes are required.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470062410-1414-1-git-send-email-duarte@scylladb.com>
This series adds the ability for partition cache to keep information about
whether partition size makes it uncacheable. During reads, these
entries save us IO operations since we already know that the partition
is too big to be put in the cache.
The first part of the patchset makes all mutation_readers allow the
streamed_mutations they produce to outlive them, which is a guarantee
used later by the code handling reading large partitions.
Inherit the alignment parameters from the underlying file instead of
defaulting to 4096. This gives better read performance on disks with 512-byte
sectors.
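The effect of the alignment on the amount of data read can be worked through with a small helper. `aligned_read_range` is a hypothetical function written for this example; it rounds the start down and the end up to the given alignment.

```python
def aligned_read_range(offset, length, alignment):
    """Return (start, size) of the aligned range covering the request."""
    start = offset - offset % alignment                    # round down
    end = -(-(offset + length) // alignment) * alignment   # round up
    return start, end - start


# A 100-byte read at offset 1000:
with_4096 = aligned_read_range(1000, 100, 4096)   # (0, 4096)
with_512 = aligned_read_range(1000, 100, 512)     # (512, 1024)
```

For this request the 512-byte alignment transfers a quarter of the data that the 4096-byte alignment does.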
Fixes #1532.
Message-Id: <1470122188-25548-1-git-send-email-avi@scylladb.com>
"schema_altering_statement is both a cql_statement and a
prepared_statement.
This makes it hard to understand because virtual functions from both
base classes are present, and hard to separate the raw and prepared
variants.
This patchset removes schema_altering_statement::prepare() (and the
enable_shared_from_this<> that makes it work) in preparation for
splitting its subclasses into raw and prepared variants (note that
create_table_statement was already split)."
Fixes errors like the one below:
(gdb) scylla memory
Python Exception <class 'gdb.error'> A syntax error in expression, near `memory::cpu_mem'.:
Error occurred in Python command: A syntax error in expression, near `memory::cpu_mem'.
Wrapping the symbol in quotes instructs GDB to look it up in the global
context instead of the context of the current frame.
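A minimal sketch of the quoting step follows. `quote_symbol` is a hypothetical helper; inside a GDB Python script its result would be passed to `gdb.parse_and_eval()`, which is only mentioned in a comment here so that the snippet runs outside GDB.

```python
def quote_symbol(name):
    """Wrap a symbol name in single quotes for use in a GDB expression."""
    return "'%s'" % name


expr = quote_symbol("memory::cpu_mem")
# Inside GDB one would then evaluate it, for example:
#   value = gdb.parse_and_eval(expr)
```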
Message-Id: <1470050751-3167-1-git-send-email-tgrabiec@scylladb.com>
There was no way to set up the correct repo when the AMI is built with the --localrpm option, since the AMI does not have access to the 'version' file and we did not pass the repo URL to the AMI.
So detect the optimal repo path when starting the AMI build, pass the repo URL to the AMI, and set it up correctly.
Note: this changes the behavior of the --repo option of build_ami.sh/scylla_install_pkg.
It used to be a repository URL, but is now a .repo/.list file URL.
This is optimal for distributions that require third-party packages to install Scylla, like CentOS 7.
Existing shell scripts that invoke build_ami.sh, such as our Jenkins jobs, need to be changed accordingly.
Fixes #1414
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1469636377-17828-1-git-send-email-syuu@scylladb.com>
This series handles two issues:
* Moving to python2: though python3 is supported, there are modules we
need that are not installable via rpm. python3 will have to wait until it is
more mature.
* The version check should send the current version when it checks for a new
one, and a simple string compare is wrong.
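Why a plain string compare is wrong can be shown with a short example. This sketch assumes purely numeric, dot-separated versions; suffixes such as release-candidate tags would need extra handling. The function names are made up for this illustration.

```python
def parse_version(version):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_newer(latest, current):
    """Return True if `latest` is a higher version than `current`."""
    return parse_version(latest) > parse_version(current)


# Comparing as strings goes character by character, so "1.10.0" sorts
# before "1.9.0" because '1' < '9'.
string_compare_says_newer = "1.10.0" > "1.9.0"
numeric_compare_says_newer = is_newer("1.10.0", "1.9.0")
```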
In preparation for splitting raw and prepared variants of subclasses
of schema_altering_statement, remove schema_altering_statement::prepare().
All subclasses already implement it themselves (by creating a new instance).
In preparation for the removal of schema_altering_statement::prepare(),
add a fake create_table_statement::prepare(). create_table_statement
has already been split to raw and prepared variants, so this prepare()
will never be called, but it is required because schema_altering_statement
is both a cql_statement and a prepared_statement. This confusion will
be fixed later on.
To prepare for the separation of prepared and raw schema_altering_statements,
avoid relying on this class implementing both the raw and prepared
variants and on the use of shared_from_this().
Once we encounter a wide partition, store information
about this in the cache entry and don't try to read it all
and cache it the next time it's requested.
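The marker idea can be sketched with a toy cache. This is not the row cache implementation: `ToyRowCache`, the row-count threshold and the dict-backed source are all assumptions made for this illustration.

```python
_WIDE = object()  # marker stored in place of the data for wide partitions


class ToyRowCache:
    """Toy cache that remembers which partitions are too wide to cache."""

    def __init__(self, source, max_cached_rows):
        self._source = source          # dict: key -> list of rows (stands in for sstables)
        self._max = max_cached_rows
        self._entries = {}
        self.full_reads = 0            # attempts to read a whole partition for caching

    def read(self, key):
        entry = self._entries.get(key)
        if entry is _WIDE:
            # Known to be wide: stream from the source, bypassing the cache.
            return iter(self._source[key])
        if entry is not None:
            return iter(entry)         # cache hit
        self.full_reads += 1
        rows = list(self._source[key])
        # Remember either the data or the fact that it does not fit.
        self._entries[key] = _WIDE if len(rows) > self._max else rows
        return iter(rows)
```

After the first read of a wide partition, later reads go straight to the streaming path and no longer attempt to materialize it.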
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
[Paweł: rebased, moved large partition reading logic to
cache_entry::read_wide()]
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
In case of failure, BOOST_REQUIRE_EQUAL() is nicer and prints the actual
values that were supposed to be equal, but aren't.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>