10157 Commits

Author SHA1 Message Date
Avi Kivity
28ee2bdbd2 Merge "Docker image fixes" from Pekka
"Kubernetes is unhappy with our Docker image because we start systemd
under the hood. Fix that by switching to use "supervisord" to manage the
two processes -- "scylla" and "scylla-jmx":

  http://blog.kunicki.org/blog/2016/02/12/multiple-entrypoints-in-docker/

While at it, fix up "docker logs" and "docker exec cqlsh" to work
out-of-the-box, and update our documentation to match what we have.

Further work is needed to ensure Scylla production configuration works
as expected and is documented accordingly."
2016-08-04 15:11:18 +03:00
Pekka Enberg
394c8f8c4f dist/docker: Document Scylla cluster setup
Add instructions on how to make a cluster of two Scylla nodes.
2016-08-04 12:20:46 +03:00
Glauber Costa
fe6a0d97d1 logalloc: make sure allocations in release_requests don't recurse back into the allocator
Calls like later() and with_gate() may allocate memory, although that is not
very common. This can create a problem in the sense that it will potentially
recurse and bring us back to the allocator during free - which is the very thing
we are trying to avoid with the call to later().

This patch wraps the relevant calls in the reclaimer lock. This does mean that the
allocation may fail if we are under severe pressure - which includes having
exhausted all reserved space - but at least we won't recurse back to the
allocator.

To make sure we do this as early as possible, we just fold both release_requests
and do_release_requests into a single function.

Thanks Tomek for the suggestion.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <980245ccc17960cf4fcbbfedb29d1878a98d85d8.1470254846.git.glauber@scylladb.com>
2016-08-04 11:16:53 +02:00
Pekka Enberg
7deddbe17a dist/docker: Fix Docker Hub documentation
Fix Docker Hub documentation to match what we have right now.

More work is needed in the following areas:

* How to make a cluster
* How to configure Docker image for production use
2016-08-04 10:05:08 +03:00
Pekka Enberg
6c8c60a5fc dist/docker: Setup hostname in cqlshrc
We configure the hostname in the "CQLSH_HOST" environment variable but
that is only picked up if we first start the shell. Setup the hostname
in $HOME/.cqlshrc file instead so that we can start "cqlsh" directly:

  docker exec -it scylla cqlsh
2016-08-04 09:57:08 +03:00
Pekka Enberg
d0aeb53e7c dist/docker: Log to stdout instead of syslog
We don't have systemd running on the image so "journalctl" is useless.
Log to stdout instead which has the nice benefit of making "docker logs"
produce meaningful output on the host.
2016-08-04 09:46:26 +03:00
Glauber Costa
ad58691afb logalloc: make sure blocked requests memory allocations are served from the standard allocator
Issue 1510 describes a scenario in which, under load, we allocate memory within
release_requests() leading to a reentry into an invalid state in our
blocked requests' shared_promise.

This is not easy to trigger since not all allocations will actually get to the
point in which they need a new segment, let alone have that happening during
another allocator call.

Reentry of this kind is something we have always sought to avoid with
release_requests(): this is the reason why most of the actual routine is
deferred after a call to later().

However, that is a trick we cannot use for updating the state of the blocked
requests' shared_promise: we can't guarantee when that is going to run, and we
always need a valid shared_promise, in a valid state, waiting for new requests
to hook into.

The solution employed by this patch is to make sure that no allocation
operations whatsoever happen during the initial part of release_requests on
behalf of the shared promise.  Allocation is now deferred to first use, which
relieves release_requests() from all allocation duties. All it needs to do is
free the old object and signal to its user that an allocation is needed (by
storing {} into the shared_promise).

Fixes #1510

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <49771e51426f972ddbd4f3eeea3cdeef9cc3b3c6.1470238168.git.glauber@scylladb.com>
2016-08-03 20:40:30 +02:00
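The "defer allocation to first use" idea in ad58691afb can be sketched in a few lines. This is only a Python analogy under stated assumptions, not the Seastar/LSA code: SharedPromise and BlockedRequests are hypothetical stand-ins, and None plays the role of the empty {} stored into the shared_promise.

```python
class SharedPromise:
    """Hypothetical stand-in for the real shared_promise; only tracks waiters."""

    def __init__(self):
        self.waiters = []


class BlockedRequests:
    """Toy model: the promise is created on first use, not during release."""

    def __init__(self):
        # Empty slot, analogous to storing {} as described in the commit.
        self._promise = None

    def release_requests(self):
        # Only drop the old object here. Creating the replacement is deferred,
        # so this path creates nothing on behalf of the promise.
        self._promise = None

    def promise(self):
        # First use after a release: this is where the new object is created.
        if self._promise is None:
            self._promise = SharedPromise()
        return self._promise
```

The point of the arrangement is that the release path never needs a new object, so it cannot reenter the allocator for that reason; the cost moves to the next caller that hooks into the promise.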
Tomasz Grabiec
9476bc5a31 Introduce --abort-on-lsa-bad-alloc command line option
Useful for triggering a core dump on allocation failure inside LSA,
which makes it easier to debug allocation failures. They normally
don't cause aborts, just fail the current operation, which makes it
hard to figure out what caused the allocation failure.

Message-Id: <1470233631-18508-1-git-send-email-tgrabiec@scylladb.com>
2016-08-03 17:26:44 +03:00
Avi Kivity
9df4ac53e5 conf: synchronize internode_compression between scylla.yaml and code
Our default is "none", to give reasonable performance, so have scylla.yaml
reflect that.
2016-08-03 16:50:48 +03:00
Amnon Heiman
b18b067b26 Add prometheus API
This patch adds the prometheus API. It adds the proto library to the
compilation, adds an optional configuration parameter to change the
prometheus listening port, and starts the prometheus API in main.

To disable the prometheus API, set its listening port to 0.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1470231628-22831-2-git-send-email-amnon@scylladb.com>
2016-08-03 16:49:42 +03:00
Amnon Heiman
bb4268a8a5 Add prometheus API
This patch adds the prometheus API. It adds the proto library to the
compilation, adds an optional configuration parameter to change the
prometheus listening port, and starts the prometheus API in main.

To disable the prometheus API, set its listening port to 0.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <1470228764-19545-2-git-send-email-amnon@scylladb.com>
2016-08-03 15:55:18 +03:00
Duarte Nunes
1516cd4c08 schema: Dense schemas are correctly upgraded
When upgrading a dense schema, we would drop the cells of the regular
(compact) column. This patch fixes this by making the regular and
compact column kinds compatible.

Fixes #1536

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470172097-7719-1-git-send-email-duarte@scylladb.com>
2016-08-03 13:39:01 +02:00
Paweł Dziepak
02ffc28f0d sstables: extend sstable life until reader is fully closed
data_consume_rows_context needs to have close() called and the returned
future waited for before it can be destroyed. data_consume_context::impl
does that in the background upon its destruction.

However, it is possible that the sstable is removed before
data_consume_rows_context::close() completes in which case EBADF may
happen. The solution is to make data_consume_context::impl keep a
reference to the sstable and extend its lifetime until closing of
data_consume_rows_context (which is performed in the background)
completes.

A side effect of this change is that data_consume_context no longer
requires its user to make sure that the sstable exists as long as it is
in use since it owns its own reference to it.

Fixes #1537.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1470222225-19948-1-git-send-email-pdziepak@scylladb.com>
2016-08-03 13:19:08 +02:00
Pekka Enberg
59bd5e485b dist/docker: Use supervisord to manage multiple processes
Switch to supervisord to manage the two processes we have: Scylla server
and Scylla JMX proxy. We need this to make the Docker image run under
Kubernetes, which now fails as follows as we try to start the systemd
init process:

  Couldn't find an alternative telinit implementation to spawn.

I have not seen other people hitting the issue, except for GitLab Docker
image:

  https://gitlab.com/gitlab-org/gitlab-ce/issues/18612

which "solved" the problem by not running init...

  https://gitlab.com/gitlab-org/omnibus-gitlab/merge_requests/838/diffs

Furthermore, the "supervisord" approach seems to be what people actually
use in Docker land:

  http://blog.kunicki.org/blog/2016/02/12/multiple-entrypoints-in-docker/

The only downside is that we now sort of duplicate functionality that's
already in the systemd configuration files. However, we should work
towards Scylla figuring out its configuration rather than compose a long
list of command line arguments. Once we do that, the duplication in
Docker supervisord scripts disappears.
2016-08-03 11:59:04 +03:00
Paweł Dziepak
5f11a727c9 Merge "partition_limit: Don't count dead partitions" from Duarte
"This patch series ensures we don't count dead partitions (i.e.,
partitions with no live rows) towards the partition_limit. We also
enforce the partition limit at the storage_proxy level, so that
limits with smp > 1 work correctly."
2016-08-03 09:49:30 +01:00
Duarte Nunes
db1118e4f7 database_test: Add case for partition limit
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 22:11:15 +00:00
Duarte Nunes
84e3969014 mutation_query_test: Add test for partition limit
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 22:08:39 +00:00
Duarte Nunes
b0c5996580 read_command: Add comment explaining partition_limit
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
Duarte Nunes
54ad038aa6 storage_proxy: Enforce partition_limit
This patch enforces the partition_limit at the
mutation_result_merger.

Ref #693

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
Duarte Nunes
ec490ffaba query_result_builder: Don't count dead partitions
With this patch we stop counting dead partitions (i.e., partitions
containing only tombstones) towards the partition limit, which
should apply only to partitions with live rows.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
Duarte Nunes
167e400ca8 compact_mutation: Don't count dead partitions
With this patch we stop counting dead partitions (i.e., partitions
containing only tombstones) towards the partition limit, which should
apply only to partitions with live rows.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
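The "live partitions only" interpretation used by ec490ffaba and 167e400ca8 can be illustrated with a small sketch. It is a simplification under assumptions: each partition is modelled as a hypothetical (key, live_rows) pair, where an empty live_rows list stands for a partition containing only tombstones. The function name take_live_partitions is made up for this example and does not appear in the Scylla code.

```python
def take_live_partitions(partitions, partition_limit):
    """Return at most partition_limit partitions that have live rows.

    partitions: iterable of (key, live_rows) pairs (hypothetical model).
    Dead partitions (no live rows) are skipped and do not use up the limit.
    """
    result = []
    for key, live_rows in partitions:
        if not live_rows:
            continue  # dead partition: only tombstones, not counted
        if len(result) == partition_limit:
            break  # limit reached by live partitions alone
        result.append((key, live_rows))
    return result


if __name__ == "__main__":
    data = [("a", []), ("b", [1]), ("c", []), ("d", [2]), ("e", [3])]
    # With a limit of 2, the dead partitions "a" and "c" are skipped.
    print(take_live_partitions(data, 2))
```

Under the other interpretation ("maximum number of partitions"), the same input with a limit of 2 would stop after "a" and "b" and return a single live partition, which is the kind of mismatch the revert in 0f902738f0 describes.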
Paweł Dziepak
0f902738f0 Revert "storage_proxy: Enforce partition_limit"
This reverts commit 141ea49e05.

There was confusion around the meaning of "partition limit".
Parts of our code interpreted it just as "maximum number of partitions".
This is also how Cassandra behaves.

However, the other parts of the code, including data query, interpreted
it as "maximum number of live partitions" or otherwise skipped dead
partitions resulting in #1447.

A decision has been made to stick to the "maximum number of live
partitions" interpretation everywhere. The consequences are, among
others, that the patch reverted by this one is no longer correct.

While the actual series fixing the interpretations of partition limit
and getting rid of the confusion is yet to come, the purpose of this
revert is to make backporting easier (as the patch being reverted
hasn't made it to branch-1.3 yet).
2016-08-02 16:53:01 +01:00
Duarte Nunes
5995aebf39 schema_builder: Ensure dense tables have compact col
This patch ensures that when the schema is dense, regardless of
compact_storage being set, the single regular column is translated
into a compact column.

This fixes an issue where Thrift dynamic column families are
translated to a dense schema with a regular column, instead of a
compact one.

Since a compact column is also a regular column (e.g., for purposes of
querying), no further changes are required.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470062410-1414-1-git-send-email-duarte@scylladb.com>
2016-08-02 14:49:13 +02:00
Avi Kivity
fbc3377ad4 row_cache: add a counter for a miss that did not result in an insertion
Such misses are due to concurrent access to the same key.  Add a counter
to track this as it results in unnecessary I/O being performed.

See #1534.
Message-Id: <1470139871-14693-1-git-send-email-avi@scylladb.com>
2016-08-02 14:14:27 +02:00
Tomasz Grabiec
d2ed75c9ff Merge 'pdziepak/row-cache-wide-entries/v4' from seastar-dev.git
This series adds the ability for the partition cache to keep information
on whether a partition's size makes it uncacheable. During reads, these
entries save us IO operations since we already know that the partition
is too big to be put in the cache.

The first part of the patchset makes all mutation_readers allow the
streamed_mutations they produce to outlive them, which is a guarantee
used later by the code handling reading large partitions.
2016-08-02 12:26:56 +02:00
Duarte Nunes
141ea49e05 storage_proxy: Enforce partition_limit
This patch enforces the partition_limit at the
mutation_result_merger.

Ref #693

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470065526-3174-1-git-send-email-duarte@scylladb.com>
2016-08-02 10:10:43 +01:00
Avi Kivity
9f35e4d328 checked_file: preserve DMA alignment
Inherit the alignment parameters from the underlying file instead of
defaulting to 4096.  This gives better read performance on disks with 512-byte
sectors.

Fixes #1532.
Message-Id: <1470122188-25548-1-git-send-email-avi@scylladb.com>
2016-08-02 10:03:02 +01:00
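The commit 9f35e4d328 does not spell out why the smaller alignment helps. One plausible reading, assuming DMA reads must start and end on alignment boundaries, is that a coarser alignment forces more bytes to be read than were requested. The helper aligned_span and the numbers below are hypothetical and only illustrate that arithmetic.

```python
def aligned_span(offset, length, alignment):
    """Return (start, size) of the smallest aligned range covering the request.

    The start is rounded down and the end is rounded up to a multiple of
    alignment, as an aligned read would require under the stated assumption.
    """
    start = (offset // alignment) * alignment
    end = -(-(offset + length) // alignment) * alignment  # ceiling division
    return start, end - start


if __name__ == "__main__":
    # A 100-byte request at offset 1000 with two different alignments.
    print(aligned_span(1000, 100, 4096))
    print(aligned_span(1000, 100, 512))
```

In this sketch the 4096-byte alignment covers the request with a 4096-byte read, while the 512-byte alignment needs only 1024 bytes.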
Paweł Dziepak
d7123d21eb Merge "cql3: remove schema_altering_statement::prepare" from Avi
"schema_altering_statement is both a cql_statement and a
prepared_statement.
This makes it hard to understand because virtual functions from both
base classes are present, and hard to separate the raw and prepared
variants.

This patchset removes schema_altering_statement::prepare() (and the
enable_shared_from_this<> that makes it work) in preparation for
splitting its subclasses into raw and prepared variants (note that
create_table_statement was already split)."
2016-08-02 09:46:55 +01:00
Tomasz Grabiec
0c1bf6c861 scylla-gdb.py: Fix lookup of global symbols
Fixes errors like the one below:

  (gdb) scylla memory
  Python Exception <class 'gdb.error'> A syntax error in expression, near `memory::cpu_mem'.:
  Error occurred in Python command: A syntax error in expression, near `memory::cpu_mem'.

Wrapping the symbol in quotes instructs GDB to lookup in the global
context instead of the context of current frame.
Message-Id: <1470050751-3167-1-git-send-email-tgrabiec@scylladb.com>
2016-08-01 13:51:15 +01:00
Takuya ASADA
9b59bb59f2 dist/ami: Install scylla metapackage and debuginfo on AMI
Install scylla metapackage and debuginfo on AMI to make it easier to report bugs on the AMI.
Fixes #1496

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1469635071-16821-1-git-send-email-syuu@scylladb.com>
2016-07-31 18:37:41 +03:00
Takuya ASADA
89b790358e dist/common/scripts: disable coredump compression by default, add an argument to enable compression on scylla_coredump_setup
On large-memory machines compression takes too long, so disable it by default.
Also provide a way to enable it again.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1469706934-6280-1-git-send-email-syuu@scylladb.com>
2016-07-31 18:36:51 +03:00
Takuya ASADA
d3746298ae dist/ami: setup correct repository when --localrpm specified
There was no way to set up the correct repo when the AMI is built with the --localrpm option, since the AMI does not have access to the 'version' file and we didn't pass the repo URL to the AMI.
So detect the optimal repo path when starting to build the AMI, pass the repo URL to the AMI, and set it up correctly.

Note: this changes the behavior of the --repo option of build_ami.sh/scylla_install_pkg.
It used to be a repository URL, but is now a .repo/.list file URL.
This is optimal for distributions that require 3rd-party packages to install scylla, like CentOS7.
Existing shell scripts that invoke build_ami.sh, such as our Jenkins jobs, need to be changed accordingly.

Fixes #1414

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1469636377-17828-1-git-send-email-syuu@scylladb.com>
2016-07-31 18:36:21 +03:00
Avi Kivity
ec62f0d321 Merge "housekeeping: Switch to python2 and handle version" from Amnon
This series handles two issues:
* Moving to python2. Though python3 is supported, there are modules that we
  need that are not rpm installable; python3 will wait until it is more
  mature.

* The version check should send the current version when it checks for a new
  one, and a simple string compare is wrong.
2016-07-31 14:55:36 +03:00
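The remark in ec62f0d321 that a simple string compare is wrong for versions can be shown with a short sketch. It assumes purely numeric dotted versions such as "1.10.2"; the helper names parse_version and is_newer are hypothetical and are not taken from scylla-housekeeping. Suffixes like "rc1" are not handled and would raise ValueError here.

```python
def parse_version(text):
    """Split a dotted numeric version such as '1.10.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_newer(candidate, current):
    """Return True if candidate is a strictly newer version than current."""
    return parse_version(candidate) > parse_version(current)


if __name__ == "__main__":
    # Compared as strings, "1.10" sorts before "1.9" because '1' < '9'.
    print("1.10" > "1.9")
    # Compared component by component as integers, 1.10 is newer than 1.9.
    print(is_newer("1.10", "1.9"))
```

The code runs unchanged under python2 and python3, which matches the interpreter switch this series describes.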
Amnon Heiman
3170b477d0 ubuntu control.in: set python2 request dependency
scylla-housekeeping moved to python2; this changes the dependency to take
the python2 requests module.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-31 10:47:22 +03:00
Amnon Heiman
c8bcb5a8bf scylla.spec: Set the python dependencies for housekeeping
scylla-housekeeping moved to python2; this sets the python
dependencies under redhat.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-07-31 10:40:40 +03:00
Avi Kivity
a6ec4aa547 cql3: schema_altering_statement: remove prepare()
In preparation for splitting raw and prepared variants of subclasses
of schema_altering_statement, remove schema_altering_statement::prepare().
All subclasses already implement it themselves (by creating a new instance).
2016-07-30 23:29:32 +03:00
Avi Kivity
0687dc3689 cql3: add fake create_table_statement::prepare()
In preparation for the removal of schema_altering_statement::prepare(),
add a fake create_table_statement::prepare().  create_table_statement
has already been split to raw and prepared variants, so this prepare()
will never be called, but it is required because schema_altering_statement
is both a cql_statement and a prepared_statement.  This confusion will
be fixed later on.
2016-07-30 23:26:28 +03:00
Avi Kivity
713cfc3182 cql3: copy drop_type_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:14:49 +03:00
Avi Kivity
6631cd3277 cql3: copy drop_table_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:14:35 +03:00
Avi Kivity
382235fca4 cql3: copy drop_keyspace_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:14:18 +03:00
Avi Kivity
6a099f2690 cql3: copy create_type_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:14:03 +03:00
Avi Kivity
2d961334d8 cql3: copy create_keyspace_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:13:45 +03:00
Avi Kivity
35ef82e78d cql3: copy create_index_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:12:55 +03:00
Avi Kivity
23eef0a610 cql3: copy alter_type_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:12:24 +03:00
Avi Kivity
8b50f75958 cql3: copy alter_table_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 23:10:43 +03:00
Avi Kivity
81b75dfa47 cql3: copy alter_keyspace_statement when preparing it
To prepare for the separation of prepared and raw schema_altering_statements,
avoid the reliance on this class implementing both the raw and prepared
variants and the use of shared_from_this().
2016-07-30 22:45:08 +03:00
Avi Kivity
b881945d45 estimated_histograms: fix indentation, bracing
2016-07-30 20:13:16 +03:00
Avi Kivity
75ee8fc2a7 size_estimates_recorder: adjust indentation
2016-07-30 20:10:12 +03:00
Piotr Jastrzebski
ca9c29e296 Cache information about partition being wide
Once we encounter a wide partition, store information
about this in the cache entry and don't try to read it all
and cache it the next time it's requested.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
[Paweł: rebased, moved large partition reading logic to
cache_entry::read_wide()]
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-29 18:39:22 +01:00
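The behaviour described in ca9c29e296 can be sketched as a toy cache that remembers which partitions are too wide to cache. Everything here is an assumption made for illustration: the class PartitionCache, the max_rows threshold, and the dict used as backing storage are hypothetical and do not reflect the actual row_cache or cache_entry::read_wide() code.

```python
class PartitionCache:
    """Toy cache that records which partitions are too wide to be cached."""

    _WIDE = object()  # marker entry: partition is known to be too wide

    def __init__(self, storage, max_rows):
        self._storage = storage    # hypothetical backing store: key -> rows
        self._max_rows = max_rows  # hypothetical cacheability threshold
        self._entries = {}
        self.full_reads = 0        # attempts to read and cache a whole partition

    def read(self, key):
        entry = self._entries.get(key)
        if entry is self._WIDE:
            # Known to be wide: stream from storage, skip the caching attempt.
            return iter(self._storage[key])
        if entry is not None:
            return iter(entry)  # cache hit
        self.full_reads += 1
        rows = self._storage[key]
        if len(rows) > self._max_rows:
            self._entries[key] = self._WIDE  # remember that it is too wide
        else:
            self._entries[key] = list(rows)
        return iter(rows)
```

After the first read of a wide partition, later reads of the same key go straight to the streaming path, which corresponds to the saved IO the merge message mentions.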
Paweł Dziepak
42f566433e tests/row_cache_test: use BOOST_REQUIRE_EQUAL() instead of raw assert()
In case of failure BOOST_REQUIRE_EQUAL() is nicer and prints the actual
values that were supposed to be equal, but aren't.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-29 17:19:18 +01:00