Commit Graph

11580 Commits

Author SHA1 Message Date
Tomasz Grabiec
cd295e9926 sstables: Avoid moving an sstable
In preparation for adding non-movable members.
2017-03-28 18:10:39 +02:00
Tomasz Grabiec
5edb427873 sstables: Remove private constructor
To reduce duplication.
2017-03-28 18:10:39 +02:00
Tomasz Grabiec
705bd6da1a sstables: Remove unused method 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
a7301a702f tests: Add missing blocking on fast_forward_to() 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
5fe14735e8 tests: dht: Test ring_position_comparator 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
ff6cca6e9e tests: Add utility for checking total orders 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
d4b6e430ed dht: Introduce ring_position_view 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
55a7cceef5 dht: Move comparison logic from ring_position::tri_compare() to ring_position_comparator
It will soon define common ordering for many objects, not just
ring_position.
2017-03-28 18:10:39 +02:00
Tomasz Grabiec
d5e704ca1e sstables: Make key_view constructor from bytes_view explicit 2017-03-28 18:10:39 +02:00
Tomasz Grabiec
65a8920b25 dht: Make min/max tokens capturable by reference
So that they can be later used in views.
2017-03-28 18:10:39 +02:00
Tomasz Grabiec
f6d2a07422 config: Warn on use of [[deprecated]] instead of failing 2017-03-28 18:10:39 +02:00
Calle Wilund
b12b65db92 commitlog/replayer: Bugfix: minimum rp broken, and cl reader offset too
The previous fix removed the additional insertion of "min rp" per source
shard based on whether we had processed existing CF:s or not (i.e. if
a CF does not exist as sstable at all, we must tag it as zero-rp, and
make whole shard for it start at same zero.

This is bad in itself, because it can cause data loss. It does not cause
crashing however. But it did uncover another, old old lingering bug,
namely the commitlog reader initiating its stream wrongly when reading
from an actual offset (i.e. not processing the whole file).
We opened the file stream from the file offset, then tried
to read the file header and magic number from there -> boom, error.

Also, rp-to-file mapping was potentially suboptimal due to using
bucket iterator instead of actual range.

I.e. three fixes:
* Reinstate min position guarding for unencoutered CF:s
* Fix stream creating in CL reader
* Fix segment map iterator use.

v2:
* Fix typo
Message-Id: <1490611637-12220-1-git-send-email-calle@scylladb.com>
2017-03-28 10:32:28 +02:00
Duarte Nunes
53014bd762 mutation_source_test: Ensure unique collection elements
Duplicate elements are illegal in collections, so we ensure they only
contain unique ones.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170327161149.8938-4-duarte@scylladb.com>
2017-03-27 18:44:11 +02:00
Duarte Nunes
94d568924d mutation_source_test: Sort collection elements
Ref #1607

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170327161149.8938-3-duarte@scylladb.com>
2017-03-27 18:43:58 +02:00
Duarte Nunes
4963902922 mutation_source_test: Remove extra randomness source
This patch ensures we generate UUIDs using the same randomness source
as all the other values we randomly generator, so that we can get a
deterministic run from the seeds we print.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170327161149.8938-2-duarte@scylladb.com>
2017-03-27 18:43:44 +02:00
Takuya ASADA
b84828b487 dist/common/scripts/scylla_fstrim: don't abort the program when a disk doesn't support TRIM
Do not abort the program until run fstrim on all directories.

Fixes #2220

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1490595397-19130-1-git-send-email-syuu@scylladb.com>
2017-03-27 11:43:19 +03:00
Avi Kivity
27c42359bc Merge seastar upstream
* seastar 6b21197...2ebe842 (6):
  > Merge "Various improvements to execution stages" from Paweł
  > app-template: allow apps to specify a name for help message
  > bool_class: avoid initializing object of incomplete type
  > app-template: make sure we can still get help with required options
  > prometheus: Http handler that returns prometheus 0.4 protobuf or text format
  > Update DPDK to 17.02

Includes patch from Pawel to adjust to updated execution_stage interface.
2017-03-26 10:50:21 +03:00
Takuya ASADA
e7697c37b2 dist/common/scripts/scylla_setup: warn twice before constructing RAID volume
Since RAID construction has possibility to destroy user data, warn twice before
executing scylla_raid_setup.

Fixes #1346

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1487891797-13109-1-git-send-email-syuu@scylladb.com>
2017-03-26 10:36:23 +03:00
Glauber Costa
f7f187a7f3 docker: do not touch yaml during startup
Users sometimes need to run their own yaml configuration files, and it
is currently annoying to deploy modified files on docker.

One possible solution is to bind mount the file into the docker
container using the -v switch, just like we already do for for the data
volume.

The problem with the aforementioned approach is that we have to change
the yaml file to insert the addresses, and that will change the file in
the host (or fail to happen, if we bind mount it read-only).

The solution I am proposing is to avoid touching the yaml file inside
the container altogether. Instead, we can deploy the address-related
arguments that we currently write to the yaml file as Scylla options.

Fixes #2113

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <1490195141-19940-1-git-send-email-glauber@scylladb.com>
2017-03-23 16:52:40 +02:00
Pekka Enberg
c5b1908e03 dist/docker: Use stdout as logging output
If early startup fails in docker-entrypoint.py, the container does not
start. It's therefore not very helpful to log to a file _within_ the
container...

Message-Id: <1490275943-23590-1-git-send-email-penberg@scylladb.com>
2017-03-23 16:48:34 +02:00
Calle Wilund
c3a510a08d commitlog_replayer: Do proper const-loopup of min positions for shards
Fixes #2173

Per-shard min positions can be unset if we never collected any
sstable/truncation info for it, yet replay segments of that id.

Wrap the lookups to handle "missing data -> default", which should have been
there in the first place.

Message-Id: <1490185101-12482-1-git-send-email-calle@scylladb.com>
2017-03-22 17:57:09 +02:00
Amnon Heiman
064f5e1b63 row_cache: switch to the metrics layer registration
This patch moves the row_cache metrics registration from collectd to the
metric layer.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20170321143812.785-3-amnon@scylladb.com>
2017-03-21 16:42:58 +02:00
Amnon Heiman
a6a13865bf API: remove unneeded refrences to collectd
This patch removes left over references to the collectd from the API.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20170321143812.785-2-amnon@scylladb.com>
2017-03-21 16:42:57 +02:00
Vlad Zolotarov
79978c156e transport::server: don't report a Tracing session ID unless requested
Don't report a Tracing session ID unless the current query had a Tracing bit in its
flags.

Although the current master's behaviour is legal it's suboptimal and some Clients are sensitive to that.
Let's fix that.

Fixes #2179

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1490063752-8915-1-git-send-email-vladz@scylladb.com>
2017-03-21 13:57:08 +00:00
Vlad Zolotarov
9dd5b5762d dist: install seastar/scripts/perftune.py together with posix_net_conf.sh
posix_net_conf.sh is currently a wrapper for perftune.py script and
perftune.py has to be at the same directory as posix_net_conf.sh.


Fixes #2176.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1490020882-32361-1-git-send-email-vladz@scylladb.com>
2017-03-21 12:31:50 +02:00
Amnon Heiman
7572addfbf column_family: metrics should be register once
column_family constructor uses delegation, as such, only the actual
constructor implementation should contain a call to register the
metrics.
Current implementation ends up with re registration of the metrics.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20170320140817.22214-1-amnon@scylladb.com>
2017-03-21 11:20:29 +02:00
Pekka Enberg
85a127bc78 dist/docker: Expose Prometheus port by default
This patch exposes Scylla's Prometheus port by default. You can now use
the Scylla Monitoring project with the Docker image:

  https://github.com/scylladb/scylla-grafana-monitoring

To configure the IP addresses, use the 'docker inspect' command to
determine Scylla's IP address (assuming your running container is called
'some-scylla'):

  docker inspect --format='{{ .NetworkSettings.IPAddress }}' some-scylla

and then use that IP address in the prometheus/scylla_servers.yml
configuration file.

Fixes #1827

Message-Id: <1490008357-19627-1-git-send-email-penberg@scylladb.com>
2017-03-20 15:29:52 +02:00
Amos Kong
468df7dd5f scylla_setup: match '-p' option of lsblk with strict pattern
On Ubuntu 14.04, the lsblk doesn't have '-p' option, but
`scylla_setup` try to get block list by `lsblk -pnr` and
trigger error.

Current simple pattern will match all help content, it might
match wrong options.
  scylla-test@amos-ubuntu-1404:~$ lsblk --help | grep -e -p
   -m, --perms          output info about permissions
   -P, --pairs          use key="value" output format

Let's use strict pattern to only match option at the head. Example:
  scylla-test@amos-ubuntu-1404:~$ lsblk --help | grep -e '^\s*-D'
   -D, --discard        print discard capabilities

Signed-off-by: Amos Kong <amos@scylladb.com>
Message-Id: <4f0f318353a43664e27da8a66855f5831457f061.1489712867.git.amos@scylladb.com>
2017-03-20 08:10:35 +02:00
Raphael S. Carvalho
7deeffc953 database: serialize sstable cleanup
We're cleaning up sstables in parallel. That means cleanup may need
almost twice the disk space used by all sstables being cleaned up,
if almost all sstables need cleanup and every one will discard an
insignificant portion of its whole data.
Given that cleanup is frequently issued when node is running out of
disk space, we should serialize cleanups in every shard to decrease
the disk space requirement.

Fixes #192.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170317022911.10306-1-raphaelsc@scylladb.com>
2017-03-19 12:33:03 +02:00
Tomasz Grabiec
bb0ce5d8fe Merge "Ensure base and view schema versions match" from Duarte
The mapping between a base table update and a view update is schema
dependent, so we need to ensure the view schema versions match the
base schema version. For example, we match base columns to view
columns by name, so we need to ensure the base and view schemas we're
using for writting are isolated with respect to a previous alter
table statement.

We thus need to match base schema versions with view schema versions,
and we need to so atomically to ensure that when one fiber sees a
schema, it also sees the complete set of corresponding view schemas.
This series ensures the schemas modified as a result of an alter
table statement are published atomically, under the schema lock. This
way, all the schemas referenced by the database are consistent with
each other when they are observed by other fibers.

Finally, we upgrade the mutation schema before generating the view
updates, to ensure it matches the most recent view schemas the base
replica knows about, registered in the database.

The db::view::view class was replaced by a set of non-member
functions, with its state, which used to reflect only the most recent
schema version, being moved to a new view_info class.
2017-03-17 12:40:00 +01:00
Duarte Nunes
b27da688f9 mutation: Remove dead get_cell() function
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170316234843.23130-1-duarte@scylladb.com>
2017-03-17 11:18:23 +02:00
Pekka Enberg
91b9e0d914 Update scylla-ami submodule
* dist/ami/files/scylla-ami eedd12f...407e8f3 (1):
  > scylla_create_devices: check block device is exists

Fixes #2171
2017-03-17 11:13:07 +02:00
Tomasz Grabiec
3609665b19 lsa: Fix debug-mode compilation error
By moving definitions of setters out of #ifdef
2017-03-16 18:23:05 +01:00
Tomasz Grabiec
88e7b3ff79 lsa: Ensure can_allocate_more_memory() always leaves a gap above seastar's min_free_memory()
One of the goals of can_allocate_more_memory() is to prevent depleting
seastar's free memory close to its minimum, leaving a head room above
that minimum so that standard allocations will not cause reclamation
immediately. Currently the function doesn't take into accoutn actual
threshold used by the seastar allocator, so there could be no gap or
even could go below the minimum.

Fix that by ensuring there's always a gap above min_free_memory().

min_gap was reduced to 1 MiB so that low memory setups are not
impacted significantly by the change.
Message-Id: <1489667863-15099-1-git-send-email-tgrabiec@scylladb.com>
2017-03-16 12:42:50 +00:00
Tomasz Grabiec
17ede24a77 Update seastar submodule
* seastar 4d25b85...6b21197 (3):
  > core: memory: Expose control of the free memory low water mark
  > scripts: add perftune.py
  > tutorial: make network examples work on multi-core
2017-03-16 13:32:45 +01:00
Pekka Enberg
3afd7f39b5 cql3: Wire up functions for floating-point types
Fixes #2168
Message-Id: <1489661748-13924-1-git-send-email-penberg@scylladb.com>
2017-03-16 11:04:59 +00:00
Avi Kivity
434a4fee28 Merge "tests: Use allocating_section in lsa_async_eviction_test" from Tomasz
"The test allocates objects in batches (allocation is always under a reclaim
lock) of ~3MiB and assumes that it will always succeed because if we cross the
low water mark for free memory (20MiB) in seastar, reclamation will be
performed between the batches, asynchronously.

Unfortunately that's prevented by can_allocate_more_memory(), which fails
segment allocation when we're below the low water mark. LSA currently doesn't
allow allocating below the low water mark.

The solution which is employed across the code base is to use allocating_section,
so use it here as well.

Exposed by recent consistent failures on branch-1.7."

* 'tgrabiec/fix-lsa-async-eviction-test' of github.com:cloudius-systems/seastar-dev:
  tests: lsa_async_eviction_test: Allocate objects under allocating section
  lsa: Allow adjusting reserves in allocating_section
2017-03-16 12:44:14 +02:00
Tomasz Grabiec
cefb6b604a tests: lsa_async_eviction_test: Allocate objects under allocating section 2017-03-16 10:21:10 +01:00
Tomasz Grabiec
4ab8b255da lsa: Allow adjusting reserves in allocating_section 2017-03-16 10:21:10 +01:00
Raphael S. Carvalho
6b6bb38f38 compaction_manager: stop manager after storage io error
Manager will stop itself if a compaction fails due to storage io
error, which unconditionally results in stop of transportation
services.

Fixes #2147.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170316054538.23423-1-raphaelsc@scylladb.com>
2017-03-16 10:37:47 +02:00
Duarte Nunes
876a514743 database: Upgrade mutation to current schema to push view updates
This patch ensures we upgrade the mutation to the current schema when
generating and pushing view updates, so that the it matches the most
up to date views.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 18:15:27 +01:00
Duarte Nunes
be12a2bf0a db/schema_tables: Atomically publish base and view changes
This patch ensures that the schema merging atomically publishes
schema changes. In particular, it ensures that when a base schema
and a subset of its views are modified together (i.e., upon an alter
table or alter type statement), then they are published together as
well, without any deferring in-between.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 16:35:07 +01:00
Duarte Nunes
e215f25b11 migration_manager: Atomically migrate table and views
This patch changes the migration path for table updates such that the
base table mutations are sent and applied atomically with the view
schema mutations.

This ensures that after schema merging, we have a consistent mapping
of base table versions to view table versions, which will be used in
later patches.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 16:03:56 +01:00
Duarte Nunes
bfb8a3c172 materialized views: Replace db::view::view class
The write path uses a base schema at a particular version, and we
want it to use the materialized views at the corresponding version.

To achieve this, we need to map the state currently in db::view::view
to a particular schema version, which this patch does by introducing
the view_info class to hold the state previously in db::view::view,
and by having a view schema directly point to it.

The changes in the patch are thus:

1) Introduce view_info to hold the extra view state;
2) Point to the view_info from the schema;
3) Make the functions in the now stateless db::view::view non-member;
4) Remove the db::view::view class.

All changes are structural and don't affect current behavior.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 15:50:05 +01:00
Duarte Nunes
a64c47f315 schema: Move raw_view_info outside of raw_schema
In preparation of an upcoming patch, where the schema
won't directly store the raw_view_info.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 15:38:31 +01:00
Duarte Nunes
4b209be8b8 view_info: Rename to raw_view_info
In preparation for upcoming patches, which will deal with
moving the state in db::view::view to view_info.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 15:38:31 +01:00
Paweł Dziepak
dc99197318 Merge "Correctly handle tombstoned collections" from Duarte
"The current implementations of collection_type_impl::is_empty() and
collection_type_impl::difference() don't handle tombstoned collection
mutations correctly. In particular:

- is_empty() considers a collection mutation with a tombstone (and no
  entries) as empty;
- difference() doesn't do set difference between the cells tombstones,
  and always returns the highests.

Fixes #2152"

* 'collection-diff/v4' of github.com:duarten/scylla:
  mutation_test: Add more test cases for difference()
  mutation_source_test: Randomly generate collection cells
  collection_type_impl: Use set difference for tombstones
  collection_type_impl: A mutation with a tombstone is not empty
2017-03-15 13:39:55 +00:00
Duarte Nunes
143136647a mutation_test: Add more test cases for difference()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 14:34:01 +01:00
Duarte Nunes
005e4741e3 mutation_source_test: Randomly generate collection cells
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 14:34:01 +01:00
Duarte Nunes
61741a69b6 collection_type_impl: Use set difference for tombstones
This patch fixes collection_type_impl::difference() so it does set
difference for tombstones instead of just returning the larger
one, as difference() is supposed to return only the information in
mutation A that supersedes that in B, given difference(A, B).

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-03-15 14:34:01 +01:00