Commit Graph

6460 Commits

Author SHA1 Message Date
Avi Kivity
58a76ae04c Merge seastar upstream
* seastar aa18f5c...84cf099 (1):
  > rpc: do not wait for data to be sent
2015-09-14 19:57:07 +03:00
Pekka Enberg
c4323c306f configure.py: Fix CXXFLAGS when extra flags are specified
We need to specify all flags, not just the extra ones.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 14:02:16 +03:00
Avi Kivity
1bb9fbc85a Merge "Version numbering" from Pekka
"This series implements version numbering for the "scylla" executable as
well as the release RPM. Fixes #306."
2015-09-14 12:49:35 +03:00
Pekka Enberg
9790cafe49 dist/redhat: Use generated version number in spec file
Fix the hard-coded version number in the RPM spec file by using the
SCYLLA-VERSION-GEN script.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:35:32 +03:00
Pekka Enberg
eab6094124 main: Print version number at startup
Now that we have a version number, let's tell the world about it!

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:35:32 +03:00
Pekka Enberg
5ef77a8a56 build: Add version number generation
This adds version number generation in the build system. Version numbers
follow the format:

  <version>-<release>

where release consists of:

  <date>-<git-hash>

The version and release numbers are generated by the SCYLLA-VERSION-GEN
script and they are stored in SCYLLA-VERSION-FILE and
SCYLLA-RELEASE-FILE files so that other parts of the build system can
easily pick them up.

For builds that happen from release tarballs, for example,
SCYLLA-VERSION-GEN looks for a "version" file in the tree and just uses
that.

Basically, we're doing pretty much the same as Git is doing in its build
system.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-14 11:23:31 +03:00
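The version format described in the commit above can be illustrated with a minimal Python sketch. It is not the SCYLLA-VERSION-GEN script itself. The function names, the YYYYMMDD date layout, and the example version "0.8" are assumptions made for illustration; only the <version>-<release> and <date>-<git-hash> shapes and the "version" file fallback come from the commit message.

```python
import datetime
from pathlib import Path


def make_release(date: datetime.date, git_hash: str) -> str:
    """Build '<date>-<git-hash>'. The YYYYMMDD layout is an assumption."""
    return f"{date:%Y%m%d}-{git_hash}"


def full_version(version: str, release: str) -> str:
    """Build '<version>-<release>' as described in the commit message."""
    return f"{version}-{release}"


def read_version(tree: Path, default: str) -> str:
    """Prefer a 'version' file in the tree (release tarballs), else a default.

    The default argument is hypothetical; the commit message does not say
    where the version comes from when no 'version' file exists.
    """
    version_file = tree / "version"
    if version_file.is_file():
        return version_file.read_text().strip()
    return default
```

For example, full_version("0.8", make_release(datetime.date(2015, 9, 14), "5ef77a8")) yields "0.8-20150914-5ef77a8" under these assumptions.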
Avi Kivity
7e1d03d098 db: delete ignored sstables
If an sstable is irrelevant for a shard, delete it.  The deletion will
only complete when all shards agree (either ignore the sstable or
delete it after compaction).
2015-09-14 10:14:00 +02:00
Amnon Heiman
497e403387 API: Workaround for load_map
The get_load_map method should return a map between node addresses and
their load. In origin the implementation is based on the load
broadcaster that we currently do not have.

This workaround returns a map with a single entry of the current node
address and its load.
2015-09-13 12:50:07 +03:00
Avi Kivity
cab2148141 Merge "partial sstable handling" from Raphael
closes issue #75.
2015-09-13 12:03:50 +03:00
Raphael S. Carvalho
e65c91f324 db: avoid possible underflow on stats pending_compactions
In the event of a compaction failure, run_compaction would be called
more than one time for a request, which could result in an
underflow in the stats pending_compactions.
Let's fix that by only decreasing it if compaction succeeded.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-13 11:59:34 +03:00
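The fix described in the commit above (decrease the counter only when compaction succeeds) can be sketched as follows. This is a Python illustration, not the project's code: CompactionStats, the compact callable, and the boolean return value are hypothetical names and choices.

```python
class CompactionStats:
    """Hypothetical stand-in for the stats object holding pending_compactions."""

    def __init__(self, pending_compactions: int = 0):
        self.pending_compactions = pending_compactions


def run_compaction(stats: CompactionStats, compact) -> bool:
    """Run one compaction attempt for a request.

    The counter is decreased only on success, so retrying a failed
    request any number of times cannot drive it below zero.
    """
    try:
        compact()
    except Exception:
        return False  # failed attempt: leave pending_compactions untouched
    stats.pending_compactions -= 1
    return True
```

With one pending request, two failed attempts followed by a successful one leave the counter at 0 rather than at -2.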
Gleb Natapov
17e54d0604 add logger for consistency level calculation
2015-09-13 11:59:17 +03:00
Avi Kivity
440cf4c94e Merge seastar upstream
* seastar 49989ca...aa18f5c (11):
  > stream: workaround native network stack drops
  > build: fix sanitize=vptr auto-disable
  > tests: test thread scheduling groups
  > thread: scheduling groups
  > thread: introduce thread_attributes
  > reactor: make later() more fair
  > reactor: introduce force_poll()
  > core: move later() out of line
  > test futurize
  > fix futurize<void> for the case in which Func returns a future
  > futures_test: silence exceptional future ignored messages

Fixes #187.
2015-09-13 11:43:08 +03:00
Raphael S. Carvalho
cdb31a0b4a sstable: kill temporary_filename
We no longer need this functionality after TemporaryTOC.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-13 03:32:21 -03:00
Raphael S. Carvalho
538611ab93 sstable: delete sstable generation with temporary toc file
When populating a column family, we will now delete all components
of a sstable with a temporary toc file. A sstable with a temporary
TOC file means that it was partially written, and can be safely
deleted because the respective data is either saved in the commit
log, or in the compacted sstables in case of the partial sstable
being result of a compaction.
Deletion procedure is guarded against power failure by only deleting
the temporary TOC file after all other components were deleted.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-13 03:17:58 -03:00
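The deletion order in the commit above (every other component first, the temporary TOC file last) can be sketched in Python as below. The function names and the example file names are hypothetical, and durability details such as syncing the directory are outside this sketch.

```python
from pathlib import Path
from typing import Iterable, List


def deletion_order(components: Iterable[Path], temporary_toc: Path) -> List[Path]:
    """Return the components with the temporary TOC moved to the end.

    While the temporary TOC still exists, an interrupted deletion is
    still recognizable as a partial sstable on the next boot.
    """
    others = [c for c in components if c != temporary_toc]
    return others + [temporary_toc]


def delete_partial_sstable(components: Iterable[Path], temporary_toc: Path) -> None:
    """Delete all components, removing the temporary TOC only at the end."""
    for path in deletion_order(components, temporary_toc):
        path.unlink(missing_ok=True)  # requires Python 3.8+
```

If the process dies midway, the remaining files still include the temporary TOC, so the next startup can repeat the deletion.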
Raphael S. Carvalho
7677202700 db: handle temporary TOC file when populating cf
When populating a cf, we should also check for a sstable with
temporary TOC file, and act accordingly. For the time being,
we will only refuse to boot. Subsequent work is to gather all
files of a sstable with a temporary TOC file and delete them.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-13 03:03:30 -03:00
Raphael S. Carvalho
1bd3a2d4bc sstable: create temporary TOC at an early stage
Currently, we create a temporary TOC file after we are done writing
all the other components. However, we want to create a temporary
TOC before starting to write any other component.
So if there is a missing TOC, there is likely to be a corruption,
so we should refuse to boot and provide the sysadmin with a
detailed message. If there is a temporary TOC, it means that there
was a sudden shutdown while the sstable was being written.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-13 03:02:17 -03:00
Raphael S. Carvalho
c729ea36e1 commitlog: guard commit log replay against reordering
After killing scylla in the middle of a write, the next scylla
instance failed to finish commit log replay, showing the following
error message:

scylla: core/future.hh:448: void promise<T>::set_value(A&& ...)
[with A = {}; T = {}]: Assertion `_state' failed.

After a long debug session, I figured out that check_valid_rp() was
triggering the exception replay_position_reordered_exception, which
means replay position reordering.

Looking at 8b9a63a3c6, I noticed that database::apply is guarded
against reordering, but commitlog replay code is not.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-09-12 06:17:14 -03:00
Tomasz Grabiec
9cd769210d Merge tag 'asias/gossip/aws_boot_mark_live_fix/v1' from seastar-dev.git
This series is supposed to fix the abort on aws cluster bootup.
2015-09-11 10:19:22 +02:00
Asias He
c44afca3d8 gossip: Make is_dead_state take const reference
2015-09-11 15:43:27 +08:00
Asias He
1f0542931e gossip: Fix handle_major_state_change
Modify the state in the map of endpoint_state_map instead of the local
variable.
2015-09-11 15:43:27 +08:00
Asias He
a31d3aa7ee gossip: Pass reference in mark_alive and real_mark_alive
We need to modify the state.
2015-09-11 15:43:27 +08:00
Avi Kivity
3d543132a9 build: configure seastar compiler with right -march flag
We set -march=nehalem to get a cross-microarch binary that supports the
crc32 instruction, but we don't configure seastar with that flag; so seastar
instead inherits dpdk's default, which is -march=native.  The resulting
binary may not run on hosts other than the one it was compiled on.

Fix by passing the flag to seastar and inheriting it from there.

Fixes #338.
2015-09-11 07:32:09 +03:00
Avi Kivity
9bae5d38b2 Merge seastar upstream
* seastar d693a64...49989ca (2):
  > memory: fix freeing of very small (<8 byte) objects with sized deallocation
  > core: take into account the exceptions thrown directly from _next() as well

Fixes #341.
2015-09-10 20:03:04 +03:00
Paweł Dziepak
941cb1d40e cql3: update batch_statement::parsed::prepare_keyspace()
Since the changes in c0547160f4
"cql3: make ref to client_state const where possible" prepare_keyspace()
wasn't overriding what it was supposed to override.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-10 19:02:05 +03:00
Avi Kivity
234f23621a Merge "API: Cache service period and bloom filter" from Amnon
"This series completes the changes for nodetool info.

It adds the following: in the cache service API, the save period will be reported as 0,
to indicate never.

In column family, as a workaround for the bloom filter memory consumption
statistic, the API would return 0."
2015-09-10 18:42:26 +03:00
Avi Kivity
16f76dc525 Merge "fixes on AMI support pt.3" from Takuya
"Updated patchset for fixes on AMI support.
It contains all uncommitted patches including pt.2, plus fix for #340"
2015-09-10 18:39:50 +03:00
Amnon Heiman
9d099b3a8d API: Workaround for bloom filter memory calculation
The bloom filter memory calculation is missing. As a workaround until
it is completed, the memory calculation will return 0.

It is needed by the nodetool info command.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-09-10 18:22:22 +03:00
Amnon Heiman
12a92939bf API: cache service save period time to return 0
We do not save the cache, so the getter for the save period in seconds should
return 0 to indicate never, as is done in origin.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-09-10 18:21:40 +03:00
Paweł Dziepak
977f2ff0a7 transport: do not call get_query_state() on other cpus
get_query_state() creates a new query state if it cannot find an existing
one. Because of that, it cannot be safely called from multiple cpus.

The solution is to take advantage of the fact that
query_processor::prepare() only needs a const reference to the client_state
object.

Fixes #329.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-10 18:02:45 +03:00
Avi Kivity
13f86823f9 Merge "Enable persistence in tests using cql_test_env" from Tomasz
"The motivation is to exercise more code during tests, and possibly also avoid
some special casing just for tests in the future. Sstables will be persisted
in a unique temporary directory which is auto-removed when environment is
torn down."
2015-09-10 17:38:26 +03:00
Takuya ASADA
be2b1cbbb1 dist: support Instance Store as RAID disks
This patch enables using Instance Store disks for RAID, usable on c3/m3.

Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-10 14:29:22 +00:00
Takuya ASADA
77ea7a4203 dist: do not try to restart when scylla-server fails to startup
It tries to launch again and again, which may be good for a product but is bad for development.

Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-10 14:29:15 +00:00
Takuya ASADA
a0157db54a dist: configure large coredump on AMI
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-10 14:28:59 +00:00
Takuya ASADA
763e8d3f59 dist: install debuginfo on AMI
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-10 14:28:53 +00:00
Takuya ASADA
7c48bad0ab dist: handle userdata parameters and connect reflector to receive seed list on AMI instance
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-10 14:28:44 +00:00
Takuya ASADA
b5d47c2b00 dist: we don't need copy of sysconfig file on ami directory anymore, since dir settings in yaml
Signed-off-by: Takuya ASADA <syuu@cloudius-systems.com>
2015-09-10 14:28:37 +00:00
Paweł Dziepak
c0547160f4 cql3: make ref to client_state const where possible
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-10 17:24:20 +03:00
Tomasz Grabiec
53caf5ecca lsa: Fix segment heap corruption
The segment heap is a max-heap, with sparser segments on the top. When
we free from a segment its occupancy is decreased, but its position in
the heap increases.

This bug caused us to pick segments for compaction in the wrong
order. In extreme cases this can lead to a livelock; in other cases it may
just increase compaction latency.
2015-09-10 17:20:04 +03:00
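The heap invariant in the commit above can be illustrated with a small Python max-heap keyed by free space. The commit message does not show the fix itself, so this is only a sketch of the invariant: freeing from a segment raises its key, and the segment can only move toward the top. The SegmentHeap class and all of its method names are hypothetical.

```python
class SegmentHeap:
    """Max-heap of segments keyed by free space (sparser segments on top)."""

    def __init__(self):
        self._heap = []  # segment ids in heap order
        self._free = {}  # segment id -> free bytes
        self._pos = {}   # segment id -> index in _heap

    def push(self, seg, free_bytes):
        self._free[seg] = free_bytes
        self._heap.append(seg)
        self._pos[seg] = len(self._heap) - 1
        self._sift_up(len(self._heap) - 1)

    def on_free(self, seg, nbytes):
        # Freeing lowers occupancy, which raises the key in this max-heap,
        # so the segment must be sifted up, never down.
        self._free[seg] += nbytes
        self._sift_up(self._pos[seg])

    def top(self):
        """Return the sparsest segment (the next compaction candidate)."""
        return self._heap[0]

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self._free[self._heap[parent]] >= self._free[self._heap[i]]:
                break
            self._swap(i, parent)
            i = parent

    def _swap(self, i, j):
        h = self._heap
        h[i], h[j] = h[j], h[i]
        self._pos[h[i]] = i
        self._pos[h[j]] = j
```

Moving the segment in the wrong direction after a free would leave a sparser segment below a denser one, which is the kind of ordering error the commit message describes.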
Avi Kivity
53239204f4 Merge "This series adds the stream_manager API." from Amnon
Conflicts:
	api/api.cc
2015-09-10 15:53:58 +03:00
Avi Kivity
8721fdf095 Merge "API: Adding memory consumption method to column_family" from Amnon
2015-09-10 15:47:34 +03:00
Gleb Natapov
396570b002 storage_proxy: fix digest_request_resolve completion reporting
The digest resolver is broken in a way that prevents read completion from
being reported if data arrives after enough digests for cl were already
received. This happens because the code tried to save on state and
used _cl_responses as an indicator that completion was already reported,
but this is incorrect since there can be enough responses for cl, but no
data yet. Fix by introducing a special state to track completion reporting.

Fixes #331
2015-09-10 15:46:15 +03:00
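The idea in the commit above, a dedicated flag for "completion already reported" instead of inferring it from the response count, can be sketched in Python. The class and its members are hypothetical and do not mirror the storage_proxy code; only the failure scenario (data arriving after enough digests) comes from the commit message.

```python
class DigestReadResolver:
    """Hypothetical resolver that reports read completion exactly once."""

    def __init__(self, required_responses: int):
        self._required = required_responses
        self._responses = 0
        self._data_present = False
        self._completion_reported = False  # the dedicated state
        self.completions = 0               # how many times completion fired

    def on_digest(self):
        self._responses += 1
        self._maybe_report()

    def on_data(self):
        self._responses += 1
        self._data_present = True
        self._maybe_report()

    def _maybe_report(self):
        if self._completion_reported:
            return
        # Enough responses alone is not sufficient: data must have arrived.
        if self._responses >= self._required and self._data_present:
            self._completion_reported = True
            self.completions += 1
```

With a required count of 2, two digests followed by the data response report completion once, when the data arrives.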
Gleb Natapov
bfaa771b87 storage_proxy: measure latency for failed writes too
This was the case before 0149a22f69
2015-09-10 15:45:21 +03:00
Gleb Natapov
04d2bef55b give preference to local data during query
Until dynamic snitch is implemented this is better than nothing.

Fixes #322
2015-09-10 15:45:20 +03:00
Paweł Dziepak
6a0d4e3ade client_state: verify that keyspace exists
Fixes #323.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-10 13:58:48 +03:00
Amnon Heiman
a825977f85 API: return an error for wrong keyspace name
This patch addresses issue #155. It registers an exception handler
with the routes object that handles the no_such_keyspace exception.

The handler just throws a bad_parameter_exception with the error message
it got from the no_such_keyspace exception.

After this patch, a call with a keyspace that does not exist will return
a 400 result.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-09-10 12:53:21 +03:00
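The translation described in the commit above can be sketched in Python. Everything here is a stand-in: NoSuchKeyspace and BadParameter model the two exceptions named in the message, while call_route and keyspace_exception_handler are hypothetical and do not reflect the HTTP server's actual API.

```python
class NoSuchKeyspace(Exception):
    """Stand-in for the no_such_keyspace exception."""


class BadParameter(Exception):
    """Stand-in for bad_parameter_exception; maps to HTTP 400."""
    status = 400


def keyspace_exception_handler(exc):
    """Re-throw no_such_keyspace as a bad-parameter error with the same message."""
    if isinstance(exc, NoSuchKeyspace):
        raise BadParameter(str(exc)) from exc
    raise exc


def call_route(handler, exception_handler):
    """Hypothetical dispatcher: run a route and apply the registered handler."""
    try:
        return 200, handler()
    except Exception as exc:
        try:
            exception_handler(exc)
        except BadParameter as bad:
            return bad.status, str(bad)
        raise  # handler did not translate the error; propagate the original
```

A route that raises NoSuchKeyspace("no such keyspace: ks1") then produces (400, "no such keyspace: ks1") instead of an unhandled error.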
Avi Kivity
3ea086026b Merge seastar upstream
* seastar 4b70e9a...d693a64 (1):
  > Http server: Add an option to register general exception handler
2015-09-10 12:52:43 +03:00
Amnon Heiman
1238240eea API: Return 0 for key and counter cache's metrics
Until we support key and counter cache, it is reasonable to return 0
for their statistics and sizes.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-09-10 12:42:14 +03:00
Tomasz Grabiec
91e7dcfe10 row_cache: Don't count insertions and merges as hits and misses
Currently a cache update from a flushed memtable affects hits and
misses, which may be confusing. Let's reserve hits and misses for
reads. Cache updates will affect counters called "insertions" and
"merges".
2015-09-10 12:41:27 +03:00
Tomasz Grabiec
f64ac3a80e row_cache: Extract scanning reader construction
2015-09-10 12:41:27 +03:00
Tomasz Grabiec
447e59eaf9 row_cache: Expose a metric for the number of cached partitions
Fixes #193.
2015-09-10 12:41:12 +03:00