10439 Commits

Author SHA1 Message Date
Asias He
774d16306f gossip: Use lowres_clock for scheduled_gossip_task
The timer fires once per second, so a low-resolution clock is enough.
Message-Id: <1f21514e975afea6ac5c9dde18a881a41561da70.1475130948.git.asias@scylladb.com>
2016-09-29 10:03:14 +03:00
Piotr Jastrzebski
1948ec8061 Update README.md
Add --init to git submodules update.
It's needed for fmt.
Add libunwind-devel dependency to dnf install.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <4918237f91d985649c195035c02b2dd9e9a1ff68.1475087373.git.piotr@scylladb.com>
2016-09-29 10:02:34 +03:00
Gleb Natapov
32989d1e66 Merge seastar upstream
* seastar 2b55789...5b7252d (3):
  > Merge "rpc: serialize large messages into fragmented memory" from Gleb
  > Merge "Print backtrace on SIGSEGV and SIGABRT" from Tomasz
  > test_runner: avoid nested optionals

Includes patch from Gleb to adapt to seastar changes.
2016-09-28 17:34:16 +03:00
Pekka Enberg
9ea24c9d2b Merge "repair: less stream_plan and less streaming traffic" from Asias
"This series improves repair by

 1) using fewer streaming sessions

 2) reducing unnecessary streaming traffic

 3) fixing a hang during shutdown

 See commit log for "repair: Reduce stream_plan usage", "repair: Reduce
 unnecessary streaming traffic" and "streaming: Fail streaming sessions
 during shutdown" for details.

 Tested with repair_additional_test.py."
2016-09-28 09:54:15 +03:00
Gleb Natapov
c95df8f053 messaging_service: use correct value for listen_to_bc_address in a constructor used by tests
Also make sure not to listen on the exact same address twice in case
listen_address == broadcast_address. Scylla configuration code does not
allow such a thing to be configured, but it is better to be safe.

Message-Id: <20160927102316.GO32178@scylladb.com>
2016-09-27 11:27:23 +01:00
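As a rough illustration of the guard described in c95df8f053, the list of addresses to listen on could be computed as below. This is a Python sketch, not Scylla's C++ code: the function name `addresses_to_listen_on` and its parameters are hypothetical, and only the rule itself (also listen on broadcast_address when the option is set, but never on the same address twice) comes from the commit messages.

```python
def addresses_to_listen_on(listen_address, broadcast_address,
                           listen_on_broadcast_address):
    """Return the addresses to bind, without duplicates (hypothetical helper)."""
    addresses = [listen_address]
    # Add the broadcast address only when the option asks for it and it
    # differs from the address we are already listening on.
    if listen_on_broadcast_address and broadcast_address != listen_address:
        addresses.append(broadcast_address)
    return addresses
```

With equal addresses the result contains a single entry, so a second listener is never started for the same address.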
Pekka Enberg
e35166af10 Merge "gossip: Fix expire_time for gossip membership removal" from Asias
"We currently use steady_clock, which is not consistent across nodes in the cluster.
 Use system_clock for it.

 Fixes #1704"
2016-09-27 11:46:09 +03:00
Asias He
1292341d77 gossip: Improve the expire time logging
Print when the node will be removed from gossip membership, e.g.,

INFO  2016-09-27 08:54:49,262 [shard 0] gossip - Node 127.0.0.3 will be
removed from gossip at [2016-09-30 08:54:48]: (expire = 1475196888294489339,
now = 1474937689262295270, diff = 259199 seconds)
2016-09-27 16:42:35 +08:00
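The numbers in the sample log line of 1292341d77 can be checked by hand. The sketch below redoes the arithmetic in Python, assuming that `expire` and `now` are nanoseconds since the Unix epoch and that the log prints local time in a UTC+8 zone (both are assumptions inferred from the values, not stated in the commit).

```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000

expire_ns = 1475196888294489339  # "expire" from the log line
now_ns = 1474937689262295270     # "now" from the log line

# Whole seconds left until the node is removed from gossip membership.
diff_seconds = (expire_ns - now_ns) // NANOS_PER_SECOND

# Assumption: the log timestamp is local time in a UTC+8 zone.
local_zone = timezone(timedelta(hours=8))
expire_at = datetime.fromtimestamp(expire_ns // NANOS_PER_SECOND, tz=local_zone)

print(diff_seconds)                             # 259199
print(expire_at.strftime("%Y-%m-%d %H:%M:%S"))  # 2016-09-30 08:54:48
```

The difference is one second short of three days (259200 seconds), and the formatted expiry matches the timestamp printed in the log message.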
Asias He
f0d3084c8b gossip: Switch to use system_clock
The expire time which is used to decide when to remove a node from
gossip membership is gossiped around the cluster. We switched to steady
clock in the past. In order to have a consistent time_point in all the
nodes in the cluster, we have to use wall clock. Switch to use
system_clock for gossip.

Fixes #1704
2016-09-27 16:42:13 +08:00
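The reasoning in f0d3084c8b can be illustrated with Python's clocks, used here as stand-ins for the C++ ones: `time.time_ns()` plays the role of system_clock and `time.monotonic_ns()` the role of steady_clock. The helper names are hypothetical. A wall-clock expiry can be sent to another node and compared against that node's own wall clock, while a monotonic reading has an undefined reference point and is only meaningful inside the process that took it.

```python
import time

NANOS_PER_SECOND = 1_000_000_000

def make_expire_time_ns(ttl_seconds):
    """Expiry as nanoseconds since the Unix epoch (wall clock)."""
    return time.time_ns() + ttl_seconds * NANOS_PER_SECOND

def seconds_until(expire_ns):
    """Seconds left, computable on any node whose wall clock is in sync."""
    return (expire_ns - time.time_ns()) / NANOS_PER_SECOND

three_days = 3 * 24 * 3600
expire_ns = make_expire_time_ns(three_days)

# A monotonic reading is only useful for measuring intervals locally;
# gossiping it to another node would be meaningless.
local_only = time.monotonic_ns()
```

This only holds as far as the nodes' wall clocks are synchronized, which the sketch takes for granted.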
Avi Kivity
bfa9aa5d23 Merge "Installing the node_exporter" from Amnon
"The prometheus project and its subprojects do not have RPM/DEB packaging yet,
but they do have binaries for download.

This series adds an installation script that downloads and installs
node_exporter and runs it as a service. For OSes that use systemd, it has a
spec file ready that will be packaged with the system. For ubuntu, a service
file will be created when running the installer.

After this series, running node_exporter_install leaves a node_exporter
running as a service on the machine."
2016-09-27 11:00:00 +03:00
Asias He
802c25e67b repair: Switch to use make_streaming_reader in checksum calculation
In patch ac619820 (streaming: Switch to use make_streaming_reader), we
switched to use make_streaming_reader for streaming. In repair, the
checksum phase also uses a mutation reader. For the same reasons (no
pollution of the row cache, bounded new data after the reader is created),
switch repair checksum calculation to use make_streaming_reader too.

Fixes #382
Fixes #1682

Message-Id: <9e0ecda861bb0b6f690da5e2378b208159ffa41c.1474933195.git.asias@scylladb.com>
2016-09-27 10:58:31 +03:00
Tomasz Grabiec
c03568d687 Merge tag 'asias/read_data_from_sstable_in_streaming/v2' from seastar-dev.git
From Asias:

With this series, streaming and repair are improved:

    - streaming and repair will no longer pollute the row cache on the
      sender side. Currently, we risk evicting all the frequently-queried
      partitions from the cache when an operation like repair reads entire
      sstables and floods the row cache with swathes of cold data read
      from disk.

    - less data will be sent because the reader will only return data that
      existed at the point the reader was created, plus a bounded amount
      of writes which arrive later. This helps reduce the streaming time
      when new data is being inserted all the time while streaming is
      in progress, e.g., adding a new node while there is a heavy CQL write
      workload.

Fixes #382 and #1682
2016-09-26 11:30:12 +02:00
Asias He
ac6198208b streaming: Switch to use make_streaming_reader
Using make_streaming_reader for streaming on the sender side has
the following advantages:

- streaming and repair will no longer pollute the row cache on the
  sender side. Currently, we risk evicting all the frequently-queried
  partitions from the cache when an operation like repair reads entire
  sstables and floods the row cache with swathes of cold data read
  from disk.

- less data will be sent because the reader will only return data that
  existed at the point the reader was created, plus a bounded amount
  of writes which arrive later. This helps reduce the streaming time
  when new data is being inserted all the time while streaming is
  in progress, e.g., adding a new node while there is a heavy CQL write
  workload.

Fixes #382
Fixes #1682
2016-09-26 16:12:56 +08:00
Asias He
b505e34062 database: Introduce make_streaming_reader
make_streaming_reader returns a combined mutation reader that reads
mutations from sstables and the memtable. The memtable reader handles
memtable flushing automatically, so no special handling is needed here.

It will be used by streaming soon.
2016-09-26 16:02:48 +08:00
Asias He
e5a5a9ba15 repair: Rename sync_ranges to request_transfer_ranges
To reflect the fact that the function does not sync the ranges but adds
the ranges to request from a peer or transfer to a peer.
2016-09-26 16:00:07 +08:00
Takuya ASADA
d38aa6570f dist/common/scripts/scylla_setup: do not ask to select disks when there's no free disk
When there's no free disk, it asks to select disks from an empty list:

"Please select disks from following list:
type 'done' to finish selection. selected:"

We should avoid asking, and abort RAID setup instead.

Fixes #1673

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1474429218-28382-1-git-send-email-syuu@scylladb.com>
2016-09-26 08:53:54 +03:00
Gleb Natapov
26ae8e8365 implement listen_on_broadcast_address option
When using multiple physical network interfaces, set this to true to
listen on broadcast_address in addition to the listen_address, allowing
nodes to communicate in both interfaces.  Ignore this property if the
network configuration automatically routes between the public and
private networks such as EC2.

Message-Id: <20160921094810.GA28654@scylladb.com>
2016-09-26 08:49:54 +03:00
Asias He
f377a3b7ac streaming: Fail streaming sessions during shutdown
Fixes repair_additional_test.py:RepairAdditionalTest.repair_kill_3_test

The test does:

- Insert data on node1 only
- Insert data on node2 only
- Run repair on node1 and stop node1
  once "starting user-requested repair" is seen

The repair shutdown code may wait for the stream session to complete for
a very long time if node1 finishes sending data to node2 and is waiting
for node2 to send data to it, when node1 is stopped. The stream session
will not be closed in this case until stream session _keep_alive_timeout
(10 minutes) expires. Instead of waiting for the stream_session keep
alive timer to expire, we can fail all the stream sessions during
shutdown.

Before 1 - The bad case (repair shutdown will last for 10 minutes):

  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:23:56,617 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:23:56,618 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:23:58,625 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:23:58,626 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:23:58,626 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:23:58,626 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:23:58,626 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:23:58,626 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:23:58,669 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:23:58,669 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:25:56,624 [shard 0] stream_session - [Stream #bd34fea1-7fd4-11e6-8020-000000000001] The session 0x600021516c00 made no progress with peer 127.0.0.2

Before 2 - The good case:

  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:18:32,087 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:18:34,098 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:18:34,155 [shard 0] messaging_service - Retry verb=19 to 127.0.0.2:0, retry=10: rpc::closed_error (connection is closed)
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] COMPLETE_MESSAGE for 127.0.0.2 has failed: rpc::closed_error (connection is closed)
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Streaming error occurred
  INFO  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Session with 127.0.0.2 is complete, state=FAILED
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] bytes_sent = 0, bytes_received = 245000
  WARN  2016-09-21 16:18:34,155 [shard 0] stream_session - [Stream #fbc668d1-7fd3-11e6-bc54-000000000001] Stream failed, peers={127.0.0.2}
  WARN  2016-09-21 16:18:34,155 [shard 0] repair - repair's stream failed: streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:18:34,155 [shard 0] repair - repair 1 failed - streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:18:34,155 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:18:34,155 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:18:34,155 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:18:34,156 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:18:34,156 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:18:34,156 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:18:34,199 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:18:34,199 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:18:34,199 [shard 0] repair - Completed shutdown of repair
  INFO  2016-09-21 16:18:34,199 [shard 0] compaction_manager - Asked to stop
  INFO  2016-09-21 16:18:34,199 [shard 1] compaction_manager - Asked to stop

After:

  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Executing streaming plan for repair-in
  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Starting streaming to 127.0.0.2
  INFO  2016-09-21 16:06:21,684 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Beginning stream session with 127.0.0.2
  INFO  2016-09-21 16:06:21,685 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Prepare completed with 127.0.0.2. Receiving 1, sending 0
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Stop transport: stop_gossiping done
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Thrift server stopped
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - CQL server stopped
  INFO  2016-09-21 16:06:23,687 [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - messaging_service stopped
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: shutdown messaging_service done
  INFO  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Session with 127.0.0.2 is complete, state=FAILED
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - stream_manager stopped
  INFO  2016-09-21 16:06:23,688 [shard 1] storage_service - stream_manager stopped
  INFO  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] bytes_sent = 0, bytes_received = 25725
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: shutdown stream_manager done
  WARN  2016-09-21 16:06:23,688 [shard 0] stream_session - [Stream #48661c51-7fd2-11e6-8ba7-000000000001] Stream failed, peers={127.0.0.2}
  WARN  2016-09-21 16:06:23,688 [shard 0] repair - repair's stream failed: streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:06:23,688 [shard 0] repair - repair 1 failed - streaming::stream_exception (Stream failed)
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: auth shutdown
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Stop transport: done
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Drain on shutdown: stop_transport done
  INFO  2016-09-21 16:06:23,688 [shard 0] tracing - Asked to shut down
  INFO  2016-09-21 16:06:23,688 [shard 0] tracing - Tracing is down
  INFO  2016-09-21 16:06:23,688 [shard 1] tracing - Asked to shut down
  INFO  2016-09-21 16:06:23,688 [shard 1] tracing - Tracing is down
  INFO  2016-09-21 16:06:23,688 [shard 0] storage_service - Drain on shutdown: tracing is stopped
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: flush column_families done
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: shutdown commitlog done
  INFO  2016-09-21 16:06:23,774 [shard 0] storage_service - Drain on shutdown: done
  INFO  2016-09-21 16:06:23,774 [shard 0] repair - Starting shutdown of repair
  INFO  2016-09-21 16:06:23,774 [shard 0] repair - Completed shutdown of repair
  INFO  2016-09-21 16:06:23,774 [shard 0] compaction_manager - Asked to stop
  INFO  2016-09-21 16:06:23,774 [shard 1] compaction_manager - Asked to stop
2016-09-26 06:29:40 +08:00
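The change in f377a3b7ac can be pictured as follows: on shutdown, every stream session that is still in progress is failed right away instead of being left to its keep-alive timer. This is a minimal Python sketch under that reading. The class and method names are hypothetical and do not mirror Scylla's C++ code; only the FAILED state name is taken from the logs above.

```python
class StreamSession:
    """A streaming session with one peer (hypothetical model)."""

    def __init__(self, peer):
        self.peer = peer
        self.state = "STREAMING"

    def complete(self):
        self.state = "COMPLETE"

    def fail(self):
        # Only sessions that are still in progress can be failed.
        if self.state == "STREAMING":
            self.state = "FAILED"


class StreamManager:
    """Tracks sessions and fails the unfinished ones on shutdown."""

    def __init__(self):
        self.sessions = []

    def add(self, session):
        self.sessions.append(session)

    def stop(self):
        # Do not wait for any keep-alive timeout: fail everything that
        # has not finished yet so that shutdown can proceed.
        for session in self.sessions:
            session.fail()
```

Sessions that already completed are left untouched, so only the ones that would otherwise block shutdown change state.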
Asias He
7c873f0d1f repair: Reduce unnecessary streaming traffic
If the remote peers have the same checksum, we can fetch from only
one of the peer nodes instead of all of them, since they all have the same
data anyway.

In addition to the above optimization, if the local peer has no data, we can
skip sending the data back to the remote peers. Since all the remote peers
have the same checksum and the local peer has no data, each and every
remote peer already has all the data. There is no need to merge
the remote data with local data and send the merged data back to the
remote peers.

Refs: #1617
2016-09-26 06:28:51 +08:00
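The two optimizations in 7c873f0d1f can be summarized in a small decision function. This is a Python sketch with hypothetical names. It assumes the range has already been found to need repair, and it assumes that when the remote checksums differ the plan falls back to fetching from and sending to every peer (the commit message does not describe that case).

```python
def plan_repair(remote_checksums, local_has_data):
    """Decide which peers to fetch from and which to send merged data to.

    remote_checksums maps peer name -> checksum of the range on that peer.
    Returns (fetch_from, send_to) as two sorted lists of peer names.
    """
    peers = sorted(remote_checksums)
    all_same = len(set(remote_checksums.values())) == 1
    if all_same:
        # Every remote peer holds the same data: one source is enough.
        fetch_from = peers[:1]
        # With no local data there is nothing to merge, so nothing
        # needs to be sent back.
        send_to = peers if local_has_data else []
    else:
        # Assumed fallback: treat every peer as both source and target.
        fetch_from = peers
        send_to = peers
    return fetch_from, send_to
```

For two peers with equal checksums and an empty local node, the plan fetches from one peer and sends nothing back.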
Asias He
99e77e8ec2 repair: Do not abort the repair when one range is failed
failed_ranges is added to track the ranges that fail during repair.
2016-09-26 06:28:51 +08:00
Asias He
81c98ff3d9 repair: Reduce stream_plan usage
Right now, we are using one stream_plan for each range of a column
family. This generates tons of stream_plans and stream_sessions. Each
stream_plan can transfer multiple ranges and column families. We can
use a single stream_plan to stream data for multiple ranges and column
families, so that 1) the overhead of stream_plan/session negotiation is
reduced and 2) it is much easier to debug/monitor fewer stream_sessions.

Fixes #1685
2016-09-26 06:28:50 +08:00
Asias He
a0020fdad2 stream_session: Allow adding ranges to a cf more than once
Append the ranges to a stream_transfer_task if the cf is already added to
_transfers in add_transfer_ranges.
2016-09-26 06:28:50 +08:00
Asias He
576e15532f streaming: Add append_ranges for stream_transfer_task
Allow appending more ranges to transfer to a stream transfer task.
2016-09-26 06:28:50 +08:00
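Taken together, a0020fdad2 and 576e15532f let one session accumulate ranges for a column family across several calls. A minimal Python sketch of that behavior follows. `TransferTask` and `TransferSession` are hypothetical stand-ins for stream_transfer_task and stream_session, and the dictionary stands in for `_transfers`.

```python
class TransferTask:
    """Ranges to transfer for one column family (hypothetical model)."""

    def __init__(self, cf_id, ranges):
        self.cf_id = cf_id
        self.ranges = list(ranges)

    def append_ranges(self, ranges):
        self.ranges.extend(ranges)


class TransferSession:
    def __init__(self):
        self.transfers = {}  # cf_id -> TransferTask

    def add_transfer_ranges(self, cf_id, ranges):
        task = self.transfers.get(cf_id)
        if task is None:
            self.transfers[cf_id] = TransferTask(cf_id, ranges)
        else:
            # The cf was added before: extend its existing task
            # instead of creating a second one.
            task.append_ranges(ranges)
```

Calling `add_transfer_ranges` twice for the same column family leaves a single task holding both sets of ranges, which is what allows one stream_plan to carry many ranges.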
Avi Kivity
3057ca05bc Merge "Improve loggging when nodes are decommissioned" from Asias
"When a node is decommissioned, its gossip state is not removed from gossip
immediately. It is only removed 3 days later, which lets nodes that were
down when the node was decommissioned learn about the decommission later,
when they are up again.

This series improves the logging to reduce confusion when a node tries to
talk to a decommissioned node. In addition, we no longer try to talk to the
decommissioned node in the unreachable_endpoints gossip round.

Fixes #1615"

* tag 'asias/loggging_decommissioned_nodes/v1' of github.com:cloudius-systems/seastar-dev:
  gossip: Make two log items debug level
  gossip: Print node status when node is UP or DOWN
  gossip: Ignore the node which is decommissioned in gossip round
  gossip: Print convict debug info only when the node is alive
  gossip: Add more timing log in add_expire_time_for_endpoint
  streaming: Print on_remove and on_restart log when peer exists
  streaming: Introduce has_peer in stream_manager
2016-09-25 15:19:13 +03:00
Asias He
830f4ee353 gossip: Make two log items debug level
It is duplicated with the "InetAddress x.x.x.x is now UP" message.

INFO  2016-09-23 10:35:15,512 [shard 0] gossip - Node 127.0.0.1 has restarted, now UP, status = NORMAL
INFO  2016-09-23 10:35:15,513 [shard 0] gossip - InetAddress 127.0.0.1 is now UP, status = NORMAL

Make the log a bit cleaner.
2016-09-25 07:17:19 +08:00
Asias He
a26a26963c gossip: Print node status when node is UP or DOWN
For example:

gossip - InetAddress 127.0.0.4 is now UP, status = NORMAL
gossip - InetAddress 127.0.0.3 is now DOWN, status = LEFT
gossip - InetAddress 127.0.0.1 is now DOWN, status = shutdown
2016-09-25 07:17:19 +08:00
Asias He
1d9401d080 gossip: Ignore the node which is decommissioned in gossip round
If the node is decommissioned, there is no point in trying to contact it
again in the gossip round.
2016-09-25 07:17:19 +08:00
Asias He
4b73443222 gossip: Print convict debug info only when the node is alive
2016-09-25 07:17:19 +08:00
Asias He
99a2ae0fb5 gossip: Add more timing log in add_expire_time_for_endpoint
It tells when the node is expected to expire and how many seconds are
left.
2016-09-25 07:17:19 +08:00
Asias He
40f7a355a0 streaming: Print on_remove and on_restart log when peer exists
We print the following messages even if there is no stream_session with
that peer. It is a bit confusing.

  INFO  2016-09-23 08:26:37,254 [shard 0] stream_session - stream_manager:
  Close all stream_session with peer = 127.0.0.1 in on_restart

  INFO  2016-09-23 08:26:37,287 [shard 0] stream_session - stream_manager:
  Close all stream_session with peer = 127.0.0.3 in on_remove

Print only when the streaming session with the peer exists.
2016-09-25 07:17:19 +08:00
Asias He
2ac4ce77a9 streaming: Introduce has_peer in stream_manager
It is used to query if a streaming peer with inet_address exists.
2016-09-25 07:17:13 +08:00
Nadav Har'El
fe1ba753ce Avoid semaphore's default initial value
The fact that Seastar's semaphore has a default initial value of 1 if not
explicitly initialized is confusing and unexpected, and recently led to
two bugs. So ScyllaDB should not rely on this default behavior, and should
specify the initial value of each semaphore explicitly.

In several cases in the ScyllaDB code, the explicit initialization was
missing, and this patch adds it. In one case (rate_limiter) I even think
the default of 1 was a bit strange, and 0 makes more sense.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1474530745-23951-1-git-send-email-nyh@scylladb.com>
2016-09-24 19:25:02 +03:00
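The same pitfall exists outside Seastar: Python's `threading.Semaphore()` also starts at 1 when no value is given. The sketch below is an analogy in Python, not Seastar code, and shows how spelling out the initial value makes the intent visible. The variable names are hypothetical.

```python
import threading

# Explicit initial values: one unit available right away, versus
# nothing available until someone calls release().
one_at_a_time = threading.Semaphore(1)
wait_for_signal = threading.Semaphore(0)

got_first = one_at_a_time.acquire(blocking=False)     # True
got_second = wait_for_signal.acquire(blocking=False)  # False

wait_for_signal.release()
got_after_release = wait_for_signal.acquire(blocking=False)  # True
```

A semaphore meant to start empty, as the message suggests for rate_limiter, behaves very differently from one that silently starts at 1.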
Paweł Dziepak
eb59b4c4ab keys: disable constructing from generic range
stdx::optional<T> uses quite elaborate std::enable_if_t magic to decide
whether the argument passed to its constructor should be used to call
T's constructor or stdx::optional<T>'s constructor.

Apparently, with GCC 6.2, having a T constructor which accepts any type
confuses that magic and we end up with compile errors.

The solution is to have a from_range() method that replaces the
constructor from a range. There is also a constructor that creates a key
from std::vector<bytes> so that code generated by IDL works as it did
before.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1474550971-15309-1-git-send-email-pdziepak@scylladb.com>
2016-09-24 18:57:01 +03:00
Raphael S. Carvalho
cfe7419f0f sstables: update or remove some outdated comments
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <74bae503447da2544a005e29b7d3aafa9f6e8c90.1474383273.git.raphaelsc@scylladb.com>
2016-09-24 18:53:19 +03:00
Raphael S. Carvalho
0f1bd3c527 db: fix clustering key filter
When date tiered strategy is enabled, filter_sstable_for_reader()
was returning more sstables than needed because the return type
of serialized_tri_compare::operator() was wrong, which results
in bad performance.

tgrabiec: Refs #1449

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <0301d7588e33c7bbb8cd80fed20a1827926a8fff.1474585088.git.raphaelsc@scylladb.com>
2016-09-23 12:33:58 +02:00
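One way a wrong return type on a three-way comparison can make a filter too permissive is sketched below in Python. The commit does not say what the wrong type was. The sketch assumes it was a boolean, so that "less" and "greater" both collapse into a truthy value, and the filter shape is hypothetical as well.

```python
def tri_compare(a, b):
    """Three-way compare: negative, zero, or positive."""
    return (a > b) - (a < b)

def tri_compare_as_bool(a, b):
    """Same comparison squeezed through a boolean (the assumed bug)."""
    return bool(tri_compare(a, b))

def may_contain(sstable_min, sstable_max, key, cmp):
    # Skip the sstable only when key sorts before min or after max.
    if cmp(key, sstable_min) < 0 or cmp(sstable_max, key) < 0:
        return False
    return True
```

With the boolean variant the result is never negative, so no sstable is ever skipped and the reader is handed more sstables than it needs, which is consistent with the symptom described in the commit.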
Asias He
e352570f52 conf: Move initial_token to supported section in scylla.yaml
initial_token is actually supported

Fixes #1686
Message-Id: <465da088696f72a3a7bcf19ba8e4895a0a648e7c.1474512235.git.asias@scylladb.com>
2016-09-23 09:34:05 +03:00
Tomasz Grabiec
0b0d126721 Merge seastar upstream
Fixes #1622.
Fixes #1690.

* seastar 40a68fa...2b55789 (5):
  > input_stream: Fix possible infinite recursion in consume()
  > iostream: Fix stack overflow in output_stream::split_and_put()
  > condition_variable: fix spurious wakeup
  > Merge "assorted rpc fixes" from Gleb
  > Merge "Simple fixes for doxygen" from Glauber
2016-09-22 14:27:52 +02:00
Amnon Heiman
a6749116a7 scylla_setup: Install node_exporter
This adds the option to install node_exporter during setup.

The node_exporter exports server information via the prometheus API.
It should be used together with the scylla prometheus API to get the server
information.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-09-22 09:43:53 +03:00
Amnon Heiman
4e0dcb59e7 scylla.spec: package the node_exporter scripts
This patch adds the node_exporter related files to the rpm.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-09-22 09:43:53 +03:00
Amnon Heiman
3d242fdb4d Add a link to node_exporter_install
This adds a link to node_exporter_install in sbin, so it will be
available in the path.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-09-22 09:43:53 +03:00
Amnon Heiman
9d3edd3a28 service file for node_exporter with systemd
This patch adds a service file for OSes that support systemd.

When started, it would run an already installed node_exporter or fail.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-09-22 09:43:53 +03:00
Amnon Heiman
801b2c4914 An installation script for node_exporter
node_exporter is a utility that exports node information via the prometheus
API. It takes care of host-related metrics such as CPU and memory.

The install script downloads the node_exporter binaries and creates a link
in /usr/bin.

On OSes with systemd support, it enables and starts the installed
service file to run as a service. On others (ubuntu), it creates a conf file and starts it.

The installation should be done using sudo.

After a successful installation, the node_exporter would run as a
service.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2016-09-22 09:33:03 +03:00
Paweł Dziepak
906250dcbd Merge "Enhance GDB script with new LSA-related commands" from Tomek
2016-09-21 13:22:00 +01:00
Raphael S. Carvalho
67343798cf api: implement api to return sstable count per level
'nodetool cfstats' wasn't showing per-level sstable count because
the API wasn't implemented.

Fixes #1119.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <0dcdf9196eaec1692003fcc8ef18c77d0834b2c6.1474410770.git.raphaelsc@scylladb.com>
2016-09-21 09:13:40 +03:00
Asias He
aa47265381 gossip: Fix std::out_of_range in setup_collectd
It is possible that endpoint_state_map does not contain the entry for
the node itself when collectd accesses it.

Fixes the issue:

Sep 18 11:33:16 XXX scylla[19483]: [shard 0] seastar - Exceptional
future ignored: std::out_of_range (_Map_base::at)

Fixes #1656

Message-Id: <8ffe22a542ff71e8c121b06ad62f94db54cc388f.1474377722.git.asias@scylladb.com>
2016-09-20 19:38:16 +03:00
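The failure mode in aa47265381 is a lookup that throws when the key is missing. In Python terms, `std::map::at` behaves like `d[key]`, which raises, while a guarded lookup behaves like `d.get(key)`. The sketch below is an analogy only: the map contents, the `generation` field, and the fallback value of 0 are all hypothetical.

```python
# Hypothetical: endpoint -> state, possibly still empty when a metrics
# collector first asks for the local node's entry.
endpoint_state_map = {}

def reported_generation(endpoint):
    # dict.get() returns None for a missing key, where indexing with
    # endpoint_state_map[endpoint] would raise KeyError.
    state = endpoint_state_map.get(endpoint)
    if state is None:
        return 0  # assumed fallback while the entry does not exist yet
    return state["generation"]
```

The collector then reports a placeholder until the entry appears, instead of surfacing an ignored exception.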
Tomasz Grabiec
69aec3835f scylla-gdb: Enhance 'scylla ptr' to show if object is managed by LSA
Example:

  (gdb) scylla ptr 0x601000480000
  thread 1, large, LSA-managed

One can then use 'scylla lsa-segment 0x601000480000' to examine LSA
segment contents.
2016-09-20 16:53:23 +02:00
Tomasz Grabiec
486a92092b scylla-gdb: Add 'scylla segment-descs' command
Displays information about shard's segment descriptors. One can see
which segments belong to LSA, what's their occupancy, etc.

(gdb) scylla segment-descs
...
0x601000940000: lsa free=26092 region=0x60100036d890 zone=0x6010000fb420
0x601000980000: lsa free=26092 region=0x60100036d890 zone=0x6010000fb420
0x6010009c0000: lsa free=261153 region=0x60100036fcf0 zone=0x6010000fb420
0x601000a00000: std
0x601000a40000: lsa free=25508 region=0x60100036d890 zone=0x6010000fb420
0x601000a80000: std
0x601000ac0000: lsa free=26092 region=0x60100036d890 zone=0x6010000fb420
0x601000b00000: lsa free=26092 region=0x60100036d890 zone=0x6010000fb420
0x601000b40000: std
...
2016-09-20 16:53:23 +02:00
Tomasz Grabiec
b0b28696b5 scylla-gdb: Add 'scylla lsa-segment' command
Allows one to examine contents of LSA segment.

Example:

  (gdb) scylla lsa-segment 0x601000480000
  0x601000480e70: live size=144 migrator=standard_migrator<cache_entry>::object
  0x601000480f10: live size=144 migrator=standard_migrator<cache_entry>::object
  0x601000480fb0: free size=192
  0x60100048107e: free size=42
  0x6010004814e0: free size=192
  0x6010004815ae: free size=40
  0x6010004815e8: free size=192
  0x6010004816b8: live size=144 migrator=standard_migrator<cache_entry>::object
  0x601000481758: free size=192
  ...
2016-09-20 16:53:21 +02:00
Tomasz Grabiec
5011b77e15 scylla-gdb: Add std::vector wrapper
Makes vector values itearable from python level.
2016-09-20 16:53:20 +02:00
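A wrapper of the kind described in 5011b77e15 might look like the sketch below. It is not the code from scylla-gdb: the class name is an assumption, and it relies on the libstdc++ layout in which a std::vector keeps its `_M_start` and `_M_finish` pointers inside `_M_impl`. The object passed in is expected to be a `gdb.Value`, which supports field access with `[]`, pointer arithmetic, comparison, and `dereference()`.

```python
class std_vector:
    """Iterate over a std::vector held in a gdb.Value (libstdc++ layout)."""

    def __init__(self, ref):
        self.ref = ref

    def __len__(self):
        impl = self.ref['_M_impl']
        # Pointer difference gives the number of elements.
        return int(impl['_M_finish'] - impl['_M_start'])

    def __iter__(self):
        impl = self.ref['_M_impl']
        item = impl['_M_start']
        end = impl['_M_finish']
        while item != end:
            yield item.dereference()
            item = item + 1
```

Inside a GDB Python script this would allow writing `for element in std_vector(value): ...` for any vector-typed value.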
Pekka Enberg
42dd4670dc transport/server: Add CQL frame Snappy compression support
Fixes #1286
Message-Id: <1474370861-5928-1-git-send-email-penberg@scylladb.com>
2016-09-20 12:33:36 +01:00
Pekka Enberg
acc93509a2 transport/server: Fix CQL connection compression negotiation
Benoît Canet points out that CQL messages are not always compressed
although compression is enabled by the driver. Turns out our CQL
compression negotiation is broken. We need to negotiate compression upon
STARTUP message and not rely on the incoming request to have the
compression bit enabled.

Fixes #1680
Message-Id: <1474366693-3001-1-git-send-email-penberg@scylladb.com>
2016-09-20 11:19:27 +01:00
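The fix in acc93509a2 amounts to remembering, per connection, what the driver asked for in STARTUP and using that when building responses. The Python sketch below models only that bookkeeping. The class and method names are hypothetical, the set of supported algorithms is an assumption, and no real compression is performed. The `COMPRESSION` option key and the 0x01 compression flag are taken from the CQL native protocol.

```python
COMPRESSION_FLAG = 0x01  # compression bit in the CQL frame header flags


class Connection:
    SUPPORTED = {"snappy", "lz4"}  # assumed set of algorithms

    def __init__(self):
        self.compression = None  # nothing negotiated yet

    def on_startup(self, options):
        """Record the algorithm the driver requested in STARTUP."""
        requested = options.get("COMPRESSION")
        if requested is not None:
            algorithm = requested.lower()
            if algorithm not in self.SUPPORTED:
                raise ValueError("unsupported compression: " + requested)
            self.compression = algorithm

    def response_flags(self, request_flags):
        """Flags for a response frame.

        The decision depends on what was negotiated at STARTUP, not on
        whether this particular request had the compression bit set.
        """
        return COMPRESSION_FLAG if self.compression else 0x00
```

A request that arrives uncompressed on a connection that negotiated snappy still gets a response marked as compressed, which is the behavior the commit message describes as missing.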