Commit Graph

9009 Commits

Author SHA1 Message Date
Pekka Enberg
227daecba6 Revert "dist: move setup scripts to /usr/sbin"
This reverts commit 989357189a because it
broke our Jenkins packaging jobs.
2016-03-29 10:17:05 +03:00
Pekka Enberg
d1ec97e76f Revert "dist: re-enable clocksource=tsc on AMI"
This reverts commit 050fb911d5 in
preparation for reverting 989357189a.
2016-03-29 10:16:48 +03:00
Takuya ASADA
050fb911d5 dist: re-enable clocksource=tsc on AMI
clocksource=tsc on boot parameter mistakenly dropped on b3c85aea89, need to re-enable.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1459180643-4389-1-git-send-email-syuu@scylladb.com>
2016-03-29 09:53:23 +03:00
Asias He
62d443a07d streaming: Fix log of plan_id and session address in stream_session
They are get swapped. Fix it up. Spotted by looking at the log.
Message-Id: <d163d71e9a96d1a45c3a4c529519790eeff7c486.1459172778.git.asias@scylladb.com>
2016-03-29 09:01:06 +03:00
Nadav Har'El
a05577ca41 sstable: fix read failure of certain sstables
We had a problem reading certain existing Cassandra sstables into
Scylla.

Our consume_range_tombstone() function assumes that the start and end
columns have a certain "end of component" markers, and want to verify
that assumption. But because of bugs in older versions of Cassandra,
see https://issues.apache.org/jira/browse/CASSANDRA-7593, sometimes the
"end of component" was missing (set to 0). CASSANDRA-7593 suggested
this problem might exist on the start column, so we allowed for that,
but now we discovered a case where also the end column is set to 0 -
causing the test in consume_range_tombstone() to fail and the sstable
read to fail - causing Scylla to no be able to import that sstable from
Cassandra. Allowing for an 0 also on the end column made it possible
to read that sstable, compact it, and so on.

Fixes #1125.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459173964-23242-1-git-send-email-nyh@scylladb.com>
2016-03-28 17:09:37 +03:00
Duarte Nunes
db881fdc8f cql: Add support for pg-style string literal
This patch adds support for pg-style string literals to the CQL
grammar.

Fixes #1078

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1459093238-2529-1-git-send-email-duarte@scylladb.com>
2016-03-28 17:06:03 +03:00
yan cui
e5d1c031ac dist: add ubuntu docker file 2016-03-28 10:14:12 +03:00
Avi Kivity
a919113fdb schema_tables: fix deadlock in cross-node communications
Seastar wrongly limits the number of concurrent submit_to()s to a single
remote shard.  This can cause an ABBA deadlock:

  fiberA                fiberB (x127)
  submit_to(0)                         # lock schema
  <- returns
                        submit_to(0)   # lock schema (waits)
  submit_to(0)                         # do work (waits)

The fiberBs wait for fiberA, which in turn waits for a fiberB to return.

While the correct fix is to remote the client-side limit and replace it
with a server-side per-verb limit, we start with a simpler fix that
replaces the blocking lock call with a non-blocking call, removing the
deadlock.

Fixes #1088.

Message-Id: <1459095357-28950-1-git-send-email-avi@scylladb.com>
2016-03-28 10:12:10 +03:00
Raphael Carvalho
e6e5999282 Fix corner-case in refresh
Problem found by dtest which loads sstables with generation 1 and 2 into an
empty column family. The root of the problem is that reshuffle procedure
changes new sstables to start from generation 2 at least. So reshuffle could
try to set generation 1 to 2 when generation 2 exists.
This problem can be fixed by starting from generation 1 instead, so reshuffle
would handle this case properly.

Fixes #1099.

Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com>
Message-Id: <88c51fbda9557a506ad99395aeb0a91cd550ede4.1458917237.git.raphaelsc@scylladb.com>
2016-03-27 10:03:32 +03:00
Avi Kivity
077c0d1022 dist: ami: fix AMI_OPT receiving no value
We assign AMI=0 and AMI_OPT=1, so in the true case, AMI_OPT has no value,
and a later compare fails.
2016-03-26 21:16:28 +03:00
Takuya ASADA
989357189a dist: move setup scripts to /usr/sbin
Since these scripts are user command, should be on $PATH.

Fixes #1092

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458860407-25269-1-git-send-email-syuu@scylladb.com>
2016-03-25 11:50:13 +03:00
Takuya ASADA
2582dbe4a0 dist/ami: use tilde for release candidate builds
Sync with ubuntu package versioning rule

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458882718-29317-1-git-send-email-syuu@scylladb.com>
2016-03-25 11:34:28 +03:00
Glauber Costa
e750a94300 sanity check Seastar's I/O queue configuration
While Seastar in general can accept any parameter for its I/O queues, Scylla
in particular shouldn't run with them disabled. Such will be the status when
the max-io-requests parameter is not enabled.

On top of that, we would like to have enough depth per I/O queue not to allow
for shard-local parallelism. Therefore, we will require a minimum per-queue
capacity of 4. In machines where the disk iodepth is not enough to allow for 4
concurrent requests per shard, one should reduce the number of I/O queues.

For --max-io-requests, we will check the parameter itself. However, the
--num-io-queues parameter is not mandatory, and given enough concurrent
requests, Seastar's default configuration can very well just be doing the right
thing. So for that, we will check the final result of each I/O queue.

As it is the case with other checks of the sorts, this can be overridden by
the --developer-mode switch.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <63bf7e91ac10c95810351815bb8f5e94d75592a5.1458836000.git.glauber@scylladb.com>
2016-03-25 11:33:57 +03:00
Tomasz Grabiec
53bbcf4a1e schema_tables: Wait for notifications to be processed.
Listeners may defer since:

 93015bcc54 "migration_manager: Make the migration callbacks runs inside seastar thread"

Not all places were adjusted to wait for them. Fix that.

Message-Id: <1458837613-27616-1-git-send-email-tgrabiec@scylladb.com>
2016-03-24 19:04:12 +02:00
Avi Kivity
12744217b8 Initial github issue template
Message-Id: <1458817106-1513-1-git-send-email-avi@scylladb.com>
2016-03-24 15:37:00 +02:00
Benoît Canet
4ac1126677 collectd: Write to the network to get rid of spurious log messages
Closes #1018

Suggested-by: Avi Kivity <avi@scylladb.com>
Signed-of-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458759378-4935-1-git-send-email-benoit@scylladb.com>
2016-03-24 12:34:14 +02:00
Calle Wilund
ff5df306e3 database: Use disk-marking delete function in discard_sstables
Fixes #797

To make sure an inopportune crash after truncate does not leave
sstables on disk to be considered live, and thus resurrect data,
after a truncate, use delete function that renames the TOC file to
make sure we've marked sstables as dead on disk when we finish
this discard call.
Message-Id: <1458575440-505-2-git-send-email-calle@scylladb.com>
2016-03-24 12:02:08 +02:00
Calle Wilund
4e52b41a46 sstables: Add delete func to rename TOC ensuring table is marked dead
Note: "normal" remove_by_toc_name must now be prepared for and check
if the TOC of the sstable is already moved to temp file when we
get to the juicy delete parts.
Message-Id: <1458575440-505-1-git-send-email-calle@scylladb.com>
2016-03-24 12:01:53 +02:00
Asias He
6fd6e57e80 streaming: Harden keep alive timer
- Do nothing in case the session is closed, to prevent we fire up the
  timer again

- Print log info when no progress has been made if the time expires, it
  is very useful to debug a idle session

- Grab a reference when the keep alive timer is running

Message-Id: <9f2cc3164696905a6a39c0d072a980765d598dfd.1458782956.git.asias@scylladb.com>
2016-03-24 11:58:54 +02:00
Avi Kivity
112a930f92 Merge "Bring back simplify session completion logic" from Asias
"The following patches are reverted becasue they were thought they break
Glauber's "Make sure repairs do not cripple incoming load" series. It turns out
these two patches just made another bug more visisble. The bug is fixed in
c2eff7e824 (streaming: Complete receive task
after the flush). We can bring the two patches back now.

Passed repair_additional_test.py and update_cluster_layout_tests.py with smp 2."
2016-03-24 11:57:20 +02:00
Tomasz Grabiec
341b509f68 cql_test_env: Make initialization exception-safe
Currently start() is not prepared to handle exceptions thrown from
service initialization. It's easy to trigger such exceprion by
starting two tests at the same time, which will result in socket bind
error.

Exception thrown from start() typically results in assertion failures
like this one:

  seastar::sharded<Service>::~sharded() [with Service = database]: Assertion `_instances.empty()' failed.

This patch fixes the problem by combining start() and stop() in a
single do_with() and using RAII for stopping services.

Now exceptions thrown from service initialization should stop services
in proper order and let the original exception to pass
through. Example result:

  fatal error in "test_new_schema_with_no_structural_change_is_propagated": std::runtime_error: bind: Address already in use
Message-Id: <1458768018-27662-1-git-send-email-tgrabiec@scylladb.com>
2016-03-24 11:20:01 +02:00
Shlomi Livne
d3a91e737b fix a collision betwen --ami command line param and env
sysconfig scylla-server includes an AMI, the script also used an AMI
variable fix this by renaming the script variable

6a18634f9f introduced this issue since it
started imported the sysconfig scylla-server

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <0bc472bb885db2f43702907e3e40d871f1385972.1458767984.git.shlomi@scylladb.com>
2016-03-24 08:14:41 +02:00
Asias He
fe263e5436 Revert "Revert "streaming: Start to send mutations after PREPARE_DONE_MESSAGE""
This reverts commit 1f29a698d5.
2016-03-24 08:43:17 +08:00
Asias He
a6dd6e6d55 Revert "Revert "streaming: Simplify session completion logic""
This reverts commit 354fca9d56.
2016-03-24 07:48:27 +08:00
Gleb Natapov
0afd1c6f0a config: enable truncate_request_timeout_in_ms option
Option truncate_request_timeout_in_ms is used by truncate. Mark it as
used.

Message-Id: <20160323162649.GH2282@scylladb.com>
2016-03-23 18:50:24 +02:00
Yoav Kleinberger
91269d0c15 tools/scyllatop: add sums to aggregate view
the aggregate view now supports both sums and means.

Signed-off-by: Yoav Kleinberger <yoav@scylladb.com>
Message-Id: <1328af8efb113a786d7402b0704220108bfb28db.1458749600.git.yoav@scylladb.com>
2016-03-23 18:49:57 +02:00
Shlomi Livne
6a18634f9f scylla_io_setup import scylla-server env args
scylla_io_seup requires the scylla-server env to be setup to run
correctly. previously scylla_io_setup was encapsulated in
scylla-io.service that assured this.

extracting CPUSET,SMP from SCYLLA_ARGS as CPUSET is needed for invoking
io_tune

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <d49af9cb54ae327c38e451ff76fe0322e64a5f00.1458747527.git.shlomi@scylladb.com>
2016-03-23 17:54:06 +02:00
Pekka Enberg
8bf3d4f550 Merge "Make sure repairs do not cripple incoming load" from Glauber
"This series makes sure that the influence of repairs on the ongoing loads is
limited. This patch does not fix the situation completely, but it will be the
best we can do for 1.0

Here's a brief explanation about some potentially contentions points, and future work:

1) With the old parallelism semaphore in tree, we could never really drop parallelism
below 256, since even with (local) parallelism = 1, we would still have 256 vnodes. So while
the number 100 is totally empirical, we know for a fact that around 200-something, we
start having real trouble. (total) parallelism = 100 is enough to allow us to survive
a load as much as 3 times heavier than the load described in Issue944. So while it is
empirical, at least it is based on something

2) I totally support changing the checksumming algorithm. However, I would rather focus
my efforts on testing this to exhaustion than doing this at the moment. But if anybody
wants to do it, I think it is a great thing to have before 1.0. Specially because
we'll probably need a new verb for that, so we would be better off having it from the start

3) This problem was made harder due to the fact that there are three conditions really that
can affect the ongoing load. Only one of them needs to trigger for us to see degradation, so
fixing them individually will usually buy us nothing.

Those are:
    a) The disk bandwidth. Since the mutations are all together in the same memtable/commitlog
       as normal memtables, we can differentiate between them from the I/O Scheduler perspective.
       This is not an issue of course if the incoming mutations are not enough for us to saturate
       the disk, but specially given the highly parallel nature of repair, we usually will. If
       the commitlog queue starts getting too big, for instance, new requests will start being
       put to wait. The effect of this part of the series is to *completely* shift the high waiting
       times from those classes to the streaming ones (unfortunately compaction is still affected,
       but that's fine IMHO). With the new streaming classes, the waiting time of a memtable / commitlog
       requests is still kept in the microseconds range. The streaming classes, on the other hand,
       will be in the hundreds of milliseconds range, or even seconds.

    b) The memory consumption: since the whole problem that leads to a) is the fact that due to high
       disk activity some requests will have to wait, we will end up with a lot of streaming memtables
       not yet flushed. Because of that, we will start throttling new incoming CQL requests and all
       the isolation efforts are rendered useless. Once again, due to the highly parallel nature of
       repair, this turned out to be a very easy condition to trigger. The solution proposed here
       is to limit a maximum amount of dirty memory for the repair job (in here, 25 %). This way,
       we can endure even slightly heavier loads without sweating too much.

    c) The task scheduler: repair generates a ton of requests for range checksums, and we actually
       want to keep it that way - so that the ranges checksummed are small enough so we don't have
       to resend a lot of mutations for no reason. However, if we pile up thousands of continuations
       in the task scheduler, seastar has absolutely no mechanism (right now) to prioritize between
       different kinds of requests. That means that the continuations that are supposed to be handling
       user requests will simply not for a long time. Even if the Seastar load is less than 100 %
       that is still a problem, since that is just adding hundreds of milliseconds worth of latencies to
       any request processing.

Fixes #944 and fixes #1033."
2016-03-23 16:07:06 +02:00
Yoav Kleinberger
d2cfb86dc8 tools/scyllatop: defend against unexpected strings from collectd
Signed-off-by: Yoav Kleinberger <yoav@scylladb.com>
Message-Id: <cd7ecf6b3b82bd2027179cbec4e689a946469e9a.1458740337.git.yoav@scylladb.com>
2016-03-23 16:05:59 +02:00
Asias He
c2eff7e824 streaming: Complete receive task after the flush
A STREAM_MUTATION_DONE message will signal the receiver that the sender
has completed the sending of streams mutations. When the receiver finds
it has zero task to send and zero task to receive, it will finish the
stream_session, and in turn finish the stream_plan if all the
stream_sessions are finished. We should call receive_task_completed only
after the flush finishes so that when stream_plan is finshed all the
data is on disk.

Fixes repair_disjoint_data_test issue with Glauber's "[PATCH v4 0/9] Make
sure repairs do not cripple incoming load" serries

======================================================================
FAIL: repair_disjoint_data_test
(repair_additional_test.RepairAdditionalTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "scylla-dtest/repair_additional_test.py",
line 102, in repair_disjoint_data_test
    self.check_rows_on_node(node1, 3000)
  File "scylla-dtest/repair_additional_test.py",
line 33, in check_rows_on_node
    self.assertEqual(len(result), rows, len(result))
AssertionError: 2461
2016-03-23 09:40:49 -04:00
Glauber Costa
f49e965d78 repair: rework repair code so we can limit parallelism
The repair code as it is right now is a bit convoluted: it resorts to detached
continuations + do_for_each when calling sync_ranges, and deals with the
problem of excessive parallelism by employing a semaphore inside that range.

Still, even by doing that, we still generate a great number of
checksum requests because the ranges themselves are processed in parallel.

It would be better to have a single-semaphore to limit the overall parallelism
for all requests.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:40:49 -04:00
Glauber Costa
34a9fc106f database: keep streaming memtables in their own region group
Theoretically, because we can have a lot of pending streaming memtables, we can
have the database start throttling and incoming connections slowing down during
streaming.

Turns out this is actually a very easy condition to trigger. That is basically
because the other side of the wire in this case is quite efficient in sending
us work. This situation is alleviated a bit by reducing parallelism, but not
only it does't go away completely, once we have the tools to start increasing
parallelism again it will become common place.

The solution for this is to limit the streaming memtables to a fraction of the
total allowed dirty memory. Using the nesting capability built in in the LSA
regions, we will make the streaming region group a child of the main region
group.  With that, we can throttle streaming requests separately, while at the
same time being able to control the total amount of dirty memory as well.

Because of the property, it can still be the case that incoming requests will
throttle earlier due to streaming - unless we allow for more dirty memory to be
used during repairs - but at least that effect will be limited.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:40:47 -04:00
Glauber Costa
455d5a57d2 streaming memtables: coalesce incoming writes
The repair process will potentially send ranges containing few mutations,
definitely not enough to fill a memtable. It wants to know whether or not each
of those ranges individually succeeded or failed, so we need a future for each.

Small memtables being flushed are bad, and we would like to write bigger
memtables so we can better utilize our disks.

One of the ways to fix that, is changing the repair itself to send more
mutations at a single batch. But relying on that is a bad idea for two reasons:

First, the goals of the SSTable writer and the repair sender are at odds. The
SSTable writer wants to write as few SSTables as possible, while the repair
sender wants to break down the range in pieces as small as it can and checksum
them individually, so it doesn't have to send a lot of mutations for no reason.

Second, even if the repair process wants to process larger ranges at once, some
ranges themselves may be small. So while most ranges would be large, we would
still have potentially some fairly small SSTables lying around.

The best course of action in this case is to coalesce the incoming streams
write-side.  repair can now choose whatever strategy - small or big ranges - it
wants, resting assure that the incoming memtables will be coalesced together.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:38:22 -04:00
Glauber Costa
5fa866223d streaming: add incoming streaming mutations to a different sstable
Keeping the mutations coming from the streaming process as mutations like any
other have a number of advantages - and that's why we do it.

However, this makes it impossible for Seastar's I/O scheduler to differentiate
between incoming requests from clients, and those who are arriving from peers
in the streaming process.

As a result, if the streaming mutations consume a significant fraction of the
total mutations, and we happen to be using the disk at its limits, we are in no
position to provide any guarantees - defeating the whole purpose of the
scheduler.

To implement that, we'll keep a separate set of memtables that will contain
only streaming mutations. We don't have to do it this way, but doing so
makes life a lot easier. In particular, to write an SSTable, our API requires
(because the filter requires), that a good estimate on the number of partitions
is informed in advance. The partitions also need to be sorted.

We could write mutations directly to disk, but the above conditions couldn't be
met without significant effort. In particular, because mutations can be
arriving from multiple peer nodes, we can't really sort them without keeping a
staging area anyway.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:13:00 -04:00
Glauber Costa
10c8ca6ace priority manager: separate streaming reads from writes
Streaming has currently one class, that can be used to contain the read
operations being generated by the streaming process. Those reads come from two
places:

- checksums (if doing repair)
- reading mutations to be sent over the wire.

Depending on the amount of data we're dealing with, that can generate a
significant chunk of data, with seconds worth of backlog, and if we need to
have the incoming writes intertwined with those reads, those can take a long
time.

Even if one node is only acting as a receiver, it may still read a lot for the
checksums - if we're talking about repairs, those are coming from the
checksums.

However, in more complicated failure scenarios, it is not hard to imagine a
node that will be both sending and receiving a lot of data.

The best way to guarantee progress on both fronts, is to put both kinds of
operations into different classes.

This patch introduces a new write class, and rename the old read class so it
can have a more meaningful name.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:12:59 -04:00
Glauber Costa
78189de57f database: make seal_on_overflow a method of the memtable_list
Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:12:59 -04:00
Glauber Costa
635bb942b2 database: move add_memtable as a method of the memtable_list
The column family still has to teach the memtable list how to allocate a new memtable,
since it uses CF parameters to do so.

After that, the memtable_list's constructor takes a seal and a create function and is complete.
The copy constructor can now go, since there are no users left.
The behavior of keeping a reference to the underlying memtables can also go, since we can now
guarantee that nobody is keeping references to it (it is not even a shared pointer anymore).
Individual memtables are, and users may be keeping references to them individually.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:12:59 -04:00
Glauber Costa
6ba95d450f database: move active_memtable to memtable_list
Each list can have a different active memtable. The column family method keeps
existing, since the two separate sets of memtable are just an implementation
detail to deal with the problem of streaming QoS: *the* active memtable keeps
being the one from the main list.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:12:59 -04:00
Glauber Costa
af6c7a5192 database: create a class for memtable_list
memtable_list is currently just an alias for a vector of memtables.  Let's move
them to a class on its own, exporting the relevant methods to keep user code
unchanged as much as possible.

This will help us keeping separate lists of memtables.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-03-23 09:12:59 -04:00
Avi Kivity
8ed95754c0 Merge seastar upstream
* seastar 9f2b868...aa281bd (7):
  > shared_promise: Add move assignment operator
  > lowres_clock: Fix stretched time
  > scripts: Delete tap with ip instead of tunctl
  > vla: Actually be exception-safe
  > vla: Ensure memory is freed if ctor throws
  > vla: Ensure memory is correctly freed
  > net: Improve error message when parsing invalid ipv4 address
2016-03-23 14:39:31 +02:00
Takuya ASADA
50db64de33 dist: drop -j2 option on .spec, make build_rpm.sh able to specify -j option
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1458678665-30273-1-git-send-email-syuu@scylladb.com>
2016-03-23 13:32:14 +02:00
Gleb Natapov
48c83163b9 init: make more initialization threaded
Since initialization now runs in a thread storage, messaging and
gossiper services initialization code may take advantage of it too.

Message-Id: <20160323094732.GF2282@scylladb.com>
2016-03-23 11:53:11 +02:00
Shlomi Livne
4ecc37111f dist/ami: Use the actual number of disks instead of AWS meta service
We have seen in some cases that when using the boto api to start
instances the aws metadata service
http://169.254.169.254/latest/meta-data/block-device-mapping/ returns
incorrect number of disks - workaround that by checking the actual
number of disks using lsblk

Adding a validation at the end verifying that after all computations the
NR_IO_QUEUES will not be greater then the number of shards (we had an
issue with i2.8x)

Fixes: #1062

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
Message-Id: <54c51cd94dd30577a3fe23aef3ce916c01e05504.1458721659.git.shlomi@scylladb.com>
2016-03-23 10:47:08 +02:00
Raphael Carvalho
370b1336fe service: fix refresh
Vlad and I were working on finding the root of the problems with
refresh. We found that refresh was deleting existing sstable files
because of a bug in a function that was supposed to return the maximum
generation of a column family.
The intention of this function is to get generation from last element
of column_family::_sstables, which is of type std::map.
However, we were incorrectly using std::map::end() to get last element,
so garbage was being read instead of maximum generation.
If the garbage value is lower than the minimum generation of a column
family, then reshuffle_sstables() would set generation of all existing
sstables to a lower value. That would confuse our mechanism used to
delete sstables because sstables loaded at boot stage were touched.
Solution to this problem is about using rbegin() instead of end() to
get last element from column_family::_sstables.

The other problem is that refresh will only load generations that are
larger than or equal to X, so new sstables with lower generation will
not be loaded. Solution is about creating a set with generation of
live SSTables from all shards, and using this set to determine whether
a generation is new or not.

The last change was about providing an unused generation to reshuffle
procedure by adding one to the maximum generation. That's important to
prevent reshuffle from touching an existing SSTable.

Tested 'refresh' under the following scenarios:
1) Existing generations: 1, 2, 3, 4. New ones: 5, 6.
2) Existing generations: 3, 4, 5, 6. New ones: 1, 2.
3) Existing generations: 1, 2, 3, 4. New ones: 7, 8.
4) No existing generation. No new generation.
5) No existing generation. New ones: 1, 2.
I also had to adapt existing testcase for reshuffle procedure.

Fixes #1073.

Signed-off-by: Raphael Carvalho <raphaelsc@scylladb.com>
Message-Id: <1c7b8b7f94163d5cd00d90247598dd7d26442e70.1458694985.git.raphaelsc@scylladb.com>
2016-03-23 10:21:58 +02:00
Benoît Canet
1594bdd5bb dist/ubuntu: Fix the init script variable sourcing
The variable sourcing was crashing the init script on ubuntu.
Fix it with the suggestion from Avi.

Signed-off-by: Benoît Canet <benoit@scylladb.com>
Message-Id: <1458685099-1160-1-git-send-email-benoit@scylladb.com>
2016-03-23 09:03:17 +02:00
Tomasz Grabiec
5f44afa311 cql3: batch_statement: Execute statements sequentially
Currently we execute all statements in parallel, but some statements
depend on order, in particular list append/prepend. Fix by executing
sequentially.

Fixes cql_additional_tests.py:TestCQL.batch_and_list_test dtest.

Fixes #1075.

Message-Id: <1458672874-4749-1-git-send-email-tgrabiec@scylladb.com>
2016-03-22 20:59:40 +02:00
Pekka Enberg
354fca9d56 Revert "streaming: Simplify session completion logic"
This reverts commit 208b7fa7ba. It breaks
Glauber's upcoming repair series.
2016-03-22 20:37:50 +02:00
Pekka Enberg
1f29a698d5 Revert "streaming: Start to send mutations after PREPARE_DONE_MESSAGE"
This reverts commit 4c06221766. It breaks
Glauber's upcoming repair series.
2016-03-22 20:37:22 +02:00
Avi Kivity
7df21768d6 Merge "Fix row_cache_alloc_stress test" from Tomasz
"The test predates LSA zones and was not anticipating that LSA would
take much more free memory from the system than it needs in its assertions.
Fix by accounting for the fact properly."
2016-03-22 18:46:31 +02:00
Avi Kivity
b8f80bb2be Update scylla-ami submodule
* dist/ami/files/scylla-ami 56f1ab7...89e7436 (1):
  > Merge "iotune packaging fix for scylla-ami" from Takuya
2016-03-22 17:55:00 +02:00