
390 Commits

Author SHA1 Message Date
Avi Kivity
42dc29619d Merge "Optimize mutation copies and moves" from Paweł
"This series deals with copies and moves of mutation. The former are dealt
with by adding std::move() and missing 'mutable' (in case of lambdas). The
latter are improved by storing mutation_partition externally thus removing
the need for moving mutation_partition each time mutation is moved.

Storing mutation_partition externally is obviously trading the cost of
move constructor for the cost of allocation which shows in perf_mutation
results since mutations aren't moved in that test.

perf_mutation (-c 1):
before: 3289520.06 tps
after:  3183023.37 tps
diff: -3.24%

perf_simple_query (read):
before: 526954.05 tps
after:  577225.16 tps
diff: +9.54%

perf_simple_query (write):
before: 731832.70 tps
after:  734923.60 tps
diff: +0.42%

Fixes #150 (well, not completely)."
2015-09-03 12:05:28 +03:00
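The merge message above names two techniques: adding `std::move()` plus a `mutable` lambda so that a captured value can be moved out rather than copied, and keeping the large payload behind a pointer so that moving the owner is cheap. The sketch below illustrates both under stated assumptions: `partition_data`, `mutation_like`, and `pass_through` are hypothetical stand-ins, not the project's `mutation` or `mutation_partition` types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a large payload that is costly to move member by member.
struct partition_data {
    std::vector<std::string> rows;
};

// Hypothetical stand-in for a mutation: the payload lives on the heap, so
// moving the owner moves one pointer (at the price of one allocation up front).
class mutation_like {
    std::unique_ptr<partition_data> _p;
public:
    explicit mutation_like(std::vector<std::string> rows)
        : _p(std::make_unique<partition_data>(partition_data{std::move(rows)})) {}
    mutation_like(mutation_like&&) noexcept = default;
    mutation_like& operator=(mutation_like&&) noexcept = default;

    bool moved_from() const { return !_p; }
    std::size_t row_count() const {
        assert(_p && "row_count() called on a moved-from object");
        return _p->rows.size();
    }
};

// The lambda must be 'mutable': otherwise the captured 'm' is const inside the
// body, std::move(m) cannot select the move constructor, and (because this
// type is move-only) the code would not compile.
inline mutation_like pass_through(mutation_like m) {
    auto f = [m = std::move(m)]() mutable {
        return std::move(m);
    };
    return f();
}
```

Calling `pass_through(std::move(a))` leaves `a` in a moved-from state and hands the payload to the result without copying any rows.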
Paweł Dziepak
830a86258b db: avoid copying mutations
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-03 10:30:32 +02:00
Paweł Dziepak
ddec2b4d09 batchlog_manager: pass mutations by const ref
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-03 10:30:29 +02:00
Paweł Dziepak
8188896eb7 schema_tables: add missing mutable
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-09-03 10:30:25 +02:00
Glauber Costa
28f315fad4 system_keyspace: keep msg alive when needed
Fixes #266

Some callsites are fine: if we just get the message and process it, as is the
case with check_health for instance, msg will be alive and all is good. But if
we return a future inside the processing, msg must be kept alive. Classic bug,
appearing again.

Pekka saw this in practice in another bug. We haven't seen anything that is
related to this, but it is certainly wrong.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-09-03 09:11:07 +03:00
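The lifetime rule described in the commit above can be sketched without any asynchronous framework. This is an illustration only, assuming a plain `std::function` as a stand-in for a continuation that runs after the caller has returned; `message`, `deferred`, and `process_later` are hypothetical names and not part of the project's code.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical message type.
struct message {
    std::string payload;
};

// Hypothetical deferred continuation: it is invoked some time after the
// function that created it has returned.
using deferred = std::function<std::string()>;

// Capturing the shared_ptr by value keeps the message alive for as long as
// the continuation exists. Capturing a reference or raw pointer instead
// would leave it dangling once the caller's copy is destroyed.
inline deferred process_later(std::shared_ptr<message> msg) {
    assert(msg && "process_later() needs a non-null message");
    return [msg = std::move(msg)] { return msg->payload; };
}
```

If the processing finishes before the function returns, no extra ownership is needed; the capture only matters when work is deferred past the caller's scope.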
Pekka Enberg
ce39f9d57a db/system_keyspace: Fix use-after-free in build_dc_rack_info()
Fixes #264.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-02 16:37:34 +03:00
Calle Wilund
d95101664d Commitlog: Don't throw exceptions on unrecognized files in CL dir 2015-09-01 14:23:03 +02:00
Calle Wilund
1814f89730 Commitlog: Add some more metrics + accessors for json API
Fixes #99

Adding missing commitlog metrics to the rest API.

v2: Mis-send (clumsy fingers)
v3: Use map_reduce0 + subroutine for nicer code
v4: rebased on current master
v5: rebased yet again.

Since the _second_ file in this previous patch set was committed, and is
dependent on this very change below to even compile, some expediency might be
warranted.
2015-09-01 10:15:33 +03:00
Calle Wilund
9ba84e458a Commitlog: Handle partial writes in segment::cycle
* Fixes #247
* Re-introduce test_allocation_failure, but allow for the "failure" to not
  happen. I.e. if run with low memory settings, the test will check that
  allocation failure is graceful. With lots of memory it will check partial
  write.
2015-08-31 20:02:05 +03:00
Calle Wilund
d3a01072af CommitLogReplayer: Java -> C++
Initial implementation
2015-08-31 14:29:50 +02:00
Calle Wilund
bbf82e80d0 Commitlog: Allow skipping X bytes in commit log reader
Also refactor reader into named methods for debugging sanity.
2015-08-31 14:29:49 +02:00
Calle Wilund
da9ea641e5 Commitlog: Handle full paths in descriptor file name parse. 2015-08-31 14:29:48 +02:00
Calle Wilund
02d2bef1f2 Commitlog: Expose convenience method "list_existing_segments" 2015-08-31 14:29:48 +02:00
Calle Wilund
19052b3c09 Commitlog: Expose list_existing_descriptors 2015-08-31 14:29:48 +02:00
Calle Wilund
e068ffb5a5 Commitlog: Make file reader provide replay_position for entries 2015-08-31 14:29:47 +02:00
Calle Wilund
41b1ad8600 Commitlog: Make descriptor type visible/usable from outside 2015-08-31 14:29:47 +02:00
Calle Wilund
ea38b223bd Commitlog: change the ID generation scheme
* Make it more like origin, i.e. based on wall clock time of app start
* Encode shard ID in the RP segment ID, to ensure RP:s and segment names
  are unique per shard
2015-08-31 14:29:46 +02:00
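One way to make segment IDs unique per shard, as the commit above describes, is to pack the shard ID into the high bits of a 64-bit ID and keep a time-derived base in the low bits. The following is a sketch under assumptions: the 8-bit shard field, the bit layout, and the function names are all hypothetical, since the commit message does not state the actual encoding.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 8 bits for the shard, 56 bits for the time-derived base.
constexpr unsigned shard_bits = 8;
constexpr unsigned base_bits = 64 - shard_bits;
constexpr std::uint64_t base_mask = (std::uint64_t(1) << base_bits) - 1;

// Combine a shard ID and a base (e.g. wall clock milliseconds at start-up)
// into one segment ID. Two shards given the same base still get distinct IDs.
inline std::uint64_t make_segment_id(unsigned shard, std::uint64_t base) {
    assert(shard < (1u << shard_bits));
    assert(base <= base_mask);
    return (std::uint64_t(shard) << base_bits) | base;
}

// Recover the shard that produced a segment ID.
inline unsigned shard_of(std::uint64_t id) {
    return static_cast<unsigned>(id >> base_bits);
}

// Recover the time-derived base of a segment ID.
inline std::uint64_t base_of(std::uint64_t id) {
    return id & base_mask;
}
```

Keeping the shard in the high bits means IDs from one shard still sort by their base, which is convenient when comparing positions within a shard.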
Calle Wilund
0fcf7e3e91 Commitlog: Make "position" type 32-bit to align replay_position with Origin

* Note: removed commitlog_test:test_allocation_failure because with 
  segments limited to 4GB -> mutation limited to 2GB, actually forcing
  a fail is not guaranteed or even likely.
2015-08-31 14:29:44 +02:00
Calle Wilund
3f1a91b89c Commitlog: do not eagerly create first segment on init
Deferring makes it easier to separate old segments from new, which in turn
helps replay logic.
2015-08-31 13:11:44 +02:00
Avi Kivity
5f62f7a288 Revert "Merge "Commit log replay" from Calle"
Due to test breakage.

This reverts commit 43a4491043, reversing
changes made to 5dcf1ab71a.
2015-08-27 12:39:08 +03:00
Asias He
80c996a315 db/system_keyspace: Fix get_local_host_id
Before:
host_id in system.local is empty

After:
host_id in system.local is inserted correctly

This fixes a nasty problem where we always get a new host_id when
booting up a node with data.
2015-08-27 11:01:07 +03:00
Avi Kivity
43a4491043 Merge "Commit log replay" from Calle
"Initial implementation/transposition of commit log replay.

* Changes replay position to be shard aware
* Commit log segment ID:s now follow basically the same scheme as origin;
  max(previous ID, wall clock time in ms) + shard info (for us)
* SStables now use the DB definition of replay_position.
* Stores and propagates (compaction) flush replay positions in sstables
* If CL segments are left over from a previous run, they, and existing
  sstables are inspected for high water mark, and then replayed from
  those marks to amend mutations potentially lost in a crash
* Note that CPU count change is "handled" insofar as shard matching is
  per the _previous_ run's shards, not the current one's.

Known limitations:
* Mutations deserialized from old CL segments are _not_ fully validated
  against existing schemas.
* System::truncated_at (not currently used) does not handle sharding afaik,
  so watermark ID:s coming from there are dubious.
* Mutations that fail to apply (invalid, broken) are not placed in blob files
  like origin. Partly because I am lazy, but also partly because our serial
  format differs, and we currently have no tools to do anything useful with it
* No replay filtering (Origin allows a system property to designate a filter
  file, detailing which keyspace/cf:s to replay). Partly because we have no
  system properties.

There is no unit test for the commit log replayer (yet), because I could
not really come up with a good one given the test infrastructure that
exists (tricky to kill stuff just "right").
The functionality is verified by manual testing, i.e. running scylla,
building up data (cassandra-stress), kill -9 + restart.
This of course does not really fully validate whether the resulting DB is
100% valid compared to the one at kill -9, but at least it verified that
replay took place, and mutations were applied.
(Note that origin also lacks validity testing)"
2015-08-27 10:53:36 +03:00
Glauber Costa
391eea564e system_tables: implement load_host_id
A simple translation from the original code.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-25 19:16:30 -05:00
Glauber Costa
0fd2861293 system_tables: implement load_tokens
A simple translation from the original code

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-25 19:16:30 -05:00
Calle Wilund
2a1c7d2587 CommitLogReplayer: Java -> C++
Initial implementation
2015-08-25 09:41:56 +02:00
Calle Wilund
86a97fea4c Commitlog: Allow skipping X bytes in commit log reader
Also refactor reader into named methods for debugging sanity.
2015-08-25 09:41:55 +02:00
Calle Wilund
37cfc09e91 Commitlog: Handle full paths in descriptor file name parse. 2015-08-25 09:41:55 +02:00
Calle Wilund
4364d72ca3 Commitlog: Expose convenience method "list_existing_segments" 2015-08-25 09:41:54 +02:00
Calle Wilund
a3a02968ab Commitlog: Expose list_existing_descriptors 2015-08-25 09:41:54 +02:00
Calle Wilund
fcb87471b9 Commitlog: Make file reader provide replay_position for entries 2015-08-25 09:40:53 +02:00
Calle Wilund
db6370ad87 Commitlog: Make descriptor type visible/usable from outside 2015-08-25 09:40:53 +02:00
Calle Wilund
4f24b9795e Commitlog: change the ID generation scheme
* Make it more like origin, i.e. based on wall clock time of app start
* Encode shard ID in the RP segment ID, to ensure RP:s and segment names
  are unique per shard
2015-08-25 09:40:52 +02:00
Asias He
22ee468428 db/system_keyspace: Fix set_bootstrap_state
We set status to COMPLETED in join_token_ring

   set_bootstrap_state(db::system_keyspace::bootstrap_state::COMPLETED)

but

   cqlsh 127.0.0.$i -e "SELECT * from system.local;"

shows

    bootstrapped -> IN_PROGRESS

The static sstring state_name is the bad boy.
2015-08-24 18:54:42 +08:00
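A function-local `static` is initialized only once, on the first call, which is one common way to end up with a stale value like the one reported above. The sketch below shows that general pitfall; it is an assumption about the class of bug, not a copy of the original code, and `to_name`, `state_name_buggy`, and `state_name_fixed` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Illustrative enum; the names mirror the states mentioned in the commit message.
enum class bootstrap_state { NEEDS_BOOTSTRAP, COMPLETED, IN_PROGRESS };

inline std::string to_name(bootstrap_state s) {
    switch (s) {
    case bootstrap_state::NEEDS_BOOTSTRAP: return "NEEDS_BOOTSTRAP";
    case bootstrap_state::COMPLETED:       return "COMPLETED";
    case bootstrap_state::IN_PROGRESS:     return "IN_PROGRESS";
    }
    assert(false && "unknown bootstrap_state");
    return "";
}

// Buggy: 'name' is initialized from the argument of the FIRST call only.
// Every later call returns that same value, whatever state is passed in.
inline const std::string& state_name_buggy(bootstrap_state s) {
    static std::string name = to_name(s);
    return name;
}

// Fixed: compute the name on every call.
inline std::string state_name_fixed(bootstrap_state s) {
    return to_name(s);
}
```

With the buggy variant, a first call with `IN_PROGRESS` followed by a call with `COMPLETED` still yields `IN_PROGRESS`, matching the symptom seen in `system.local`.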
Avi Kivity
8a4648761c tests: make test cql environment use volatile system keyspace
Prevents hangs due to the database not being able to persist a memtable.

Tested-by: Asias He <asias@cloudius-systems.com>
2015-08-24 13:50:22 +03:00
Pekka Enberg
9476e3f19f db/index: Kill Java code
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-24 11:51:50 +03:00
Pekka Enberg
544c7936d8 db/commitlog: Kill Java code
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-24 11:51:49 +03:00
Calle Wilund
ac74dd6159 Commitlog: Make "position" type 32-bit to align replay_position with Origin 2015-08-24 10:05:44 +02:00
Calle Wilund
d50986ef31 Commitlog: do not eagerly create first segment on init
Deferring makes it easier to separate old segments from new, which in turn
helps replay logic.
2015-08-24 10:05:44 +02:00
Pekka Enberg
5dbf1baed4 db/composites: Kill Java code
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-24 10:56:30 +03:00
Avi Kivity
7b67b04822 db: wire up max memtable size configuration 2015-08-19 13:17:27 +03:00
Asias He
67953a65b6 db/system_keyspace: Stub load_host_ids 2015-08-18 17:06:03 +08:00
Asias He
ab40ab6c19 db/system_keyspace: Stub load_tokens 2015-08-18 17:06:02 +08:00
Asias He
7f98a89968 db/system_keyspace: Introduce init_local_cache 2015-08-18 17:06:02 +08:00
Glauber Costa
0177c7fed1 system keyspace: implement get_bootstrap_state
To avoid spreading the futures all over, we will resort to a cache with this,
the same way we did for the dc/rack information.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
20590db87f system keyspace: implement set_bootstrap_state
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
8a50534119 system keyspace: implement get_saved_tokens
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
6a682d0e49 storage_service: futurize get_tokens
Because all its users are already futurized, this is actually an easy one.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
bebb2abe4b system keyspace: factor out local_cache start code
It will now be used for other values as well.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:36 -07:00
Calle Wilund
8f0f4e7945 Commitlog: do more extensive dir entry probes to determine type
Since directory_entry "type" might not be set.
Ensuring that code does not remain future free or easy to read.

Fixes #157.
2015-08-17 16:56:31 +03:00
Vlad Zolotarov
4e55033dc9 db::config: improve a help output for --endpoint_snitch parameter
- Improve the output formatting.
- Comment out unsupported snitches.

Fixes issue #124

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-08-16 12:55:34 +03:00