53948 Commits

Author SHA1 Message Date
Avi Kivity
979b7116d0 Merge seastar upstream
* seastar 5efee51...23f4fae (2):
  > stream: use a smaller stick for failed producers
  > stream: properly set exception
2015-08-25 15:24:49 +03:00
Avi Kivity
ab113731e8 Merge seastar upstream
* seastar 6444208...5efee51 (1):
  > json2code: Add enum support in return type
2015-08-25 14:32:46 +03:00
Calle Wilund
2f8bda1364 Main: Do commit log replay at startup 2015-08-25 09:41:57 +02:00
Calle Wilund
2a1c7d2587 CommitLogReplayer: Java -> C++
Initial implementation
2015-08-25 09:41:56 +02:00
Calle Wilund
df8d7a8295 Database: Add "flush_all_memtables" 2015-08-25 09:41:56 +02:00
Calle Wilund
cfcfa34028 Compaction: propagate metadata replay position from compacted tables 2015-08-25 09:41:55 +02:00
Calle Wilund
71204648fb SStables: put memtable replay_position in metadata on write 2015-08-25 09:41:55 +02:00
Calle Wilund
86a97fea4c Commitlog: Allow skipping X bytes in commit log reader
Also refactor reader into named methods for debugging sanity.
2015-08-25 09:41:55 +02:00
Calle Wilund
37cfc09e91 Commitlog: Handle full paths in descriptor file name parse. 2015-08-25 09:41:55 +02:00
Calle Wilund
4364d72ca3 Commitlog: Expose convenience method "list_existing_segments" 2015-08-25 09:41:54 +02:00
Calle Wilund
a3a02968ab Commitlog: Expose list_existing_descriptors 2015-08-25 09:41:54 +02:00
Calle Wilund
fcb87471b9 Commitlog: Make file reader provide replay_position for entries 2015-08-25 09:40:53 +02:00
Calle Wilund
db6370ad87 Commitlog: Make descriptor type visible/usable from outside 2015-08-25 09:40:53 +02:00
Calle Wilund
5524da8f18 Database: do not create shard-specific dirs for commitlog
New ID scheme allows for a single dir for all segments from all shards.
2015-08-25 09:40:52 +02:00
Calle Wilund
4f24b9795e Commitlog: change the ID generation scheme
* Make it more like origin, i.e. based on wall clock time of app start
* Encode shard ID in the RP segment ID, to ensure RPs and segment names
  are unique per shard
2015-08-25 09:40:52 +02:00
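The ID scheme described in 4f24b9795e can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the 8-bit shard field, the names `make_segment_id`, `shard_of`, `base_of` and `segment_id_generator`, and the use of a millisecond boot timestamp as the first counter value are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: the commit message does not say how many bits
// carry the shard ID, so 8 bits are assumed here for illustration.
constexpr unsigned shard_bits = 8;
constexpr unsigned base_bits = 64 - shard_bits;
constexpr uint64_t base_mask = (uint64_t(1) << base_bits) - 1;

// Pack the shard number into the high bits and the base ID into the low bits.
inline uint64_t make_segment_id(unsigned shard, uint64_t base) {
    assert(shard < (1u << shard_bits));
    return (uint64_t(shard) << base_bits) | (base & base_mask);
}

// Recover the two components from a packed ID.
inline unsigned shard_of(uint64_t id) { return unsigned(id >> base_bits); }
inline uint64_t base_of(uint64_t id) { return id & base_mask; }

// Per-shard generator: the first ID is derived from the wall clock time of
// app start, and every later ID increments from there.
class segment_id_generator {
    unsigned _shard;
    uint64_t _next;
public:
    segment_id_generator(unsigned shard, uint64_t boot_time_ms)
        : _shard(shard), _next(boot_time_ms) {}
    uint64_t next() { return make_segment_id(_shard, _next++); }
};
```

Because the shard number is part of every ID, two shards started at the same instant never produce the same segment ID, which is what allows a single directory to hold segments from all shards.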
Calle Wilund
366263d866 Commitlog test: remove some hardcoded assumptions on segment IDs
To enable changing the ID generation scheme.
2015-08-25 09:14:40 +02:00
Calle Wilund
45d07d2744 runtime: expose boot_time
(boot == app start, I did not rename the var).
2015-08-25 09:14:40 +02:00
Calle Wilund
0ae7707106 SStables: Use db::commitlog::replay_position (not own type) 2015-08-25 09:14:39 +02:00
Avi Kivity
185a94135b Merge "export bloom filter statistics through management API" from Glauber
"This patchset implements the functions requested in Issue 95.
They are now made available through the management interface."

Fixes #95.
2015-08-25 08:57:30 +03:00
Glauber Costa
ca2d058520 api/column family: bloom filter file size
Export information about on-disk space used by bloom filters.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-24 20:14:31 -05:00
Glauber Costa
3dc135c380 api/column family: bloom filter ratios
Just like the simple statistics, but derived from them.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-24 20:14:28 -05:00
Glauber Costa
c094ba22c8 api/column family: bloom filter statistics
This patch uses the now existing infrastructure to expose statistics about the bloom
filters hit/miss rates.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-24 19:18:54 -05:00
Glauber Costa
2bfc2697c1 sstables: add method to grab filter size
This is one of the statistics we need to export

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-24 19:18:54 -05:00
Glauber Costa
7f04c1bf9b sstables: simplify filter tracker
The current filter tracker uses a distributed mechanism, even though the values
for all CPUs but one are usually at zero. This is because I wrongly assumed
that, when using legacy sstables, the same sstable would be serving keys for
multiple shards, making the map reduce a necessary operation in this case.

However, Avi points out that:

"It is and it isn't [the case].  Yes the sstable will be loaded on multiple cores, but
each core will have its own independent sstable object (only the files on disk
are shared).

So to aggregate statistics on such a shared sstables, you have to match them by
name (and the sharded<filter_tracker> is useless)."

Avi is correct in his remarks. The code will hereby be simplified by keeping
local counters only, and the map reduce operation will happen at a higher
level.

Also, because the users of the get methods will go through the sstable, we can
actually just move them there. With that we can leave the counters private to
the external world in the filter itself.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-24 19:17:31 -05:00
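The approach described in 7f04c1bf9b (plain local counters per sstable object, with aggregation done at a higher level by matching sstables by name) might look something like the sketch below. The counter names, the `sstable_stats` struct and `aggregate_by_name` are hypothetical and chosen only for the example; the real code runs on Seastar shards rather than on a vector of vectors.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Local, per-sstable-object counters. All fields are zero-initialized so
// that no statistic can start out with a bogus value.
struct filter_tracker {
    uint64_t false_positive = 0;
    uint64_t true_positive = 0;
    uint64_t true_negative = 0;

    void add(const filter_tracker& o) {
        false_positive += o.false_positive;
        true_positive += o.true_positive;
        true_negative += o.true_negative;
    }
};

// Hypothetical view of one sstable object as seen on one shard.
struct sstable_stats {
    std::string name;
    filter_tracker filter;
};

// The "higher level" reduce step: each shard contributes its own sstable
// objects, and objects backed by the same files are matched by name.
inline std::map<std::string, filter_tracker>
aggregate_by_name(const std::vector<std::vector<sstable_stats>>& per_shard) {
    std::map<std::string, filter_tracker> total;
    for (const auto& shard : per_shard) {
        for (const auto& sst : shard) {
            total[sst.name].add(sst.filter);
        }
    }
    return total;
}
```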
Glauber Costa
df56020d58 filter: initialize all statistics
We are currently initializing only some of the filter statistics. That can lead
to bogus values.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-24 17:10:57 -05:00
Raphael S. Carvalho
b2f76273bd tests: check correctness of sstable ancestor metadata
adding testcase for that purpose.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-24 15:25:52 -03:00
Raphael S. Carvalho
3ea6de4fc1 sstables: add method to get compaction metadata
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-24 15:06:24 -03:00
Avi Kivity
0e089d2a93 Merge 2015-08-24 19:16:42 +03:00
Gleb Natapov
bfa1ec31b9 drop explicit copying of captured parameters in mutation sending code
Now they are copied implicitly during call to messaging_service::send_mutation()
2015-08-24 19:16:37 +03:00
Avi Kivity
0617aecb62 lsa: downgrade "no compactible pool" warning to trace
It's a fairly standard condition.
2015-08-24 17:26:48 +02:00
Avi Kivity
4390be3956 Rename 'negative_mutation_reader' to 'partition_presence_checker'
Suggested by Tomek.
2015-08-24 18:03:22 +03:00
Raphael S. Carvalho
c65af6e188 api: add get_unleveled_sstables to column family api
Add an API function that returns the count of sstables in L0 if the
leveled compaction strategy is enabled, and 0 otherwise. We don't
support the leveled compaction strategy yet, so the function
currently always returns zero.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-24 11:56:31 -03:00
Raphael S. Carvalho
41b6d430c0 compaction_manager: do not retry compaction if stopping task
If stopping a task, we shouldn't retry a compaction because if
removing a cf, we would push back the cf into the back of the
queue if an error happened, and that would possibly lead to a
use-after-free.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-24 11:23:24 -03:00
Raphael S. Carvalho
4c9c144987 compaction_manager: avoid concurrent compaction on the same cf
It was noticed that the same sstable files could be selected for
compaction if concurrent compaction happens on the same cf.
That's possible because compaction manager uses 2 tasks for
handling compactions.

Solution is to not duplicate cf in the compaction manager queue,
and re-schedule compaction for a cf if needed.

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-24 11:11:47 -03:00
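The fix described in 4c9c144987 (never keep the same cf twice in the queue, and re-submit it later if another compaction is needed) can be illustrated with a small sketch. The class `compaction_queue` and its methods are hypothetical, and column families are represented by plain strings here rather than by the real `column_family` objects.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>

// A pending-compaction queue that refuses duplicate entries, so two worker
// tasks cannot both pick up the same column family from the queue.
class compaction_queue {
    std::deque<std::string> _pending;
public:
    // Returns false (and leaves the queue unchanged) if cf is already queued.
    bool submit(const std::string& cf) {
        if (std::find(_pending.begin(), _pending.end(), cf) != _pending.end()) {
            return false;
        }
        _pending.push_back(cf);
        return true;
    }

    // Pops the oldest entry into out; returns false if the queue is empty.
    bool pop(std::string& out) {
        if (_pending.empty()) {
            return false;
        }
        out = _pending.front();
        _pending.pop_front();
        return true;
    }

    std::size_t size() const { return _pending.size(); }
};
```

In this sketch a task that finishes compacting a cf would call `submit()` again if more work remains, which corresponds to the re-scheduling step mentioned in the commit message.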
Avi Kivity
76703cceab Merge seastar upstream
* seastar cfa8db0...6444208 (1):
  > file: don't be noisy during close-in-destructor
2015-08-24 16:06:44 +03:00
Avi Kivity
3c29f255fb Merge seastar upstream
* seastar e359379...cfa8db0 (6):
  > rpc: test rcp timeout
  > rpc: add timeout support
  > tests: add (another) allocator test
  > memory: try harder to allocate large blocks
  > memory: avoid allocation in realloc() if shrinking object size
  > Update dpdk submodule
2015-08-24 15:24:57 +03:00
Avi Kivity
e222d2a463 Merge "storage_service update: fix bootstrap state" from Asias
"The bootstrap state is now correct with cqlsh query."
2015-08-24 14:24:29 +03:00
Asias He
0d047e86d7 storage_service: Kill one FIXME in join_token_ring
string to token function is now available.
2015-08-24 18:54:42 +08:00
Asias He
ce0435c105 storage_service: Switch to use get0 instead of std::get
It is simpler.
2015-08-24 18:54:42 +08:00
Asias He
1deaef1cc5 storage_service: Use get() inside set_tokens
set_tokens always runs inside a seastar thread. Use get() instead of
returning a future. Preparation for calling get_local_tokens.
2015-08-24 18:54:42 +08:00
Asias He
7c4703cebf storage_service: Enable get_local_tokens
Will be used in set_tokens.
2015-08-24 18:54:42 +08:00
Asias He
dfae224cb4 storage_service: Update license for storage_service.cc
This file contains code converted from Origin.
2015-08-24 18:54:42 +08:00
Asias He
cd1c902cf9 storage_service: Call boot_strapper::bootstrap
Now that boot_strapper::bootstrap is available, use it. It sets
_is_bootstrap_mode to false, so we can now enable the
assert(!_is_bootstrap_mode) that follows the call to bootstrap.
2015-08-24 18:54:42 +08:00
Asias He
22ee468428 db/system_keyspace: Fix set_bootstrap_state
We set status to COMPLETED in join_token_ring

   set_bootstrap_state(db::system_keyspace::bootstrap_state::COMPLETED)

but

   cqlsh 127.0.0.$i -e "SELECT * from system.local;"

shows

    bootstrapped -> IN_PROGRESS

The static sstring state_name is the bad boy.
2015-08-24 18:54:42 +08:00
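The commit message for 22ee468428 only says that the static sstring state_name was at fault. One plausible reading, assumed here and not confirmed by the log, is the usual function-local static pitfall: the variable is initialized on the first call and keeps that value on every later call. The enum, `to_name`, and both functions below are hypothetical, and std::string stands in for sstring.

```cpp
#include <string>

enum class bootstrap_state { NEEDS_BOOTSTRAP, COMPLETED, IN_PROGRESS };

inline const char* to_name(bootstrap_state s) {
    switch (s) {
    case bootstrap_state::NEEDS_BOOTSTRAP: return "NEEDS_BOOTSTRAP";
    case bootstrap_state::COMPLETED:       return "COMPLETED";
    case bootstrap_state::IN_PROGRESS:     return "IN_PROGRESS";
    }
    return "UNKNOWN";
}

// Buggy variant: the static local is initialized exactly once, on the first
// call, so every later call returns the name of the first state it saw.
inline std::string state_name_buggy(bootstrap_state s) {
    static std::string name = to_name(s);
    return name;
}

// Fixed variant: an ordinary local is recomputed on every call.
inline std::string state_name_fixed(bootstrap_state s) {
    std::string name = to_name(s);
    return name;
}
```

Under this assumption, a first write with IN_PROGRESS followed by a write with COMPLETED would still store IN_PROGRESS, which matches the symptom seen in the cqlsh output.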
Asias He
126fc5869c dht/boot_strapper: Move code to source file
get_bootstrap_tokens and get_random_tokens are moved.
2015-08-24 18:54:42 +08:00
Asias He
26861ddc29 dht/boot_strapper: Use unordered_set for tokens
unordered_set is used everywhere for tokens. This makes it easier to
construct a boot_strapper object in storage_service::bootstrap where
unordered_set is used for tokens.
2015-08-24 18:54:42 +08:00
Asias He
2ebd08cb42 dht/boot_strapper: Partially implement bootstrap 2015-08-24 18:54:42 +08:00
Asias He
8ae0b6e875 storage_service: Fix call to set_bootstrap_state
It returns a future. We can not ignore it.
2015-08-24 18:54:42 +08:00
Asias He
52dcba1319 storage_service: Kill seastar thread usage in join_token_ring
join_token_ring is called in two places, one is
storage_service::init_server and the other is
storage_service::join_ring.

The former is already inside a seastar thread. The latter is not, but it
is rarely called. We can make join_ring run inside a seastar thread, so
that join_token_ring always runs inside a seastar thread and we can get
rid of creating a thread inside join_token_ring.
2015-08-24 18:54:42 +08:00
Avi Kivity
8a4648761c tests: make test cql environment use volatile system keyspace
Prevents hangs due to the database not being able to persist a memtable.

Tested-by: Asias He <asias@cloudius-systems.com>
2015-08-24 13:50:22 +03:00