Commit Graph

146 Commits

Author SHA1 Message Date
Avi Kivity
b0fd850463 Merge "Code missing for a NetworkTopologyStrategy integration" from Vlad
"This series adds the code missing for integrating NetworkTopologyStrategy
with the current clustering WRITE path."
2015-07-02 17:31:49 +03:00
Avi Kivity
4ef3eaef4d Merge "Add tests for query interface on mutation level" from Tomasz 2015-07-02 16:35:27 +03:00
Vlad Zolotarov
89a7e84483 service: storage_proxy: implement datacenter_sync_write_response_handler
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-07-02 16:00:47 +03:00
Vlad Zolotarov
4a78d173f6 service: storage_proxy::send_to_live_endpoints(): properly query a datacenter name
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-07-02 16:00:47 +03:00
Vlad Zolotarov
acb8b9fcda service: storage_proxy::mutate(): properly query local_dc
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-07-02 16:00:47 +03:00
Gleb Natapov
c81cf80d8a fix mutation forwarding for multi-DC setup
The forwarding lambda is reused, so we cannot move captures out of it,
and we cannot pass references to them either, since the lambda can be
destroyed before the send completes.
2015-07-02 15:30:21 +03:00
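The constraint described in c81cf80d8a can be sketched in plain C++. This is only an illustration, not the project's code: payload, fake_sender and forward_to are hypothetical names, and std::shared_ptr stands in for whatever ownership scheme the real fix uses. The lambda is called once per destination, so it copies a shared pointer on each call rather than moving the capture out, and each queued send owns its own reference instead of referring back into the lambda.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a mutation payload; not the project's type.
struct payload {
    std::string data;
};

// Hypothetical asynchronous sender: work is queued and completed later,
// possibly after the lambda that scheduled it has been destroyed.
struct fake_sender {
    std::vector<std::function<std::string()>> pending;

    void send(std::shared_ptr<const payload> p) {
        // The queued task owns a reference, so the payload outlives the caller.
        pending.push_back([p] { return p->data; });
    }

    std::vector<std::string> complete_all() {
        std::vector<std::string> out;
        for (auto& task : pending) {
            out.push_back(task());
        }
        pending.clear();
        return out;
    }
};

// Calls the same forwarding lambda once per destination.
inline std::vector<std::string> forward_to(int destinations) {
    fake_sender sender;
    {
        auto p = std::make_shared<const payload>(payload{"mutation"});
        // Captured by value and copied on every call: nothing is moved out
        // of the capture, so later calls still see a valid payload.
        auto forward = [p, &sender] { sender.send(p); };
        for (int i = 0; i < destinations; ++i) {
            forward();
        }
    } // the lambda and the local pointer die here, before the sends complete
    return sender.complete_all();
}
```

Calling forward_to(3) completes three sends, each of which still sees the payload even though the lambda is gone by then.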
Tomasz Grabiec
a1f6dec067 result_set: Introduce from_raw_result() factory method 2015-07-02 13:25:46 +02:00
Tomasz Grabiec
c9e5508e3c result_set_builder: Make build() return unwrapped object
It's better to let the user decide which kind (if any) of smart
pointer to wrap it into.
2015-07-02 13:25:46 +02:00
Gleb Natapov
4b9661c608 initial read clustering code
Works only if all replicas (participating in CL) have the same live
data. Does not detect mismatches in tombstones (no infrastructure yet).
Does not report timeouts yet.
2015-07-01 13:36:30 +03:00
Gleb Natapov
97a4b0ee40 Store frozen_mutation in shared pointer while processing it
If a local mutation write takes longer than the write timeout, the mutation
will be deleted while it is being processed by the database engine. Fix this
by storing the mutation in a shared pointer and holding on to the pointer
until the mutation is locally processed.
2015-06-24 12:51:34 +03:00
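A minimal sketch of the lifetime fix described in 97a4b0ee40, assuming std::shared_ptr in place of the project's own pointer type. The names mutation_stub, handler_table, start_local_write and on_timeout are hypothetical. The timeout path drops the table's reference, while the deferred local write keeps its own, so the data stays valid until the write finishes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a frozen mutation.
struct mutation_stub {
    std::string bytes;
};

struct handler_table {
    std::unordered_map<int, std::shared_ptr<const mutation_stub>> handlers;

    // Registers the mutation and returns a deferred "local apply" task.
    // The task holds its own reference to the mutation.
    std::function<std::size_t()> start_local_write(int id, std::string bytes) {
        auto m = std::make_shared<const mutation_stub>(
            mutation_stub{std::move(bytes)});
        handlers.emplace(id, m);
        return [m] { return m->bytes.size(); };
    }

    // The write timeout removes the handler; the task's reference remains.
    void on_timeout(int id) {
        handlers.erase(id);
    }
};
```

Even if on_timeout() runs first, the returned task can still be invoked safely because it shares ownership of the mutation.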
Gleb Natapov
7d846e842c use write_request_timeout_in_ms for write request timeout
Fixes another fixme. Also change the default value to 2000, which seems to
be what Origin uses.
2015-06-24 12:51:33 +03:00
Gleb Natapov
12f3d53372 storage_proxy: cleanup leftovers from timer consolidation 2015-06-23 15:43:59 +03:00
Gleb Natapov
2be9dfc242 storage_proxy: use fb_utilities::get_broadcast_address()
fixes some fixmes
2015-06-23 15:43:59 +03:00
Gleb Natapov
67ea1b0ec8 Revert "db: hold onto write response handler until timeout handler is executed"
This reverts commit 52aa0a3f91.

After c9909dd183 this is no longer needed since reference to a
handler is not used in abstract_write_response_handler::wait() continuation.

Conflicts:
	service/storage_proxy.cc
2015-06-23 15:43:59 +03:00
Gleb Natapov
c9909dd183 cluster: consolidate mutation clustering timers
Currently mutation clustering uses two timers: one expires when the wait for
CL times out and is canceled when CL is achieved; the other expires if some
endpoints do not answer for a long time (CL may already be achieved at
this point, and the first timer will have been canceled). This is too
complicated, especially since both timers can expire simultaneously. Simplify
it by having only one timer and checking in the callback whether CL was achieved.
2015-06-23 14:42:56 +03:00
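The single-timer approach from c9909dd183 can be sketched as a small state object. This is an illustration under assumptions, not the actual handler: write_handler_stub, timeout_outcome and the member names are hypothetical, and the timer itself is left out so that only the decision made in its callback is shown.

```cpp
#include <cassert>
#include <cstddef>

// What the single timer callback concludes when it fires.
enum class timeout_outcome { cl_already_met, write_timeout };

// Hypothetical handler state; names are illustrative only.
struct write_handler_stub {
    std::size_t required_acks;  // acks needed to satisfy the consistency level
    std::size_t total_targets;  // every endpoint the mutation was sent to
    std::size_t received_acks;

    write_handler_stub(std::size_t required, std::size_t targets)
        : required_acks(required), total_targets(targets), received_acks(0) {}

    bool cl_achieved() const { return received_acks >= required_acks; }
    bool all_answered() const { return received_acks >= total_targets; }

    // Records one response. Returns true once every endpoint has answered,
    // which is the point where the single timer can be cancelled.
    bool on_ack() {
        ++received_acks;
        return all_answered();
    }

    // Single timer callback: decide by checking whether CL was reached,
    // instead of keeping a second timer for the slow endpoints.
    timeout_outcome on_timer_expired() const {
        return cl_achieved() ? timeout_outcome::cl_already_met
                             : timeout_outcome::write_timeout;
    }
};
```

With two required acks out of three targets, the callback reports a write timeout before the second ack and cl_already_met after it, even though one endpoint never answered.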
Asias He
5e1348e741 storage_service: Use get_local_snitch_ptr in gossip_snitch_info 2015-06-23 12:12:33 +03:00
Tomasz Grabiec
b8db713b81 service: Increase write timeout to 2 seconds
Current timeout is 100ms. cassandra-stress is failing for me often
because of this, with "Mutation write timeout" message.

The comment says that the timeout value is based on
DatabaseDescriptor.getWriteRpcTimeout(), which in Origin is equal to 2
seconds by default, so bump it up.

Code pointers:

DatabaseDescriptor:L844

    public static long getWriteRpcTimeout()
    {
        return conf.write_request_timeout_in_ms;
    }

Config:L74

  public volatile Long write_request_timeout_in_ms = 2000L;
2015-06-22 15:45:51 +03:00
Gleb Natapov
52aa0a3f91 db: hold onto write response handler until timeout handler is executed
If the last response comes after the write timeout is triggered, but before
the continuation that is supposed to handle it runs, the handler can be
removed too early and be accessed from the continuation after deletion. Fix
it by making the response handler a shared pointer instead of a unique one
and holding on to it in the timeout continuation.
2015-06-21 13:09:43 +03:00
Tomasz Grabiec
3779506990 db: query: Make partition_range hold ring_position
Current model was not really correct because Origin doesn't support
querying of partition ranges by their value. We can query slices
according to dht::decorated_key ordering, which orders partitions
first by token then by key value.

ring_position encapsulates range constraint. Key value is optional, in
which case only token is constrained.
2015-06-18 15:47:40 +02:00
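The ordering described in 3779506990 (token first, then key value, with the key optional) can be sketched as a three-way comparison. This is a simplified illustration: ring_position_stub and compare are hypothetical names, the token is reduced to an integer and the key to a string. Treating a position without a key as equal to any position with the same token is an assumption made here to express "only the token is constrained"; the real type may handle bounds differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical, simplified ring position: a token plus an optional key.
struct ring_position_stub {
    std::int64_t token;
    std::optional<std::string> key;
};

// Three-way comparison: negative, zero or positive, like strcmp.
inline int compare(const ring_position_stub& a, const ring_position_stub& b) {
    // Partitions are ordered by token first.
    if (a.token != b.token) {
        return a.token < b.token ? -1 : 1;
    }
    // Assumption: if either side has no key, only the token is constrained,
    // so equal tokens compare as equal.
    if (!a.key || !b.key) {
        return 0;
    }
    // Same token: fall back to the key value.
    const int c = a.key->compare(*b.key);
    return (c > 0) - (c < 0);
}
```

A position with a smaller token sorts first regardless of its key, and the key only breaks ties between positions that share a token.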
Gleb Natapov
d8dcceea09 stop storage and messaging services during exit 2015-06-18 15:13:02 +03:00
Gleb Natapov
a338407e29 make storage_proxy object distributed
storage_proxy now holds per-CPU state to track clustering, so it has to
be distributed; otherwise an SMP setup does not work.
2015-06-17 15:14:06 +02:00
Pekka Enberg
825588ed48 storage_proxy: Make clustering range configurable for query_local()
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-17 12:25:18 +03:00
Asias He
2682d442aa storage_service: Add stop() needed by distributed<> 2015-06-16 15:45:15 +08:00
Asias He
7b6ab5aaa1 storage_service: Implement gossip_snitch_info
Now we have rack and datacenter info injected into gossip.

{ 3 : Value(rack1,9) }  { 4 : Value(datacenter1,11) }
2015-06-16 15:44:30 +08:00
Asias He
13f2292596 storage_service: Use fb_utilities::get_broadcast_address 2015-06-16 15:08:44 +08:00
Avi Kivity
743b6efd54 Merge "initial mutation clustering" from Gleb 2015-06-15 13:25:01 +03:00
Gleb Natapov
969134280a initial mutation clustering code 2015-06-15 12:53:10 +03:00
Gleb Natapov
c500823d35 move init_storage_service out of main.cc 2015-06-15 12:51:04 +03:00
Asias He
d2a9ea7ca6 storage_service: Fix Unable to contact any seeds
Sleep before bootstrapping. This code was not converted from Origin.

With this we can start multiple nodes simultaneously.

./build/release/seastar -c 1 -m 128M --rpc-address 127.0.0.1
--listen-address 127.0.0.1 --seed-provider-parameters 127.0.0.1
--datadir `pwd`/tmp/1 --commitlog-directory `pwd`/tmp/1 2>&1 | tee
/tmp/out1 &

./build/release/seastar -c 1 -m 128M --rpc-address 127.0.0.2
--listen-address 127.0.0.2 --seed-provider-parameters 127.0.0.1
--datadir `pwd`/tmp/2 --commitlog-directory `pwd`/tmp/2 2>&1 | tee
/tmp/out2 &

./build/release/seastar -c 1 -m 128M --rpc-address 127.0.0.3
--listen-address 127.0.0.3 --seed-provider-parameters 127.0.0.1
--datadir `pwd`/tmp/3 --commitlog-directory `pwd`/tmp/3 2>&1 | tee
/tmp/out3 &

./build/release/seastar -c 1 -m 128M --rpc-address 127.0.0.4
--listen-address 127.0.0.4 --seed-provider-parameters 127.0.0.1
--datadir `pwd`/tmp/4 --commitlog-directory `pwd`/tmp/4 2>&1 | tee
/tmp/out4
2015-06-15 12:30:02 +03:00
Gleb Natapov
fc6f6634fa support query of multiple singular ranges 2015-06-11 15:18:07 +03:00
Gleb Natapov
b7155ad862 pass partitions_ranges separately from read_command
partitions_ranges will be split up for different destinations, so
provide it separately from read_command to avoid copying the latter
for each destination.
2015-06-11 15:18:07 +03:00
Vlad Zolotarov
73278798a9 added missing methods (stubs) required for snitch implementation
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - storage_service: add a non-const version of get_token_metadata().
   - get_broadcast_address(): check if net::get_messaging_service().local_is_initialized()
     before calling net::get_local_messaging_service().listen_address().
   - get_broadcast_address(): return an inet_address by value.
   - system_keyspace: introduce db::system_keyspace::endpoint_dc_rack
   - fb_utilities: use listen_address as broadcast_address for now
2015-06-09 15:33:29 +03:00
Pekka Enberg
56a790cdc4 service/storage_proxy: Fix query_local() to respect given key
We want query_local() to actually respect the key we pass to it. Fixes
an issue in keyspace merging code where we returned multiple rows for a
keyspace.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-06-05 13:15:12 +02:00
Asias He
1aac08b8ab Revert "storage_service: Remove ad-hoc token_metadata creation"
This reverts commit a19d2171eb.

This commit breaks cql_query_test.

   [asias@hjpc urchin]$ ./cql_query_test
   Running 1 test case...
   WARNING: Not implemented: COMPACT_TABLES
   WARNING: Not implemented: METRICS
   WARNING: Not implemented: PERMISSIONS
   cql_query_test: core/distributed.hh:290: Service&
   distributed<Service>::local() [with Service =
   service::storage_service]: Assertion `local_is_initialized()' failed.
   unknown location(0): fatal error in "test_create_keyspace_statement":
   signal: SIGABRT (application abort requested)
   tests/test-utils.cc(31): last checkpoint

   *** 1 failure detected in test suite "tests/urchin/cql_query_test.cc"
   (gdb) bt
   #0  0x00000032930348d7 in __GI_raise (sig=sig@entry=6) at
   ../sysdeps/unix/sysv/linux/raise.c:55
   #1  0x000000329303653a in __GI_abort () at abort.c:89
   #2  0x000000329302d47d in __assert_fail_base (fmt=0x3293186cb8
   "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
   assertion=assertion@entry=0x8ec10a "local_is_initialized()",
   file=file@entry=0x92508d "core/distributed.hh",
       line=line@entry=290, function=function@entry=0x8ed440
   <distributed<service::storage_service>::local()::__PRETTY_FUNCTION__>
   "Service& distributed<Service>::local() [with Service =
   service::storage_service]")
       at assert.c:92
   #3  0x000000329302d532 in __GI___assert_fail (assertion=0x8ec10a
   "local_is_initialized()", file=0x92508d "core/distributed.hh",
   line=290,
       function=0x8ed440
   <distributed<service::storage_service>::local()::__PRETTY_FUNCTION__>
   "Service& distributed<Service>::local() [with Service =
   service::storage_service]") at assert.c:101
   #4  0x0000000000430f19 in local (this=<optimized out>) at
   core/distributed.hh:290
   #5  get_local_storage_service () at service/storage_service.hh:3326
   #6  keyspace::create_replication_strategy (this=0x7ffff6bf8350) at
   database.cc:690
   #7  0x000000000061537a in
   _ZZZN2db20legacy_schema_tables15merge_keyspacesERN7service13storage_proxyEOSt3mapI13basic_sstringIcjLj15EE13lw_shared_ptrIN5query10result_setEESt4lessIS6_ESaISt4pairIKS6_SA_EEESI_ENKUlRT_E0_clISt6ve
   ctorISF_SG_EEEDaSK_ENKUlR8databaseE_clESQ_ () at
   db/legacy_schema_tables.cc:584
   #8  0x0000000000617d19 in operator() (__closure=0x7ffff6bf8650) at
   ./core/distributed.hh:284

In the test, storage_service and other services are not started.

Let's revert it and figure out a way to run cql_query_test with the
needed services started properly and then bring the "storage_service:
Remove ad-hoc token_metadata creation" change back.
2015-06-05 08:21:59 +03:00
Asias He
77e8f361bb storage_service: Reduce time for non-seed node to join the ring
Waiting for 30 seconds is way too long for testing. Reduce it to 5
seconds.

When we have a proper config system, we can specify it on the command line.
2015-06-04 17:16:50 +08:00
Asias He
a19d2171eb storage_service: Remove ad-hoc token_metadata creation
Use token_metadata from storage_service when creating a
replication_strategy in keyspace::create_replication_strategy.
2015-06-04 17:16:50 +08:00
Asias He
f1ed0cdc7e storage_service: Start on all cpus and replicate _token_metadata
_token_metadata is needed by replication strategy code on all cpus.
Changes to _token_metadata are done on cpu 0. Replicate it to all cpus.

We may copy only if _token_metadata actually changes. As a starter, we
always copy in gossip modification callbacks.
2015-06-04 17:16:50 +08:00
Asias He
cae9d65e9d storage_service: Move more code to source file 2015-06-04 17:12:10 +08:00
Asias He
4311662828 storage_service: Implement update_peer_info 2015-06-04 17:12:10 +08:00
Asias He
a85cee6afe storage_service: Rename isSurveyMode to _is_survey_mode 2015-06-04 17:12:10 +08:00
Asias He
db527c1a81 storage_service: Move joinRing to source file 2015-06-04 17:12:10 +08:00
Asias He
4dc4e54e50 storage_service: Add is_joined 2015-06-04 17:12:10 +08:00
Asias He
ca2e151c03 storage_service: Rename initialized to _initialized 2015-06-04 17:12:10 +08:00
Asias He
e5c653939b storage_service: Add is_bootstrap_mode and finish_bootstrapping 2015-06-04 17:12:10 +08:00
Asias He
c87f950aff storage_service: Implement handle_state_normal
Start two nodes; after bootstrap, the UUIDs and tokens are spread correctly
through gossip:

----------- endpoint_state_map:  -----------
ep=127.0.0.1, eps=EndpointState: HeartBeatState = generation =
1433172216, version = 66, AppStateMap =  { 0 : Value(NORMAL,TOKENS,11) }
{ 5 : Value(urchin_1_0,4) }  { 8 : Value(,3) }  { 11 : Value(ms_1_0,1) }
{ 12 : Value(06eb49d2-a092-483a-a89a-f774cff2c3e5,2) }  { 13 :
Value(0b20137e213f697b;c39a029ad9dd2948;0003be0eeb569d5a,9) }

ep=127.0.0.2, eps=EndpointState: HeartBeatState = generation =
1433172229, version = 56, AppStateMap =  { 0 : Value(NORMAL,TOKENS,51) }
{ 5 : Value(urchin_1_0,4) }  { 8 : Value(,3) }  { 11 : Value(ms_1_0,1) }
{ 12 : Value(adc8eb9f-7c1f-4695-905c-c1c4fdeea4d8,2) }  { 13 :
Value(6f5607a9b4cbadf0;eb7d976656cafad1;a225d312b9f42e5b,50) }

----------- token_metadata:  -----------
Endpoint -> Token
inet_address=127.0.0.2, token=a2 25 d3 12 b9 f4 2e 5b
inet_address=127.0.0.1, token=c3 9a 02 9a d9 dd 29 48
inet_address=127.0.0.2, token=eb 7d 97 66 56 ca fa d1
inet_address=127.0.0.1, token=00 03 be 0e eb 56 9d 5a
inet_address=127.0.0.1, token=0b 20 13 7e 21 3f 69 7b
inet_address=127.0.0.2, token=6f 56 07 a9 b4 cb ad f0
Endpoint -> UUID
inet_address=127.0.0.1, uuid=06eb49d2-a092-483a-a89a-f774cff2c3e5
inet_address=127.0.0.2, uuid=adc8eb9f-7c1f-4695-905c-c1c4fdeea4d8
2015-06-04 17:12:10 +08:00
Asias He
1ed5d01cd2 storage_service: Fix STATUS in set_tokens
Here, we should set STATUS to NORMAL.
2015-06-04 17:12:10 +08:00
Asias He
68f671a8b7 storage_service: Move gossip callback to source file 2015-06-04 17:12:09 +08:00
Asias He
6917a904c3 storage_service: Implement handle_state_bootstrap 2015-06-04 17:12:09 +08:00
Asias He
9dc7a60b4a storage_service: Move handle_state_bootstrap and friends to source file 2015-06-04 17:12:09 +08:00
Asias He
9c5cd2bca8 storage_service: Switch to use unordered_set for tokens
We do not care about the order of the tokens.

Also, in token_metadata, we use unordered_set for tokens as well, e.g.
update_normal_tokens. Unify the usage.
2015-06-04 17:12:09 +08:00