Commit Graph

116 Commits

Author SHA1 Message Date
Asias He
abafec99a5 system_keyspace: Implement increment_and_get_generation 2016-02-29 16:31:42 +08:00
Tomasz Grabiec
6709c0ac15 cql_serialization_format: Make it CQL protocol version aware
We want to serialize it as a single number, the CQL binary protocol
version to which it corresponds, so it needs to be aware of the
version number.
2016-02-15 17:05:55 +01:00
Calle Wilund
ce66acc771 system_keyspace: Always retain highest truncation time stamp
Since the table is written from all shards, and we might
have conflicting time stamps, we define the truncated_at time
as the highest time point, i.e. the conservative choice.
2016-02-09 15:45:37 +00:00
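The "highest time stamp wins" rule from ce66acc771 can be sketched as follows. This is a minimal illustration, not the code from the commit: `merge_truncation_times` and the `time_point` alias are hypothetical names, and the per-shard values are assumed to be already collected into a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

using time_point = std::chrono::system_clock::time_point;

// Hypothetical helper: fold the time stamps reported by the individual
// shards into a single truncated_at value by keeping the latest one.
// An empty input yields time_point::min(), i.e. "never truncated".
inline time_point merge_truncation_times(const std::vector<time_point>& per_shard) {
    time_point highest = time_point::min();
    for (const auto& t : per_shard) {
        highest = std::max(highest, t);
    }
    return highest;
}
```

Taking the maximum is the conservative choice because data older than the latest recorded truncation is treated as truncated on every shard.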
Calle Wilund
1c213e1f38 system_keyspace: Use IDL types + better verification of truncation record
Truncation records are not portable between us and Origin.
We need to detect Origin records from a migrated system and make sure
we neither try to use them nor, more to the point, crash because of a
data format error when loading them.

This problem was seen by Tzach when doing a migration from an origin
setup.

Updated record storage to use IDL-serialized types + added versioning
and magic marking + odd-size-checking to ensure we load only correct
data. The code will also deal with records from an older version of
scylla.
2016-02-09 15:45:37 +00:00
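The magic marking, versioning and size checking described in 1c213e1f38 might look roughly like the sketch below. The layout, the constants and the names `parse_record` and `make_record` are assumptions made for illustration; the real record uses IDL-serialized types, which this sketch does not attempt to reproduce. Values are copied in native byte order to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical layout, for illustration only:
//   [magic: 4 bytes][version: 4 bytes][payload: N entries of 8 bytes]
constexpr std::uint32_t record_magic = 0x54524e43;  // arbitrary marker value
constexpr std::uint32_t record_version = 1;
constexpr std::size_t header_size = 8;
constexpr std::size_t entry_size = 8;

// Build a record blob from a list of 64-bit entries.
std::vector<std::uint8_t> make_record(const std::vector<std::uint64_t>& entries) {
    std::vector<std::uint8_t> blob(header_size + entries.size() * entry_size);
    std::memcpy(blob.data(), &record_magic, 4);
    std::memcpy(blob.data() + 4, &record_version, 4);
    if (!entries.empty()) {
        std::memcpy(blob.data() + header_size, entries.data(),
                    entries.size() * entry_size);
    }
    return blob;
}

// Return the entries, or nullopt if the blob is not a record we understand.
std::optional<std::vector<std::uint64_t>> parse_record(const std::vector<std::uint8_t>& blob) {
    // Size check: a header must be present and the payload must be a
    // whole number of entries.
    if (blob.size() < header_size || (blob.size() - header_size) % entry_size != 0) {
        return std::nullopt;
    }
    std::uint32_t magic = 0;
    std::uint32_t version = 0;
    std::memcpy(&magic, blob.data(), 4);
    std::memcpy(&version, blob.data() + 4, 4);
    // Magic and version check: reject foreign or unknown formats.
    if (magic != record_magic || version != record_version) {
        return std::nullopt;
    }
    std::vector<std::uint64_t> entries((blob.size() - header_size) / entry_size);
    if (!entries.empty()) {
        std::memcpy(entries.data(), blob.data() + header_size,
                    entries.size() * entry_size);
    }
    return entries;
}
```

Returning an empty optional instead of throwing lets the caller skip a foreign record and carry on loading, which matches the stated goal of not crashing on data from a migrated system.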
Tomasz Grabiec
4e5a52d6fa db: Make read interface schema version aware
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.

Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.

Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.

Schema requesting across nodes is currently stubbed (throws runtime
exception).
2016-01-11 10:34:52 +01:00
Tomasz Grabiec
04eb58159a query: Add schema_version field to read_command 2016-01-11 10:34:51 +01:00
Tomasz Grabiec
f58c2dec1e schema: Make schema objects versioned
The version needs to change value not only on structural changes but
also temporal. This is needed for nodes to detect if the version they
see was already synchronized with or not even if it has the same
structure as the past versions. We also need to end up with the same
version on all nodes when schema changes are commuted.

For regular mutable schemas version will be calculated from underlying
mutations when schema is announced. For static schemas of system
keyspace it is calculated by hashing scylla version and column id,
because we don't have mutations at the time of building the schema.
2016-01-08 21:10:26 +01:00
Asias He
4952042fbf tests: Fix cql_test_env.cc
Current service initialization is a total mess in cql_test_env. Start
the services in the same order as in main.cc.

Fixes #715, #716

'./test.py --mode release' passes.
2016-01-01 10:15:17 +08:00
Raphael S. Carvalho
433ed60ca3 db: add method to get compaction history
This method is intended to return content of the system table
COMPACTION_HISTORY as a vector of compaction_history_entry.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2015-12-14 14:19:04 -02:00
Raphael S. Carvalho
f3beacac28 db: add method to update the system table COMPACTION_HISTORY
It's supposed to be called at the end of compaction.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2015-12-14 13:47:10 -02:00
Asias He
e79c85964f system_keyspace: Flush system.peers in remove_endpoint
1) Start node 1, node 2, node 3
2) Stop  node 3
3) Start node 4 to replace node 3
4) Kill  node 4 (removal of node 3 in system.peers is not flushed to disk)
5) Start node 4 (will load node 3's token and host_id info in bootup)

This makes

   "Token .* changing ownership from 127.0.0.3 to 127.0.0.4"

messages appear again in step 5), which is unexpected and fails the dtest

   FAIL: replace_first_boot_test (replace_address_test.TestReplaceAddress)
   ----------------------------------------------------------------------
   Traceback (most recent call last):
     File "scylla-dtest/replace_address_test.py",
   line 220, in replace_first_boot_test
       self.assertEqual(len(movedTokensList), numNodes)
   AssertionError: 512 != 256
2015-12-09 12:30:52 +08:00
Asias He
ccbd801f40 storage_service: Fix decommissioned nodes are willing to rejoin the cluster if restarted
Backport: CASSANDRA-8801

a53a6ce Decommissioned nodes will not rejoin the cluster.

Tested with:
topology_test.py:TestTopology.decommissioned_node_cant_rejoin_test
2015-12-09 10:43:51 +08:00
Avi Kivity
47499dcf18 data_value: make conversion from bytes explicit
Since bytes is a very generic value that is returned from many calls,
it is easy to pass it by mistake to a function expecting a data_value,
and to get a wrong result.  It is impossible for the data_value constructor
to know if the argument is a genuine bytes variable, a data_value of another
type, but serialized, or some other serialized data type.

To prevent misuse, make the data_value(bytes) constructor
(and complementary data_value(optional<bytes>)) explicit.
2015-11-13 17:12:29 +02:00
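The effect of the explicit constructor in 47499dcf18 can be shown with a reduced sketch. The `bytes` alias and the one-member `data_value` class here are simplified stand-ins, not the project's real types, and `serialized_size` is a hypothetical consumer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

using bytes = std::string;  // stand-in for the project's bytes type

// Simplified stand-in for data_value: it only wraps the raw bytes.
class data_value {
    bytes _raw;
public:
    // explicit: a bytes object no longer converts silently.
    explicit data_value(bytes b) : _raw(std::move(b)) {}
    const bytes& raw() const { return _raw; }
};

// Hypothetical function that expects a data_value.
std::size_t serialized_size(const data_value& v) { return v.raw().size(); }

// Passing bytes where a data_value is expected is now a compile error,
// while spelling out the conversion still works.
static_assert(!std::is_convertible_v<bytes, data_value>,
              "implicit conversion from bytes must be rejected");
static_assert(std::is_constructible_v<data_value, bytes>,
              "explicit construction from bytes must still work");
```

With this in place, `serialized_size(some_bytes)` fails to compile, and the caller has to write `serialized_size(data_value(some_bytes))`, which makes the intent visible at the call site.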
Tomasz Grabiec
3c4c83c66f cql_test_env: Initialize system keyspace 2015-11-09 08:42:53 +08:00
Avi Kivity
2c3591cbd9 data_value de-any-fication
We use boost::any to convert to and from database values (stored in
serialized form) and native C++ values.  boost::any captures information
about the data type (how to copy/move/delete etc.) and stores it inside
the boost::any instance.  We later retrieve the real value using
boost::any_cast.

However, data_value (which has a boost::any member) already has type
information as a data_type instance.  By teaching data_type instances about
the corresponding native type, we can eliminate the use of boost::any.

While boost::any is evil and eliminating it improves efficiency somewhat,
the real goal is growing native type support in data_type.  We will use that
later to store native types in the cache, enabling O(log n) access to
collections, O(1) access to tuples, and more efficient large blob support.
2015-10-30 17:38:51 +01:00
Vlad Zolotarov
d8de1099eb message::messaging_service: introduce _preferred_ip_cache
This map will contain the (internal) IPs corresponding to specific Nodes.
The mapping is also stored in the system.peers table.

So, instead of always connecting to the external IP, messaging_service::get_rpc_client()
will query _preferred_ip_cache and connect to the external IP only if there is
no entry for the given Node.

We will call init_local_preferred_ip_cache() at the end of system table init.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Improved the _preferred_ip_cache description.
   - Code styling issues.

New in v3:
   - Make get_internal_ip() public.
   - get_rpc_client(): restore the get_preferred_ip() usage that was dropped
     in v2 by mistake during rebase.
2015-10-26 14:09:26 +02:00
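The lookup-with-fallback behaviour described in d8de1099eb can be sketched as below. The class name `preferred_ip_cache`, the `set` method and the use of strings for addresses are assumptions for illustration; the real code lives inside messaging_service and uses the project's own address types.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for _preferred_ip_cache: maps a node's external
// address to the internal address we would rather connect to.
class preferred_ip_cache {
    std::unordered_map<std::string, std::string> _map;
public:
    void set(const std::string& external_ip, const std::string& internal_ip) {
        _map[external_ip] = internal_ip;
    }

    // Return the cached internal address if there is one, otherwise fall
    // back to the external address that was passed in.
    std::string get_preferred_ip(const std::string& external_ip) const {
        auto it = _map.find(external_ip);
        return it == _map.end() ? external_ip : it->second;
    }
};
```

A caller in the position of get_rpc_client() would pass the node's external address through `get_preferred_ip()` before connecting, so nodes without a cache entry behave exactly as before.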
Vlad Zolotarov
fd811dd707 db::system_keyspace: added get_preferred_ips()
get_preferred_ips() returns all preferred_ip's stored in system.peers
table.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Get rid of extra std::move().
2015-10-26 14:09:26 +02:00
Vlad Zolotarov
f2e1be0fc1 db::system_keyspace::update_preferred_ip(): use net::ipv4_address as a preferred_ip value
Fixes issue #481

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-26 14:09:26 +02:00
Calle Wilund
6b0ab79ecb system_keyspace: Keep per-shard truncation records
Fixes  #423
* CF ID now maps to a truncation record comprised of a set of 
  per-shard RP:s and a high-mark timestamp
* Retrieving RP:s is done in "bulk"
* Truncation time is calculated as max of all shards.

This version of the patch will accept "old" truncation data, though the 
result of applying it will most likely not be correct (just one shard)

Record is still kept as a blob, "new" format is indicated by 
record size.
2015-10-07 08:59:52 +02:00
Calle Wilund
b3c95ce42d system_keyspace: Change truncation record method to use context qp
Align with rest of file (for better or worse). This allows calls from
entity without query_processor handy (i.e. storage_proxy).

Added "minimal" setup method for the "global" state, to facilitate
tests. Doing a full setup either in cql_test_env or after it is created
breaks badly. (Not sure why). So quick workaround.

Updated the current two users (batchlog_manager and commitlog_replayer)
callsites to conform.
2015-09-30 09:09:41 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Gleb Natapov
df468504b6 schema_table: convert code to use distributed<storage_proxy> instead of storage_proxy&
All database code was converted to it when storage_proxy was made
distributed, but then new code was written to use storage_proxy& again.
Passing distributed<> object is safer since it can be passed between
shards safely. There was a patch to fix one such case yesterday, I found
one more while converting.
2015-09-09 10:19:30 +03:00
Glauber Costa
28f315fad4 system_keyspace: keep msg alive when needed
Fixes #266

Some callsites are fine: if we just get the message and process it, as is the
case with check_health for instance, msg will be alive and all is good. But if
we return a future inside the processing, msg must be kept alive. Classic bug,
appearing again.

Pekka saw this in practice in another bug. We haven't seen anything that is
related to this, but it is certainly wrong.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-09-03 09:11:07 +03:00
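The lifetime problem described in 28f315fad4 can be illustrated without Seastar. In this sketch a queued `std::function` stands in for a continuation that runs after the caller has returned, and `result_message`, `pending` and `process_later` are hypothetical names. The point is only that the deferred work captures its own reference to the message.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Stand-in for the query result message.
struct result_message {
    std::string payload;
};

// Stand-in for continuations that run some time after the caller returned.
std::vector<std::function<std::string()>> pending;

// Capturing the shared_ptr by value keeps the message alive until the
// deferred work has run, even if the caller drops its own reference first.
// Capturing a raw pointer or reference here would leave it dangling.
void process_later(std::shared_ptr<result_message> msg) {
    pending.push_back([msg] { return msg->payload; });
}
```

If the message is processed synchronously, no capture is needed, which is why some call sites were already fine.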
Pekka Enberg
ce39f9d57a db/system_keyspace: Fix use-after-free in build_dc_rack_info()
Fixes #264.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-09-02 16:37:34 +03:00
Asias He
80c996a315 db/system_keyspace: Fix get_local_host_id
Before:
host_id in system.local is empty

After:
host_id in system.local is inserted correctly

This fixes a nasty problem that we always get a new host_id when
booting up a node with data.
2015-08-27 11:01:07 +03:00
Glauber Costa
391eea564e system_tables: implement load_host_id
A simple translation from the original code.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-25 19:16:30 -05:00
Glauber Costa
0fd2861293 system_tables: implement load_tokens
A simple translation from the original code

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-25 19:16:30 -05:00
Asias He
22ee468428 db/system_keyspace: Fix set_bootstrap_state
We set status to COMPLETED in join_token_ring

   set_bootstrap_state(db::system_keyspace::bootstrap_state::COMPLETED)

but

   cqlsh 127.0.0.$i -e "SELECT * from system.local;"

shows

    bootstrapped -> IN_PROGRESS

The static sstring state_name is the bad boy.
2015-08-24 18:54:42 +08:00
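The commit message for 22ee468428 does not spell out how the static misbehaved. One common way a function-local static produces exactly this symptom is that it is initialized only on the first call and then keeps that value. The sketch below shows that pattern under this assumption; the enum values and the function names are hypothetical and `std::string` stands in for sstring.

```cpp
#include <cassert>
#include <string>

enum class bootstrap_state { NEEDS_BOOTSTRAP, COMPLETED, IN_PROGRESS };

std::string to_name(bootstrap_state s) {
    switch (s) {
    case bootstrap_state::NEEDS_BOOTSTRAP: return "NEEDS_BOOTSTRAP";
    case bootstrap_state::COMPLETED:       return "COMPLETED";
    case bootstrap_state::IN_PROGRESS:     return "IN_PROGRESS";
    }
    return "UNKNOWN";
}

// Buggy variant: the static is initialized from the argument of the FIRST
// call only, so every later call returns that first name.
const std::string& state_name_buggy(bootstrap_state s) {
    static std::string name = to_name(s);
    return name;
}

// Fixed variant: compute the name on every call.
std::string state_name_fixed(bootstrap_state s) {
    return to_name(s);
}
```

If the first call happens with IN_PROGRESS, the buggy variant keeps reporting IN_PROGRESS after the state has moved on to COMPLETED, which matches what the cqlsh output in the commit message shows.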
Avi Kivity
8a4648761c tests: make test cql environment use volatile system keyspace
Prevents hangs due to the database not being able to persist a memtable.

Tested-by: Asias He <asias@cloudius-systems.com>
2015-08-24 13:50:22 +03:00
Asias He
67953a65b6 db/system_keyspace: Stub load_host_ids 2015-08-18 17:06:03 +08:00
Asias He
ab40ab6c19 db/system_keyspace: Stub load_tokens 2015-08-18 17:06:02 +08:00
Asias He
7f98a89968 db/system_keyspace: Introduce init_local_cache 2015-08-18 17:06:02 +08:00
Glauber Costa
0177c7fed1 system keyspace: implement get_bootstrap_state
To avoid spreading the futures all over, we will resort to a cache with this,
the same way we did for the dc/rack information.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
20590db87f system keyspace: implement set_bootstrap_state
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
8a50534119 system keyspace: implement get_saved_tokens
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
6a682d0e49 storage_service: futurize get_tokens
Because all its users are already futurized, this is actually an easy one.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:37 -07:00
Glauber Costa
bebb2abe4b system keyspace: factor out local_cache start code
It will now be used for other values as well.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:36 -07:00
Asias He
ce927105d8 db/system_keyspace: Implement update_local_tokens 2015-08-12 07:50:26 +08:00
Asias He
95dd307597 db/system_keyspace: Remove duplicated commented out code
I'm not sure what happened. We have the same commented code in both .hh
and .cc. It is very confusing when enabling some of the code. Let's
remove the duplicated code in .cc and leave it in .hh only.
2015-08-12 07:50:26 +08:00
Asias He
96fe749141 db/system_keyspace: Stub get_bootstrap_state and friends 2015-08-12 07:50:26 +08:00
Asias He
5cb5050ca1 system_keyspace: Stub get_saved_tokens 2015-08-12 07:50:26 +08:00
Glauber Costa
0237a73e05 system_keyspace: make collections multi-cell
They are multi-cell in Origin. This has nothing to do with 2.2 vs 2.1,
and it is just a plain bug.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
21ebaeffae schema_builder: provide a build function that doesn't take compact storage.
We will invoke the schema builder from schema_tables.cc, and at that point, the
information about compact storage no longer exists anywhere. If we just call it
like this, it will be the same as calling it with compact_storage::no, which
will trigger a (wrong) recomputation for compact_storage::yes CFs.

The best way to solve that is to make the compact_storage parameter mandatory
every time we create a new table - instead of defaulting to no. This will
ensure that the correct dense and compound calculation are always done when
calling the builder with a parameter, and not done at all when we call it
without a parameter.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
c19819290a system.size_estimates: define schema
This table exists in 2.1.8, and although it is dropped in 2.2, we
should at least list its schema.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:56 -05:00
Glauber Costa
28c0498bb6 system.local: add more fields
2.1.8 has 3 more fields in its system tables that 2.2 doesn't have.
Since we aim at 2.1 compatibility, we have to include them.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 08:31:56 -05:00
Pekka Enberg
99a80050e3 db: Rename legacy_schema_tables to schema_tables
There's nothing legacy about it so rename legacy_schema_tables to
schema_tables. The naming comes from a Cassandra 3.x development branch
which is not relevant for us in the near future.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-05 13:56:47 +03:00
Asias He
d6c31a2668 system_keyspace: Fix more execute_cql using inet_address
We should pass inet_address.addr().

With this, tokens in system.peers are updated correctly.

(1 rows)
cqlsh> SELECT tokens from system.peers;

 tokens
------------------------------------------------------------------------
 {'-5463187748725106974', '8051017138680641610', '8833112506891013468'}

(1 rows)
2015-08-05 15:58:55 +08:00
Asias He
1b7b199bdf system_keyspace: Fix remove_endpoint
I got this error if I pass inet_address to it.

boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_any_cast>
> (boost::bad_any_cast: failed conversion using boost::any_cast)
2015-08-05 15:29:32 +08:00
Shlomi Livne
199f4d2545 Add enable-in-memory-data-store,enable-commitlog,enable-cache config
Ability to enable/disable specific sub-modules. These settings do not
affect system tables, which are always persisted, cached and written to
the commitlog.

enable-in-memory-data-store marks if tables will be written/read to/from
disk
enable-commitlog marks if tables will be written to commitlog
enable-cache marks if tables will be written/read to/from cache

Please note in-memory-data-store does not change the read path so "old"
sstables are still read and cache may be used to cache their data

Signed-off-by: Shlomi Livne <shlomi@cloudius-systems.com>
2015-08-02 17:19:30 +03:00
Nadav Har'El
280c450892 Fix compilation
The at_exit() callback needs to return a future. In one place we forgot,
and now that at_exit() takes an std::function<>, this is verified at
compilation time and fails compilation.

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
2015-07-28 10:28:08 +02:00
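The compile-time check that 280c450892 refers to can be reproduced with the standard library. This sketch uses `std::future<void>` in place of Seastar's future, and `exit_callback`, `at_exit`, `make_ready_future`, `good_callback` and `bad_callback` are hypothetical names. The idea is that once the callback type is a `std::function` with a future return type, a callable that returns nothing no longer converts to it.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-in for the callback type accepted by at_exit().
using exit_callback = std::function<std::future<void>()>;

std::vector<exit_callback> exit_callbacks;

void at_exit(exit_callback cb) {
    exit_callbacks.push_back(std::move(cb));
}

// Helper producing an already-completed future.
std::future<void> make_ready_future() {
    std::promise<void> p;
    p.set_value();
    return p.get_future();
}

auto good_callback = [] { return make_ready_future(); };
auto bad_callback = [] { /* forgot to return a future */ };

// The good callback converts to exit_callback, the bad one does not, so
// at_exit(bad_callback) is rejected by the compiler.
static_assert(std::is_constructible_v<exit_callback, decltype(good_callback)>,
              "a callable returning a future is accepted");
static_assert(!std::is_constructible_v<exit_callback, decltype(bad_callback)>,
              "a callable returning void is rejected");
```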