162 Commits

Author SHA1 Message Date
Avi Kivity
d77e044cde db: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().
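
The difference between the two styles can be sketched without Seastar or fmt. The helper below, `brace_format`, is a hypothetical stand-in written only for illustration (it is not the Seastar or fmt API): a "{}" placeholder carries no type information, so passing a number where a string might be expected is not an error.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical stand-in for a "{}"-style formatter; NOT the Seastar or fmt API.
// Base case: no arguments left, copy the rest of the format string verbatim.
inline void brace_format_to(std::ostringstream& out, std::string_view fmt) {
    out << fmt;
}

// Recursive case: substitute the first "{}" with `value`, then continue.
template <typename T, typename... Rest>
void brace_format_to(std::ostringstream& out, std::string_view fmt,
                     const T& value, const Rest&... rest) {
    auto pos = fmt.find("{}");
    if (pos == std::string_view::npos) {
        out << fmt;  // more arguments than placeholders: extras are ignored
        return;
    }
    out << fmt.substr(0, pos) << value;
    brace_format_to(out, fmt.substr(pos + 2), rest...);
}

template <typename... Args>
std::string brace_format(std::string_view fmt, const Args&... args) {
    std::ostringstream out;
    brace_format_to(out, fmt, args...);
    return out.str();
}
```

For example, `brace_format("{} tables in {}", 5, "ks1")` accepts the integer and the string through the same placeholder.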

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Duarte Nunes
e46ef6723b Merge seastar upstream
* seastar d152f2d...c1e0e5d (6):
  > scripts: perftune.py: properly merge parameters from the command line and the configuration file
  > fmt: update to 5.2.1
  > io_queue: only increment statistics when request is admitted
  > Adds `read_first_line.cc` and `read_first_line.hh` to CMake.
  > fstream: remove default extent allocation hint
  > core/semaphore: Change the access of semaphore_units main ctor

Due to a compile-time fight between fmt and boost::multiprecision, a
lexical_cast was added to mediate.

sprint("%s", var) no longer accepts numeric values, so some sprint()s were
converted to format() calls. Since more may be lurking we'll need to remove
all sprint() calls.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-10-25 12:53:30 +03:00
Duarte Nunes
40a30d4129 db/schema_tables: Diff tables using ID instead of name
Currently we diff schemas based on table/view name, and if the names
match, then we detect altered schemas by comparing the schema
mutations. This fails to detect transitions which involve dropping and
recreating a schema with the same name, if a node receives these
notifications simultaneously (for example, if the node was temporarily
down or partitioned).

Note that because the ID is persisted and created when executing a
create_table_statement, then even if a schema is re-created with the
exact same structure as before, we will still consider it altered
because the mutations will differ.

This also stops schema pulling from working, since it relies on schema
merging.

The solution is to diff schemas using their ID, and not their name.
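
A minimal sketch of the idea, assuming simplified stand-in types (`table_info`, `schema_diff` and `diff_by_id` are hypothetical names, not the ones used in db/schema_tables): when the before and after states are keyed by ID, a table that was dropped and recreated under the same name shows up as one drop plus one create instead of an alter.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a table's schema; not Scylla's types.
struct table_info {
    std::string id;       // unique ID assigned when the table is created
    std::string name;
    std::string columns;  // stand-in for the schema mutations
};

struct schema_diff {
    std::vector<std::string> dropped;  // IDs present only in the old state
    std::vector<std::string> created;  // IDs present only in the new state
    std::vector<std::string> altered;  // same ID, different contents
};

using tables_by_id = std::map<std::string, table_info>;

schema_diff diff_by_id(const tables_by_id& before, const tables_by_id& after) {
    schema_diff d;
    for (const auto& entry : before) {
        auto it = after.find(entry.first);
        if (it == after.end()) {
            d.dropped.push_back(entry.first);
        } else if (it->second.columns != entry.second.columns ||
                   it->second.name != entry.second.name) {
            d.altered.push_back(entry.first);
        }
    }
    for (const auto& entry : after) {
        if (before.find(entry.first) == before.end()) {
            d.created.push_back(entry.first);
        }
    }
    return d;
}
```

With a table "users" under ID "id-1" before and a table "users" under ID "id-2" after, this reports "id-1" as dropped and "id-2" as created, which a name-keyed diff would have missed or reported as an alter.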

Keyspaces and user types are also susceptible to this, but in their
case it's fine: these are values with no identity, and are just
metadata. Dropping and recreating a keyspace can be viewed as dropping
all tables from the keyspace, altering it, and eventually adding new
tables to the keyspace.

Note that this solution doesn't apply to tables dropped and created
with the same ID (using the `WITH ID = {}` syntax). For that, we would
need to detect deltas instead of applying changes and then reading the
new state to find differences. However, this solution is enough,
because tables are usually created with ID = {} for very specific,
peculiar reasons. The original motivation meant for the new table to
be treated exactly as the old, so the current behavior is in fact the
desired one.

Tests: unit(release), dtests(schema_test, schema_management_test)

Fixes #3797

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20181001230932.47153-2-duarte@scylladb.com>
2018-10-02 20:15:46 +02:00
Duarte Nunes
e404f09a23 db/schema_tables: Drop tables before creating new ones
Doing it in the inverse order doesn't support dropping and creating a
schema with the same name.
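
Why the order matters can be sketched with a registry keyed by table name. All names here (`table_ref`, `tables_by_name`, `drop_then_create`, `create_then_drop`) are hypothetical and only illustrate the ordering problem; they are not the actual merge code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: a table reference and a live registry (name -> ID).
struct table_ref {
    std::string id;
    std::string name;
};
using tables_by_name = std::map<std::string, std::string>;

// Drops first, then creates: a same-named replacement survives.
tables_by_name drop_then_create(tables_by_name live,
                                const std::vector<table_ref>& dropped,
                                const std::vector<table_ref>& created) {
    for (const auto& t : dropped) live.erase(t.name);
    for (const auto& t : created) live[t.name] = t.id;
    return live;
}

// Inverse order: the drop removes the table that was just created.
tables_by_name create_then_drop(tables_by_name live,
                                const std::vector<table_ref>& dropped,
                                const std::vector<table_ref>& created) {
    for (const auto& t : created) live[t.name] = t.id;
    for (const auto& t : dropped) live.erase(t.name);
    return live;
}
```

When "users" is dropped and recreated in one batch, only `drop_then_create` leaves the new table registered.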

Refs #3797

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20181001230932.47153-1-duarte@scylladb.com>
2018-10-02 20:15:32 +02:00
Nadav Har'El
36a657fc10 schema: persist "view virtual" columns to a separate system table
In the previous patch, we added a "view virtual" flag on columns. In this
patch we add persistence to this flag: i.e., writing it to the on-disk
schema table and reading it back on startup. But the implementation is
not as simple as adding a flag:

In the on-disk system tables, we have a "columns" table listing all the
columns in the database and their types. Cqlsh's "DESCRIBE MATERIALIZED
VIEW" works by reading this "columns" table, and listing all of the
requested view's columns. Therefore, we cannot add "virtual columns" -
which are columns not added by the user and not intended to be seen -
to this list.

We therefore need to create in this patch a separate list for virtual
columns, in a new table "view_virtual_columns". This table is essentially
identical to the existing "columns" table, just separate. We need to write
each column to the appropriate table (columns with the view_virtual flag to
"view_virtual_columns", columns without it to the old "columns"), read
from both on startup, and remember to delete columns from both when a table
is dropped.
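
The write and load paths described above can be sketched as follows. The types and function names (`column_info`, `column_rows`, `split_for_write`, `merge_on_load`) are assumptions made for this illustration, not the identifiers used in the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one column definition.
struct column_info {
    std::string name;
    std::string type;
    bool view_virtual = false;
};

// Rows destined for the two on-disk tables.
struct column_rows {
    std::vector<column_info> columns;               // visible "columns" table
    std::vector<column_info> view_virtual_columns;  // separate, hidden table
};

// Write path: route each column to the appropriate table.
column_rows split_for_write(const std::vector<column_info>& all) {
    column_rows rows;
    for (const auto& c : all) {
        (c.view_virtual ? rows.view_virtual_columns : rows.columns).push_back(c);
    }
    return rows;
}

// Load path: read from both tables and restore the flag from the source table.
std::vector<column_info> merge_on_load(const column_rows& rows) {
    std::vector<column_info> all;
    for (auto c : rows.columns) {
        c.view_virtual = false;
        all.push_back(c);
    }
    for (auto c : rows.view_virtual_columns) {
        c.view_virtual = true;
        all.push_back(c);
    }
    return all;
}
```

Anything that lists only `columns` never sees the virtual entries, while the loader still recovers the full set.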

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-08-16 15:30:06 +03:00
Paweł Dziepak
0ea6d14cf5 atomic_cell: explicitly state when atomic_cell is a collection member
Collections are not going to be fully converted to the IMR just yet and
still use the old serialisation format. This means that they still don't
support fragmented values very well. This patch passes the information
when an atomic_cell is created as a member of a collection so that later
we can avoid fragmenting the value in such cases.
2018-05-31 15:51:11 +01:00
Paweł Dziepak
e9d6fc48ac treewide: require type for creating atomic_cell
2018-05-31 15:51:11 +01:00
Paweł Dziepak
93130e80fb atomic_cell: require column_definition for creating atomic_cell views
2018-05-31 15:51:11 +01:00
Tomasz Grabiec
b1465291cf db: schema_tables: Treat drop of scylla_tables.version as an alter
After upgrade from 1.7 to 2.0, nodes will record a per-table schema
version which matches that on 1.7 to support the rolling upgrade. Any
later schema change (after the upgrade is done) will drop this record
from affected tables so that the per-table schema version is
recalculated. If nodes perform a schema pull (they detect schema
mismatch), then the merge will affect all tables and will wipe the
per-table schema version record from all tables, even if their schema
did not change. If then only some nodes get restarted, the restarted
nodes will load tables with the new (recalculated) per-table schema
version, while not restarted nodes will still use the 1.7 per-table
schema version. Until all nodes are restarted, writes or reads between
nodes from different groups will involve a needless exchange of schema
definition.

This will manifest in logs with repeated messages indicating schema
merge with no effect, triggered by writes:

  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f
  database - Schema version changed to 85ab46cd-771d-36c9-bc37-db6d61bfa31f

The sync will be performed if the receiving shard forgets the foreign
version, which happens if it doesn't process any request referencing
it for more than 1 second.

This may impact latency of writes and reads.

The fix is to treat schema changes which drop the 1.7 per-table schema
version marker as an alter, which will switch in-memory data
structures to use the new per-table schema version immediately,
without the need for a restart.

Fixes #3394

Tests:
    - dtest: schema_test.py, schema_management_test.py
    - reproduced and validated the fix with run_upgrade_tests.sh from git@github.com:tgrabiec/scylla-dtest.git
    - unit (release)

Message-Id: <1524764211-12868-1-git-send-email-tgrabiec@scylladb.com>
2018-04-27 17:12:33 +03:00
Calle Wilund
97f9f572f8 schema_tables: Load/save extensions table
Parses the extension map in tables/views using the registered extension.
If a schema row contains an unknown extension, we just preserve the data
in a placeholder.
2018-02-07 10:11:46 +00:00
Calle Wilund
2b56bbfa7d schema_tables: Require context object in schema load path
Requires "workaround" fix for schema_registry and frozen_mutation, since
the former is a free-float thread local, and the latter is a pure data
carrier. frozen_schema can take a parameter for unfreeze, but schema
registry requires being told which the system extensions are.
2018-02-07 10:11:46 +00:00
Calle Wilund
c2b49ec2e2 schema_tables: Add opaque context object
To allow carrying extensions and potentially more
2018-02-07 10:11:46 +00:00
Duarte Nunes
1e3fae5bef db/schema_tables: Only drop UDTs after merging tables
Dropping a user type requires that all tables using that type also be
dropped. However, a type may appear to be dropped at the same time as
a table, for instance due to the order in which a node receives schema
notifications, or when dropping a keyspace.

When dropping a table, if we build a schema in a shard through a
global_schema_pointer, then we'll check for the existence of any user
type the schema employs. We thus need to ensure types are only dropped
after tables, similarly to how it's done for keyspaces.

Fixes #3068

Tests: unit-tests (release)

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180129114137.85149-1-duarte@scylladb.com>
2018-01-30 12:07:04 +01:00
José Guilherme Vanz
380bc0aa0d Swap arguments order of mutation constructor
Swap the arguments in the mutation constructor so that their order follows
the same convention as the other constructor variants. Refs #3084

Signed-off-by: José Guilherme Vanz <guilherme.sft@gmail.com>
Message-Id: <20180120000154.3823-1-guilherme.sft@gmail.com>
2018-01-21 12:58:42 +02:00
Glauber Costa
08a0c3714c allow request-specific read timeouts in storage proxy reads
Timeouts are a global property. However, for tables in keyspaces like
the system keyspace, we don't want to uphold that timeout--in fact, we
want no timeout there at all.

We already apply such configuration for requests waiting in the queued
sstable queue: system keyspace requests won't be removed. However, the
storage proxy will insert its own timeouts in those requests, causing
them to fail.

This patch changes the storage proxy read layer so that the timeout is
applied based on the column family configuration, which is in turn
inherited from the keyspace configuration. This matches our usual
way of passing db parameters down.

In terms of implementation, we can either move the timeout inside the
abstract read executor or keep it external. The former is a bit cleaner,
but the latter has the nice property that all executors generated will
share the exact same timeout point. In this patch, we chose the latter.

We are also careful to propagate the timeout information to the replica.
So even if we are talking about the local replica, when we add the
request to the concurrency queue, we will do it in accordance with the
timeout specified by the storage proxy layer.

After this patch, Scylla is able to start just fine with very low
timeouts--since read timeouts in the system keyspace are now ignored.

Fixes #2462

Implementation notes, and general comments about open discussion in 2462:

* Because we are not bypassing the timeout, just setting it high enough,
  I consider the concerns about the batchlog moot: if we fail for any
  other reason that will be propagated. Last case, because the timeout
  is per-CF, we could do what we do for the dirty memory manager and
  move the batchlog alone to use a different timeout setting.

* Storage proxy likes specifying its timeouts as a time_point, whereas
  when we get low enough as to deal with the read_concurrency_config,
  we are talking about deltas. So at some point we need to convert time_points
  to durations. We do that in the database query functions.
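
The time_point to duration conversion mentioned in the last note can be sketched with std::chrono. The function name `remaining_time` and the use of `time_point::max()` to mean "no timeout" are assumptions made for this illustration, not taken from the patch.

```cpp
#include <cassert>
#include <chrono>

using clock_type = std::chrono::steady_clock;

// Convert an absolute deadline into the time left relative to `now`.
// Assumption: time_point::max() stands for "no timeout" and maps to the
// largest representable duration; expired deadlines clamp to zero.
clock_type::duration remaining_time(clock_type::time_point deadline,
                                    clock_type::time_point now) {
    if (deadline == clock_type::time_point::max()) {
        return clock_type::duration::max();
    }
    return deadline > now ? deadline - now : clock_type::duration::zero();
}
```

Passing `now` explicitly keeps the conversion deterministic, so every caller that shares one deadline computes its delta against a known reference point.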

v2:
- use per-request instead of per-table timeouts.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2018-01-12 07:43:21 -05:00
Paweł Dziepak
4dfddc97c7 db/schema_tables: do not use moved from shared pointer
Shared pointer view is captured by two continuations, one of which is
moving it away. Using do_with() solves the problem.

Fixes #3092.
Message-Id: <20171221111614.16208-1-pdziepak@scylladb.com>
2017-12-21 15:13:25 +01:00
Pekka Enberg
0c192c835c cql3: Fix 'DROP INDEX' to also drop index view
This patch fixes 'DROP INDEX' CQL statement to also drop the underlying
index view automatically so that we don't leave unused materialized
views behind.
Message-Id: <1510303421-15945-1-git-send-email-penberg@scylladb.com>
2017-11-10 10:52:08 +01:00
Duarte Nunes
baeec0935f Replace query::full_slice with schema::full_slice()
query::full_slice doesn't select any regular or static columns, which
is at odds with the expectations of its users. This patch replaces it
with the schema::full_slice() version.

Refs #2885

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1507732800-9448-2-git-send-email-duarte@scylladb.com>
2017-10-17 11:25:53 +02:00
Duarte Nunes
a011eb72c2 Merge branch 'CQL secondary index backing views' from Pekka
"This patch series adds backing materialized view for secondary indices.
When a new index is created with the 'CREATE INDEX' statement, a backing
materialized view is created automatically.

For example, assuming the following table:

  CREATE TABLE ks1.users (
    userid uuid,
    email text,
    PRIMARY KEY (userid)
  );

When the following index is created:

  CREATE INDEX user_email ON ks1.users (email);

The following materialized view is also created:

  cqlsh> DESCRIBE ks1.users;

  <snip>

  CREATE MATERIALIZED VIEW ks1.user_email_index AS
      SELECT email, userid
      FROM ks1.users
      WHERE email IS NOT NULL
      PRIMARY KEY (email, userid)
      WITH CLUSTERING ORDER BY (userid ASC)
      AND bloom_filter_fp_chance = 0.01
      AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
      AND comment = ''
      AND compaction = {'class': 'SizeTieredCompactionStrategy'}
      AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
      AND crc_check_chance = 1.0
      AND dclocal_read_repair_chance = 0.1
      AND default_time_to_live = 0
      AND gc_grace_seconds = 864000
      AND max_index_interval = 2048
      AND memtable_flush_period_in_ms = 0
      AND min_index_interval = 128
      AND read_repair_chance = 0.0
      AND speculative_retry = '99.0PERCENTILE';

CQL queries will use the backing materialized view as part of queries on
indexed columns to fetch the primary keys."

* 'penberg/cql-2i-backing-view/v3' of github.com:scylladb/seastar-dev:
  schema_tables: Create backing view for indices
  database: Kill obsolete secondary index manager stub
  cql3: Wire up secondary index manager
  cql3/restrictions: Add term_slice::is_supported_by() function
  index: Add secondary_index_manager::create_view_for_index()
  index: Add target_parser::parse() helper
  cql3/statements: Add index_target::from_sstring() helper
  index: Add secondary_index_manager::get_dependent_indices()
  index: Add secondary_index_manager::reload()
  index: Add secondary_index_manager::list_indexes()
  index: Add index class
  index: Pass column_family to secondary_index_manager constructor
  database: Make secondary index manager per-column family
2017-10-05 12:08:14 +01:00
Pekka Enberg
4045e1ec09 schema_tables: Create backing view for indices
This patch wires calls to secondary index manager reload() in
merge_tables_and_views() and changes make_update_indices_mutations() to
also create mutations for the backing materialized view. After this
patch, "CREATE INDEX" CQL statement also creates a materialized view.
2017-10-05 10:07:44 +03:00
Tomasz Grabiec
571cac95ed schema_tables: Make make_scylla_tables_mutation() visible
For tests.
2017-09-14 20:26:31 +02:00
Tomasz Grabiec
f943d2efbf schema_tables: Don't alter tables which differ only in version
We apply deletion of scylla_tables.version to the incoming schema
mutations so that table schema version is recalculated after merge.
The mutations which we read from local schema tables may not have it
deleted in which case all tables would be considered as differing on
the presence of the version field. Avoid this by deleting the field
from old mutations as well.
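
A minimal sketch of the comparison, assuming a schema row is modeled as a plain map of field names to values (`schema_row` and `differs_ignoring_version` are hypothetical names): the "version" field is removed from both sides before the rows are compared.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for one table's schema row.
using schema_row = std::map<std::string, std::string>;

// Compare two rows while ignoring the "version" field on both sides.
// Arguments are taken by value so the callers' rows are left untouched.
bool differs_ignoring_version(schema_row old_row, schema_row new_row) {
    old_row.erase("version");
    new_row.erase("version");
    return old_row != new_row;
}
```

Rows that differ only in whether "version" is present compare as equal, so the table is not treated as altered.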
2017-09-14 20:26:31 +02:00
Tomasz Grabiec
99272087e6 schema_mutations: Use mutation_opt instead of stdx::optional<mutation>
2017-09-14 20:26:31 +02:00
Botond Dénes
a980ff6463 Use abort() instead of assert + throw in unreachable code
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <393c3730111dfe090c44d8fc2e31602956a7d008.1504022425.git.bdenes@scylladb.com>
2017-09-03 11:07:27 +03:00
Botond Dénes
d1209c548a Fix -Wreturn-type warnings
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <99f7a006daaa78eb87720ac51c394093398bc868.1504013915.git.bdenes@scylladb.com>
2017-08-29 16:41:09 +03:00
Duarte Nunes
50ad0003c6 db/schema_tables: Drop dropped columns when dropping tables
Fixes #2633

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170726150228.2593-2-duarte@scylladb.com>
2017-07-26 18:41:28 +02:00
Duarte Nunes
3425403126 db/schema_tables: Store column_name in text form
As does Cassandra.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170726150228.2593-1-duarte@scylladb.com>
2017-07-26 18:41:12 +02:00
Duarte Nunes
33e18a1779 db/schema_tables: Consider differing dropped columns
If a node is notified of a schema change where the schema's dropped
columns have changed, that node will miss the changes to the dropped
columns. A scenario where this can happen is where a column c is
dropped, then added with a different type, and then dropped again, with
a node n having seen the first drop and being notified of the
subsequent add and drop.

Fixes #2616

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170725170622.4380-1-duarte@scylladb.com>
2017-07-26 11:59:34 +02:00
Duarte Nunes
937fe80a1a Merge 'Fix possible inconsistency of table schema version' from Tomasz
"Fixes issues uncovered in longevity test (#2608).

Main problem is that due to time drift scylla_tables.version column
may not get deleted on all nodes doing the schema merge, which will
make some nodes come up with different table schema version than others.

The inconsistency will not heal because scylla_tables doesn't
take part in the schema sync. This is fixed by the last patch.

This will cause nodes to constantly try to sync the schema, which under
some conditions triggers #2617."

* tag 'tgrabiec/fix-table-schema-version-inconsistency-v1' of github.com:scylladb/seastar-dev:
  schema_tables: Add scylla_tables to ALL
  schema: Make schema_mutations equality consistent with digest
  schema_tables: Extract compact_for_schema_digest()
  schema_tables: Always drop scylla_tables::version
2017-07-21 16:55:23 +02:00
Duarte Nunes
7eecda3a61 schema: Support compaction enabled attribute
Fixes #2547

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170721132206.3037-1-duarte@scylladb.com>
2017-07-21 15:38:45 +02:00
Tomasz Grabiec
ed2388da2c schema_tables: Add scylla_tables to ALL
So that scylla_tables takes part in the digest and in mutations sent
as part of schema sync. Otherwise inconsistencies in scylla_tables
will not heal.

Refs #2608.
2017-07-20 15:47:10 +02:00
Tomasz Grabiec
6adbe61e2f schema_tables: Extract compact_for_schema_digest()
2017-07-20 15:47:10 +02:00
Tomasz Grabiec
1b85c316bf schema_tables: Always drop scylla_tables::version
It can happen that due to time drift between nodes, the incoming
"version" cell will have higher timestamp than api::new_timestamp().
In such case the column would not be dropped and would cause version
mismatch between nodes.

Ensure it's always covered by using max of current time and cell's
timestamp.
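
The timestamp choice can be sketched as below. `timestamp_type`, `deletion_timestamp` and `covers` are illustrative names, and the rule that a deletion covers a cell whose timestamp is less than or equal to its own is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using timestamp_type = std::int64_t;

// Pick a deletion timestamp that is never older than the incoming cell,
// even if the sender's clock is ahead of the local one.
timestamp_type deletion_timestamp(timestamp_type local_now, timestamp_type cell_ts) {
    return std::max(local_now, cell_ts);
}

// Assumption for this sketch: a deletion covers a cell when its timestamp
// is greater than or equal to the cell's timestamp.
bool covers(timestamp_type deletion_ts, timestamp_type cell_ts) {
    return deletion_ts >= cell_ts;
}
```

With a local clock at 100 and an incoming cell stamped 150, using the local time alone would leave the cell alive, whereas the maximum of the two covers it.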

Refs #2608.
2017-07-20 15:47:10 +02:00
Calle Wilund
7a583585a2 system_keyspace: Make sure "system" is written to keyspaces (visible)
Fixes #2514

Bug in schema version 3 update: We failed to write "system" to the
schema tables. Only visible on an empty instance of course.

Message-Id: <1500469809-23546-2-git-send-email-calle@scylladb.com>
2017-07-19 16:18:56 +03:00
Calle Wilund
247c36e048 system_schema: Fix remaining places not handling two system keyspaces
Some places remained where code looked directly at
system_keyspace::NAME to determine iff a ks is
considered special/system/protected. Including
schema digest calculation.

Export "is_system_keyspace" and use accordingly.
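
A sketch of what such a predicate might look like. The function name comes from the message above, but the body and the two keyspace names ("system" and "system_schema") are assumptions made for illustration.

```cpp
#include <cassert>
#include <string_view>

// Sketch only: treats both system keyspaces as special, instead of comparing
// against a single name. The two names are assumed for this illustration.
bool is_system_keyspace(std::string_view name) {
    return name == "system" || name == "system_schema";
}
```

Callers that previously compared against a single keyspace name would call the predicate instead, so both keyspaces are handled the same way.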

Message-Id: <1500469809-23546-1-git-send-email-calle@scylladb.com>
2017-07-19 16:18:45 +03:00
Tomasz Grabiec
dc2dc056a4 schema: Use v3 column layout when converting to/from schema mutations
2017-07-19 09:52:15 +02:00
Tomasz Grabiec
60a76efd37 schema_tables: Store column_name in text form
That's how it is stored by Cassandra.

Refs #2597.
2017-07-17 09:40:06 +02:00
Tomasz Grabiec
5b69d99bf8 schema_tables: Persist table_schema_version
When migrating schema tables from v2 to v3, mutations underlying
table schema will change, and so will their digest. However, we want
the digest to be the same on new nodes as on the old nodes, because
schema exchange is not possible between the two nodes, so they
must not request schema definitions from each other.

The solution is to make the digest persistable, so that it sticks to
given table schema, surviving both migration and node restarts. On
migration from v2, the digest will be calculated from v2 mutations, so
it will be the same on new and old nodes.
2017-07-11 14:52:23 +02:00
Tomasz Grabiec
cdf5b67522 schema_tables: Introduce system_schema.scylla_tables
It will be used to store Scylla-specific table metadata. We cannot
store it in the standard "tables" table for compatibility reasons -
Cassandra will fail to read schema if it encounters columns it is not
expecting.
2017-07-11 14:52:23 +02:00
Tomasz Grabiec
cdcdf4772f schema_tables: Simplify read_table_mutations()
2017-07-11 14:52:23 +02:00
Tomasz Grabiec
6e62bc77f1 schema_tables: Resurrect v2 read_table_mutations()
2017-07-11 14:52:23 +02:00
Tomasz Grabiec
18a9e1762c service: Advertise schema tables format version through gossip
Will be needed to inhibit schema exchange on per-peer basis.
2017-07-07 19:07:59 +02:00
Duarte Nunes
4ef25e8e38 db/schema_tables: Add note to make_update_view_mutations
Document that a new view schema passed to make_update_view_mutations()
might be based on base schema that hasn't yet been loaded.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170618200558.96036-1-duarte@scylladb.com>
2017-06-23 15:24:35 +02:00
Avi Kivity
f0b20be14d Revert "system_keyspace: Make sure "system" is written to keyspaces (visible)"
This reverts commit 89ef69c4b3. Prevents nodes
from joining the cluster.
2017-06-21 16:58:04 +03:00
Calle Wilund
89ef69c4b3 system_keyspace: Make sure "system" is written to keyspaces (visible)
Fixes #2514
Bug in schema version 3 update: We failed to write "system" to the
schema tables. Only visible on an empty instance of course.
Message-Id: <1497966982-10044-1-git-send-email-calle@scylladb.com>
2017-06-20 20:59:47 +02:00
Duarte Nunes
b2c5aca4cf db/schema_tables: View mutations shouldn't always include base ones
When making the schema mutations for a view update, we should only
include the base table schema mutations (in case the target node
doesn't contain them) when the view is being directly updated. When it
is being updated as a side effect of updating the base table, then
including the base schema mutations will hide the actual changes being
performed on the base.

Fixes #2500

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1497782822-2711-1-git-send-email-duarte@scylladb.com>
2017-06-18 16:29:59 +03:00
Gleb Natapov
69c5526301 messaging_service: return cache hit ratio as part of data read
2017-06-13 09:57:14 +03:00
Avi Kivity
ebaeefa02b Merge seastar upstream (seastar namespace)
 - introduced "seastarx.hh" header, which does a "using namespace seastar";
 - 'net' namespace conflicts with seastar::net, renamed to 'netw'.
 - 'transport' namespace conflicts with seastar::transport, renamed to
   cql_transport.
 - "logger" global variables now conflict with logger global type, renamed
   to xlogger.
 - other minor changes
2017-05-21 12:26:15 +03:00
Calle Wilund
29b20d410a schema_tables: Remove "class" attribute from strategy options
Not 100% proper, but in line with how we still store the info.
Ensures (helps at least) to keep schema loaded from tables
and schema from builder comparable.

Fixes schema_changes_test error.

Message-Id: <1495030581-2138-2-git-send-email-calle@scylladb.com>
2017-05-17 17:56:11 +03:00
Calle Wilund
6c8b5fc09d schema_tables: Use v3 schema tables and formats
Switches system/schema_* for system_schema/*, updates schema/schema
builder and uses to hold/expect v3 style info (i.e. types & dropped).
2017-05-10 16:44:48 +00:00