Commit Graph

71 Commits

Author SHA1 Message Date
Duarte Nunes
93981aaa93 schema_builder: Ensure dense tables have compact col
This patch ensures that when the schema is dense, regardless of
compact_storage being set, the single regular column is translated
into a compact column.

This fixes an issue where Thrift dynamic column families are
translated to a dense schema with a regular column, instead of a
compact one.

Since a compact column is also a regular column (e.g., for purposes of
querying), no further changes are required.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470062410-1414-1-git-send-email-duarte@scylladb.com>
(cherry picked from commit 5995aebf39)

Fixes #1535.
2016-08-03 13:49:51 +02:00
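The translation described in the commit above can be pictured with a minimal sketch. The types and the function name `translate_dense_regular_column` are hypothetical stand-ins, not the actual schema_builder code. The sketch assumes that "dense with a single regular column" is the only condition that matters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real schema types.
enum class column_kind {
    partition_key, clustering_key, static_column, regular_column, compact_column
};

struct column {
    std::string name;
    column_kind kind;
};

// If the table is dense and has exactly one regular column, mark that
// column as compact. Returns true when a column was translated.
inline bool translate_dense_regular_column(std::vector<column>& cols, bool is_dense) {
    if (!is_dense) {
        return false;
    }
    column* only_regular = nullptr;
    for (auto& c : cols) {
        if (c.kind == column_kind::regular_column) {
            if (only_regular) {
                return false; // more than one regular column: leave untouched
            }
            only_regular = &c;
        }
    }
    if (!only_regular) {
        return false;
    }
    only_regular->kind = column_kind::compact_column;
    return true;
}
```

Note that the decision depends only on the dense flag, not on whether compact_storage was requested, which is the point the commit message makes.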
Duarte Nunes
89b40f54db schema: Dense schemas are correctly upgraded
When upgrading a dense schema, we would drop the cells of the regular
(compact) column. This patch fixes this by making the regular and
compact column kinds compatible.

Fixes #1536

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1470172097-7719-1-git-send-email-duarte@scylladb.com>
2016-08-03 13:37:57 +02:00
Duarte Nunes
a647fea30b schema: Add is_dynamic to thrift_schema
This patch adds the is_dynamic() function to thrift_schema, which
tells whether the underlying column family is dynamic or not,
according to thrift rules.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-14 15:36:23 +02:00
Paweł Dziepak
c8e75d2e84 schema: cache is_atomic() in column_definition
is_atomic() is called for each cell in mutation applies, compaction
and query. Since the value doesn't change it can be easily cached which
would save one indirection and virtual call.

Results of perf_simple_query -c1 (median, duration 60):
         before      after
read   54611.49   55396.01   +1.44%
write  65378.92   68554.25   +4.86%

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1465991045-11140-1-git-send-email-pdziepak@scylladb.com>
2016-06-15 19:18:13 +03:00
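The caching idea in the is_atomic() commit above can be sketched as follows. The class names (`abstract_type`, `int_type`, `list_type`, `column_definition_sketch`) are simplified, hypothetical stand-ins and not the real Scylla definitions. The sketch assumes the column's type is fixed at construction.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type hierarchy with a virtual is_atomic().
struct abstract_type {
    virtual ~abstract_type() = default;
    virtual bool is_atomic() const = 0;
};
struct int_type : abstract_type {
    bool is_atomic() const override { return true; }
};
struct list_type : abstract_type {
    bool is_atomic() const override { return false; }
};

class column_definition_sketch {
    std::shared_ptr<const abstract_type> _type;
    bool _is_atomic; // computed once; the type never changes afterwards
public:
    explicit column_definition_sketch(std::shared_ptr<const abstract_type> t)
        : _type(std::move(t))
        , _is_atomic(_type->is_atomic()) // single virtual call, at construction
    {}

    // Hot path: a plain member read, no pointer chase and no virtual dispatch.
    bool is_atomic() const { return _is_atomic; }
};
```

The members are declared in the order they are initialized, so `_type` is valid by the time `_is_atomic` is computed.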
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Paweł Dziepak
7dda3977c6 column_mapping: drop old-style serializers
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-02-19 23:11:59 +00:00
Paweł Dziepak
c55fa9e4c2 schema: make column_mapping serializer-friendly
- unnested column_mapping::column
- more accessors

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-02-19 23:11:16 +00:00
Paweł Dziepak
17ca7e06f3 schema: print collection info
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-19 09:39:12 +01:00
Paweł Dziepak
2e2de35dfb schema: add _raw._collections check to operator==()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-19 09:39:08 +01:00
Paweł Dziepak
92dc95b73b schema: fix comparator parsing
The correct format of collection information in comparator is:

o.a.c.db.m.ColumnToCollection(<name1>:<type1>, <name2>:<type2>, ...)

not:

o.a.c.db.m.ColumnToCollection(<name1>:<type1>),
o.a.c.db.m.ColumnToCollection(<name2>:<type2>) ...

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-19 09:39:05 +01:00
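The corrected comparator layout from the commit above, with one wrapper around all collections, can be sketched like this. The function name `collections_comparator` is hypothetical, the wrapper class name is passed in rather than spelled out, and names and types are treated as plain strings. The exact separator whitespace is an assumption.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Builds "<wrapper>(<name1>:<type1>,<name2>:<type2>,...)": a single wrapper
// that lists every collection, instead of one wrapper per collection.
inline std::string collections_comparator(
        const std::string& wrapper,
        const std::vector<std::pair<std::string, std::string>>& collections) {
    std::string out = wrapper + "(";
    bool first = true;
    for (const auto& c : collections) {
        if (!first) {
            out += ",";
        }
        first = false;
        out += c.first + ":" + c.second;
    }
    out += ")";
    return out;
}
```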
Paweł Dziepak
4927ff95da schema: read collections from comparator
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-18 08:35:33 +01:00
Paweł Dziepak
6372a22064 schema: use _raw._collections to generate comparator name
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-18 08:35:03 +01:00
Paweł Dziepak
84840c1c98 schema: keep track of removed collections
Cassandra disallows adding a column with the same name as a collection
that existed in the past in that table if the types aren't compatible.
To enforce that Scylla needs to keep track of all collections that ever
existed in the column family.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-18 08:34:29 +01:00
Tomasz Grabiec
facc549510 schema: Introduce equal_columns() 2016-01-11 10:34:55 +01:00
Paweł Dziepak
b5bee9c36a schema_builder: force column id recomputation in build()
If the schema_builder is constructed from an existing schema we need to
make sure that the original column ids of regular and static columns are
*not* used since they may become invalid if columns are added or
removed.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-11 10:34:54 +01:00
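A minimal sketch of why the ids from the commit above must be recomputed. It assumes ids are assigned in name order, as the "column id order matches name order" commit in this log guarantees. `column_sketch` and `recompute_ids` are hypothetical names, not the builder's real interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct column_sketch {
    std::string name;
    int id; // may be stale if copied from an older schema
};

// Discard whatever ids the columns carried and assign fresh ones in name order.
inline void recompute_ids(std::vector<column_sketch>& cols) {
    std::sort(cols.begin(), cols.end(),
              [](const column_sketch& a, const column_sketch& b) {
                  return a.name < b.name;
              });
    for (std::size_t i = 0; i < cols.size(); ++i) {
        cols[i].id = static_cast<int>(i);
    }
}
```

Adding a column named "a" to a schema that already has "b" (id 0) and "c" (id 1) shifts both existing ids by one, so reusing the original ids would point at the wrong columns.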
Paweł Dziepak
da0f999123 schema_builder: add with_altered_column_type()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-11 10:34:54 +01:00
Paweł Dziepak
9807ddd158 schema_builder: add with_column_rename()
Columns that are part of the primary key can be renamed.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-11 10:34:54 +01:00
Paweł Dziepak
3cbfa0e52f schema: add column_definition::_dropped_at
When a column is dropped its name and deletion timestamp are added
to schema::_raw._dropped_columns to prevent data resurrection in case a
column with the same name is added. To reduce the number of lookups in
_dropped_columns this patch makes each instance of column_definition
cache this information (i.e. timestamp of the latest removal of a
column with the same name).

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-11 10:34:53 +01:00
Paweł Dziepak
42dc4ce715 schema: keep track of dropped columns
Knowing which columns were dropped (and when) is important to prevent
the data from the dropped ones reappearing if a new column is added with
the same name.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-11 10:34:53 +01:00
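The resurrection check described in the two commits above can be sketched as a map from column name to drop timestamp. `dropped_columns_sketch` and its methods are hypothetical. The rule that a cell survives only if it is strictly newer than the drop is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using timestamp_type = std::int64_t;

struct dropped_columns_sketch {
    // Column name -> timestamp of the latest drop of a column with that name.
    std::map<std::string, timestamp_type> dropped;

    void record_drop(const std::string& name, timestamp_type ts) {
        auto it = dropped.find(name);
        if (it == dropped.end() || it->second < ts) {
            dropped[name] = ts; // keep only the most recent drop
        }
    }

    // A cell is visible only if it was written after the column was last dropped.
    bool is_live(const std::string& name, timestamp_type cell_ts) const {
        auto it = dropped.find(name);
        return it == dropped.end() || cell_ts > it->second;
    }
};
```

Caching the drop timestamp in each column definition, as the `_dropped_at` commit does, replaces the map lookup in `is_live` with a single comparison.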
Tomasz Grabiec
fb5658ede1 schema_registry: Track synced state of schema
We need to track which schema versions have been synced on the current
node to avoid triggering the sync on every mutation. We need to sync before
mutating to be able to apply the incoming mutation using the current
node's schema, possibly applying irreversible transformations to it to
make it conform.
2016-01-11 10:34:52 +01:00
Tomasz Grabiec
f25487bc1e Introduce schema_registry 2016-01-11 10:34:51 +01:00
Tomasz Grabiec
b17cbc23ab schema: Introduce column_mapping
Encapsulates information needed to convert mutation representations
between schema versions.
2016-01-08 21:10:26 +01:00
Tomasz Grabiec
f3556ebfc2 schema: Introduce column_count_type
Right now in some places we use column_id, and in some places
size_t. Solve it by using column_count_type whose meaning is "an
integer sufficiently large for indexing columns". Note that we cannot
use column_id because it has more meaning to it than that.
2016-01-08 21:10:26 +01:00
Tomasz Grabiec
f58c2dec1e schema: Make schema objects versioned
The version needs to change value not only on structural changes but
also on temporal ones. This is needed for nodes to detect if the version they
see was already synchronized with or not even if it has the same
structure as the past versions. We also need to end up with the same
version on all nodes when schema changes are commuted.

For regular mutable schemas version will be calculated from underlying
mutations when schema is announced. For static schemas of system
keyspace it is calculated by hashing scylla version and column id,
because we don't have mutations at the time of building the schema.
2016-01-08 21:10:26 +01:00
Tomasz Grabiec
13295563e0 schema_builder: Move compact_storage setting outside build()
Properties of the schema are set using methods of schema_builder and
different variants of build() are for different forms of the final
schema object.
2016-01-08 21:10:26 +01:00
Tomasz Grabiec
75caba5b8a schema: Guarantee that column id order matches name order
For static and regular (row) columns it is very convenient in some
cases to utilize the fact that columns ordered by ids are also ordered
by name. It currently holds, so make schema export this guarantee and
enable consumers to rely on it.

The static schema::row_column_ids_are_ordered_by_name field is about
allowing code external to schema to make it very explicit (via
static_assert) that it relies on this guarantee, and be easily
discoverable in case we would have to relax this.
2016-01-08 21:10:25 +01:00
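A sketch of how external code might make its reliance on the guarantee above explicit. Only the field name `row_column_ids_are_ordered_by_name` comes from the commit message. `schema_sketch` and `find_column_id` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct schema_sketch {
    // Exported guarantee: ids of row columns follow name order.
    static constexpr bool row_column_ids_are_ordered_by_name = true;
    std::vector<std::string> regular_columns; // index == column id
};

// Binary search by name is only valid because ids follow name order.
inline int find_column_id(const schema_sketch& s, const std::string& name) {
    static_assert(schema_sketch::row_column_ids_are_ordered_by_name,
                  "lookup relies on column ids being ordered by name");
    auto it = std::lower_bound(s.regular_columns.begin(),
                               s.regular_columns.end(), name);
    if (it == s.regular_columns.end() || *it != name) {
        return -1; // no such column
    }
    return static_cast<int>(it - s.regular_columns.begin());
}
```

If the guarantee were ever relaxed, flipping the constant to false would turn every such consumer into a compile error, which makes them easy to find.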
Paweł Dziepak
a5a744655e schema: do not add frozen collections to compound name
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-05 10:49:32 +01:00
Paweł Dziepak
ed7d9d4996 schema: change has_collections() to has_multi_column_collections()
All users of schema::has_collections() aren't really interested in
frozen ones.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-01-05 10:46:42 +01:00
Tomasz Grabiec
e2037ebc62 schema: Fix operator==() to include missing fields 2015-12-16 18:06:55 +01:00
Paweł Dziepak
8fd4b9f911 schema: remove _clustering_key_prefix_type
All clustering keys are now prefixable.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-11 10:47:24 +01:00
Tomasz Grabiec
f3f2bf0b44 schema: Move definitions to source file 2015-11-12 13:50:01 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Tomasz Grabiec
ba35788817 mutation_partition: De-templetize methods
Instead of accepting a column resolver callable, accept a schema and
column_kind or column_selector. Makes the interface easier to use and
enables us to move implementation to .cc file.
2015-09-06 21:25:44 +02:00
Pekka Enberg
7c9eeb519a schema: Add operator<< for 'schema'
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-31 13:35:26 +03:00
Pekka Enberg
ae9e3e049c schema: Improve column_definition operator<< output
Make operator<< for column_definition print more information.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-31 13:35:26 +03:00
Pekka Enberg
61d7e8de1c schema: Add to_string() for column_kind and index_type enums
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-31 13:35:26 +03:00
Glauber Costa
4cfe7de292 schema: correctly compound collections
We are currently using ColumnToCollectionType wrongly: we wrap every
collection in that string. But that is not how Origin operates: a single
ColumnToCollectionType hosts all collections a schema has.

Funny enough, sstable2json seems to work all right without any comparator - and
that is how it worked before, but when a comparator is present, it expects it to
abide by what Origin expects. That causes us to crash.

Fixes #148

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-18 10:40:04 +03:00
Glauber Costa
d1f897b63b cell_name: always include default comparator for version 2.1.8 and lower
This is the biggest change from 2.2: for the 2.1 series, the default type is
always stored in the comparator for compound types.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
f33f432474 schema: set is_all_components for compact columns
They should be set. As a result, those columns will have the index "null"
at the schema_columns table.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
21ebaeffae schema_builder: provide a build function that doesn't take compact storage.
We will invoke the schema builder from schema_tables.cc, and at that point, the
information about compact storage no longer exists anywhere. If we just call it
like this, it will be the same as calling it with compact_storage::no, which
will trigger a (wrong) recomputation for compact_storage::yes CFs.

The best way to solve that is to make the compact_storage parameter mandatory
every time we create a new table - instead of defaulting to no. This will
ensure that the correct dense and compound calculation are always done when
calling the builder with a parameter, and not done at all when we call it
without a parameter.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:54 -05:00
Glauber Costa
6d7a3d2f0a schema: always rebuild the schema
If we alter the compound property, we also have to rebuild the schema,
since some aspects of the columns depend on it. Let's just go ahead and
always rebuild the schema.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Glauber Costa
68bdb1d0c1 schema: set thrift properties in schema, not schema builder
We will use those properties during initialization - for instance, to calculate
thrift_bits.is_on_all_components. In order to do that, it has to be available at
schema creation, and not through the schema builder.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 09:30:53 -05:00
Avi Kivity
be32746c58 Merge "Handle Compact Storage" from Glauber
"This is my current proposal for Compact Storage tables - plus
the needed infrastructure.

Getting rid of the CellName abstraction allows us to simplify
things by quite a lot: now all we need is to mark whether or
not a table is composite, and provide functions to play the
role of the comparator when dealing with the strings."
2015-07-23 16:20:31 +03:00
Glauber Costa
4e83530c3f do not "throw new"
This is how Java does it. But in C++, "throw new", although valid, would require
the catcher to catch a pointer to the exception - which isn't really what we
do.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-23 07:07:17 +03:00
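The difference the commit above refers to can be shown in a small, self-contained sketch. The function names are illustrative only.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Idiomatic C++: throw by value, catch by (const) reference.
inline void fail_by_value() {
    throw std::runtime_error("bad schema");
}

inline std::string catch_by_reference() {
    try {
        fail_by_value();
    } catch (const std::exception& e) {
        return e.what();
    }
    return "";
}

// "throw new" throws a pointer: a handler for std::exception& does not match,
// and whoever does catch the pointer is responsible for deleting it.
inline bool reference_handler_catches_thrown_pointer() {
    try {
        try {
            throw new std::runtime_error("leaky");
        } catch (const std::exception&) {
            return true; // never reached: the thrown object is a pointer
        }
    } catch (std::runtime_error* p) {
        delete p; // manual cleanup, easy to forget
        return false;
    }
    return false;
}
```

The second function returns false: the inner reference handler is skipped, and only the outer pointer handler sees the exception.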
Glauber Costa
5cc955d69c comparator: functions to manipulate a compound type
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-22 23:10:21 -04:00
Glauber Costa
ddf6a2d8d5 schema: add a new column_kind
Origin has another column_kind, that we lack: compact_value. This kind is
used to identify regular columns of dense tables.

Take for instance, the following table:

CREATE TABLE ks2.compact (
    ks text,
    cl1 text,
    cl2 text,
    PRIMARY KEY (ks, cl1)
) WITH COMPACT STORAGE

cqlsh> select keyspace_name, columnfamily_name, column_name, type from system.schema_columns \
       where keyspace_name='ks2' and columnfamily_name='compact';

 keyspace_name | columnfamily_name | column_name | type
---------------+-------------------+-------------+----------------
           ks2 |           compact |         cl1 | clustering_key
           ks2 |           compact |         cl2 |  compact_value
           ks2 |           compact |          ks |  partition_key

We will treat those columns as regular columns for most purposes. Because of
that, we don't need to separate them from the regular columns when we sort
initially, for instance. All we have to do is change its type.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-22 23:10:21 -04:00
Glauber Costa
66a10c3b38 schema: add empty column for dense tables that do not have a regular column
This is how it happens for Origin. Take for instance the following CF:

CREATE TABLE ks2.noregular_cs2 (
    ks text,
    cl1 text,
    cl2 text,
    PRIMARY KEY (ks, cl1, cl2)
) WITH COMPACT STORAGE;

cqlsh> select keyspace_name, columnfamily_name, column_name from system.schema_columns \
       where keyspace_name='ks2' and columnfamily_name='noregular_cs2';

 keyspace_name | columnfamily_name | column_name
---------------+-------------------+-------------
           ks2 |     noregular_cs2 |                <===== added this.
           ks2 |     noregular_cs2 |         cl1
           ks2 |     noregular_cs2 |         cl2
           ks2 |     noregular_cs2 |          ks

In order to achieve that, we need to relax the test in db/legacy_schema_tables.cc.
It will throw in case it finds an empty name.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-22 23:10:21 -04:00
Glauber Costa
10436c29c1 thrift: implement compound comparator test
Now that we do that for the main schema, we can just copy the result for
thrift.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-22 23:10:20 -04:00
Glauber Costa
aa4b1dcc58 schema: add a field for compound
We are deviating a bit from Origin here: In Origin, we would store a full
comparator class. However, due to the fact that our types are very different,
and as a consequence we will not call a serializer directly on the cell name,
that is not necessary.

The only information that we will need to store is whether or not the table is
compound. Some functions to manipulate it will be presented in the next patch.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-22 23:09:05 -04:00
Glauber Costa
cb94f3f27e schema_builder: calculate is_dense from the schema builder
We currently have code to calculate "is_dense" in the create statement handler.
That obviously doesn't work for the system schemas, which are not defined this
way.

Since all of our schemas now have to pass through the schema_builder one way or
another, that is the best place in which to do that calculation.

Note that unfortunately, that does not mean we can just get rid of
set_is_dense() in the schema builder: we still need to set it in some
situations, where for instance, we read that property in schema_columnfamilies,
and then apply to the relevant CF. Those uses are, however, all internal to
legacy_schema_tables.cc.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-22 23:09:05 -04:00