Commit Graph

11025 Commits

Author SHA1 Message Date
Paweł Dziepak
6db262446f storage_proxy: don't stop after result with no live rows
mutation_result_merger merges results from different shards and stops as
soon as a shard returned a short read or memory usage on the merging
shard is too high. However, it should never stop unless at least one
live row is in the merged result.
2016-12-22 13:35:04 +01:00
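The stopping rule described in 6db262446f can be sketched as follows. This is only an illustrative model, not the actual mutation_result_merger code: the `ShardResult` type, its fields, and the `merge_until_useful` function are hypothetical names introduced for the example, and the memory accounting is simplified to a running sum.

```python
from dataclasses import dataclass


@dataclass
class ShardResult:
    # Hypothetical per-shard result; field names are assumptions.
    live_rows: int
    short_read: bool = False
    memory_used: int = 0


def merge_until_useful(results, memory_limit):
    """Merge per-shard results in order.

    Returns (total_live_rows, shards_consumed). A short read or exceeding
    the memory limit only ends the merge once at least one live row has
    been collected.
    """
    live = 0
    used = 0
    consumed = 0
    for r in results:
        live += r.live_rows
        used += r.memory_used
        consumed += 1
        wants_stop = r.short_read or used > memory_limit
        if wants_stop and live > 0:
            break
    return live, consumed
```

In this model, a short read from a shard that contributed no live rows does not end the merge; the loop continues until some later shard supplies a live row or the input is exhausted.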
Avi Kivity
74ecd7072a Merge "Reduce overhead of get_max_purgeable_timestamp() during compaction" from Tomasz
* 'tgrabiec/calculate-hash-once-compaction' of github.com:cloudius-systems/seastar-dev:
  sstables: Calculate key hash only once during compaction
  tests: sstables: Add more test cases to tombstone_purge_test
  db: Expose column_family::add_sstable
  tests: sstables: Ensure timestamps are increasing
  tests: sstables: Simplify tombstone_purge_test
2016-12-22 14:33:30 +02:00
Tomasz Grabiec
045b9fd7c1 sstables: Calculate key hash only once during compaction
Improves compaction performance.
2016-12-22 13:24:46 +01:00
Tomasz Grabiec
fb8765bef9 tests: sstables: Add more test cases to tombstone_purge_test 2016-12-22 13:24:46 +01:00
Tomasz Grabiec
c7ff2a2bb0 db: Expose column_family::add_sstable
Needed by compaction tests.
2016-12-22 13:24:46 +01:00
Tomasz Grabiec
d841cab02c tests: sstables: Ensure timestamps are increasing 2016-12-22 13:24:45 +01:00
Tomasz Grabiec
21ade8e4a4 tests: sstables: Simplify tombstone_purge_test
- moved to seastar thread
- extracted sstable creation and validation logic
- reduced code duplication
- switched to mutation_reader assertions
- used result of compact_sstable() to locate the new sstable
- rather than setting gc timestamp in the past, bump the clock
  before compacting
2016-12-22 13:24:41 +01:00
Tomasz Grabiec
bc6486b304 Use gc_clock instead of db_clock where possible
Some code paths were obtaining db_clock timestamp to only convert it
to gc_clock later. Avoid this. In the future we could make gc_clock
cheaper because it has low precision.

Message-Id: <1482401190-2035-1-git-send-email-tgrabiec@scylladb.com>
2016-12-22 13:27:55 +02:00
Raphael S. Carvalho
c26090a6b2 sstables/compress: fix error message for snappy uncompression
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <898ad07db705355bdbf780afdb3aa982b8ca3823.1482364125.git.raphaelsc@scylladb.com>
2016-12-22 09:08:34 +01:00
Raphael S. Carvalho
27fb8ec512 db: avoid excessive disk usage during sstable resharding
Shared sstables will now be resharded in the same order to guarantee
that all shards owning an sstable will agree on its deletion at nearly
the same time, thereby reducing the disk space requirement.
That's done by picking which column family to reshard in UUID order,
and each individual column family will reshard its shared sstables
in generation order.

Fixes #1952.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <87ff649ed24590c55c00cbb32bffd8fa2743e36e.1482342754.git.raphaelsc@scylladb.com>
2016-12-21 23:18:06 +02:00
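The ordering described in 27fb8ec512 can be illustrated with a short sketch. It assumes each shard holds a mapping from column family UUID to the generations of its shared sstables; the `resharding_order` function and that data layout are hypothetical, and Python's `uuid.UUID` comparison stands in for whatever UUID ordering the real code uses.

```python
import uuid


def resharding_order(tables):
    """Return (column_family_id, generation) pairs in a deterministic order.

    tables: mapping of uuid.UUID -> iterable of shared sstable generations.
    Column families are visited in UUID order, and each column family's
    shared sstables in generation order.
    """
    order = []
    for cf_id in sorted(tables):
        for generation in sorted(tables[cf_id]):
            order.append((cf_id, generation))
    return order
```

Because the result depends only on the contents of the mapping and not on insertion order, every shard that owns the same shared sstables derives the same sequence, so they reach each sstable at roughly the same point in their work.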
Tomasz Grabiec
d87d50dc64 db: Use microsecond precision for server-side timestamps
Currently server-side timestamps use a clock with millisecond
precision. Timestamps have microsecond resolution, with lower bits
used to serialize mutations originating from a given client.

Timestamps for column drops always use just the millisecond base. A
column drop which is executed after an insert may thus be given a lower
timestamp than the insert, even when the two are serialized on the
client side over the same connection.

Use microsecond precision to reduce the chance of that happening.

This is supposed to fix sporadic failures of
schema_test.py:TestSchema.drop_column_queries_test dtest.
Message-Id: <1482343119-27698-1-git-send-email-tgrabiec@scylladb.com>
2016-12-21 18:03:22 +00:00
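The ordering problem described in d87d50dc64 can be reproduced with plain arithmetic. The sketch below is a simplified model: the function names and the way the per-client sequence number is folded into the low bits are assumptions made for illustration, not the server's actual timestamp generator.

```python
def ms_base_timestamp(now_us):
    """Microsecond-resolution timestamp taken from a millisecond-precision clock."""
    return (now_us // 1000) * 1000


def client_ordered_timestamp(now_us, seq):
    """Millisecond base plus a small per-client sequence number in the low bits."""
    return ms_base_timestamp(now_us) + seq


# An insert and a later column drop that fall within the same millisecond.
insert_time_us = 1_000_000_250
drop_time_us = 1_000_000_700

insert_ts = client_ordered_timestamp(insert_time_us, seq=1)
drop_ts_ms_clock = ms_base_timestamp(drop_time_us)  # old behaviour
drop_ts_us_clock = drop_time_us                     # microsecond precision
```

With the millisecond base, `drop_ts_ms_clock` is lower than `insert_ts` even though the drop happened later; with the microsecond clock, `drop_ts_us_clock` is higher. The window is narrowed rather than closed, which matches the commit's wording about reducing the chance of the event.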
Avi Kivity
875635554d Merge "Reduce overhead of partition presence checker during cache update" from Tomasz
Refs #1943.

* 'tgrabiec/optimize-bloom-filter' of github.com:cloudius-systems/seastar-dev:
  db: Compute key hash once in partition_presence_checker
  bloom_filter: Allow checking presence using pre-hashed key
  db: Use incremental selector in partition_presence_checker
2016-12-21 14:24:54 +02:00
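The idea behind the 875635554d series (hash the key once, then probe several bloom filters with the pre-hashed value) can be sketched as below. This is a toy filter, not Scylla's bloom_filter: the class, the method names, the use of SHA-256, and the double-hashing scheme are all assumptions chosen to keep the example self-contained.

```python
import hashlib


def hash_key(key: bytes):
    """Hash the key once, yielding two 64-bit values for double hashing."""
    digest = hashlib.sha256(key).digest()
    return (int.from_bytes(digest[:8], "little"),
            int.from_bytes(digest[8:16], "little"))


class BloomFilter:
    def __init__(self, nbits=1024, nhashes=4):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray(nbits)  # one byte per bit, for simplicity

    def _positions(self, hashed):
        h1, h2 = hashed
        return [(h1 + i * h2) % self.nbits for i in range(self.nhashes)]

    def add_hashed(self, hashed):
        for pos in self._positions(hashed):
            self.bits[pos] = 1

    def may_contain_hashed(self, hashed):
        return all(self.bits[pos] for pos in self._positions(hashed))


def present_in_any(filters, key: bytes):
    """Probe every filter while computing the key hash only once."""
    hashed = hash_key(key)
    return any(f.may_contain_hashed(hashed) for f in filters)
```

The saving comes from `present_in_any`: the cost of hashing is paid once per key instead of once per filter, which matters when a key is checked against the filters of many sstables.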
Takuya ASADA
d356c21512 configure.py: don't allow running multiple 'ninja -C seastar' at the same time
Scylla's build.ninja allows multiple 'ninja -C seastar' invocations to run at
the same time, which breaks the DPDK build after upgrading to DPDK-16.10:
https://gist.github.com/syuu1228/4bd1170630b7e5f15653281b4728e521

To prevent this, we need to limit the number of concurrent seastar builds to one.

Note: this does not disable parallel builds within Seastar.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1482250560-20289-1-git-send-email-syuu@scylladb.com>
2016-12-21 12:42:52 +02:00
Vlad Zolotarov
62cad0f5f5 tracing: don't start tracing until a Tracing service is fully initialized
RPC messaging service is initialized before the Tracing service, so
we should prevent creation of tracing spans before the service is
fully initialized.

We will use an already existing "_down" state and extend it in a way
that !_down equals "started", where "started" is TRUE when the local
service is fully initialized.

We will also split the Tracing service initialization into two parts:
   1) Initialize the sharded object.
   2) Start the tracing service:
      - Create the I/O backend service.
      - Enable tracing.

Fixes issue #1939

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1481836429-28478-1-git-send-email-vladz@scylladb.com>
2016-12-21 12:40:14 +02:00
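The two-phase initialization described in 62cad0f5f5 can be modelled with a small sketch. The class and method names here are hypothetical and the I/O backend is reduced to a list; only the relationship between `_down` and "started" is taken from the commit message.

```python
class TracingService:
    """Toy model: spans are refused until the service has been started."""

    def __init__(self):
        # Phase 1: the object exists, but the service is still "down".
        self._down = True
        self._backend = None

    def start(self):
        # Phase 2: create the I/O backend, then enable tracing.
        self._backend = []
        self._down = False

    @property
    def started(self):
        # "started" is defined as the negation of "_down".
        return not self._down

    def create_span(self, name):
        if not self.started:
            return None  # e.g. an RPC arrived before tracing was ready
        self._backend.append(name)
        return name
```

A caller that runs before `start()`, such as a messaging service initialized earlier, simply gets no span instead of touching a backend that does not exist yet.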
Gleb Natapov
0a2dd39c75 messaging_service: move MUTATION_DONE messages to separate connection
If a node gets more MUTATION requests than it can handle via RPC it will
stop reading from this RPC connection, but this will prevent it from
getting MUTATION_DONE responses for requests it coordinates because
currently MUTATION and MUTATION_DONE messages share the same connection.

To solve this problem this patch moves MUTATION_DONE messages to a
separate connection.

Fixes: #1843

Message-Id: <20161201155942.GC11581@scylladb.com>
2016-12-21 11:10:15 +02:00
Piotr Jastrzebski
3e502de153 mutation_partition: don't use unique_ptr to manage LSA objects
unique_ptr won't destruct them correctly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <5b49bb25a962432a178fe75554dd010c3cdea41d.1482261888.git.piotr@scylladb.com>
2016-12-21 09:40:15 +01:00
Raphael S. Carvalho
e28537b56f sstables: fix calculation of memory footprint for summary
The size of keys wasn't taken into account, so the value reported
via collectd is much smaller than the actual footprint.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <3ca24612e4e84d1cbdea4f2d79e431a4f4479291.1482255327.git.raphaelsc@scylladb.com>
2016-12-20 18:28:47 +00:00
Paweł Dziepak
d0e61fd092 test.py: remove '.cc' from view_schema_test 2016-12-20 18:26:52 +00:00
Avi Kivity
3989e4ed15 Revert "config, dht: reduce default msb ignore bits to 4"
This reverts commit b81a57e8eb.

With exponential range scanning, we should now be able to survive
msb ignore bits of 12, which allows better sharding on large clusters.
2016-12-20 19:41:05 +02:00
Duarte Nunes
a9e5b7f124 view_info: Fix comparison
Two view_info objects are equal if their fields are equal, not
different.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1482253839-2736-1-git-send-email-duarte@scylladb.com>
2016-12-20 18:36:39 +01:00
Avi Kivity
a1cafed370 storage_proxy: handle range scans of sparsely populated tables
When murmur3_partitioner_ignore_msb_bits = 12 (which we'd like to be the
default), a scan range can be split into a large number of subranges, each
going to a separate shard.  With the current implementation, subranges were
queried sequentially, resulting in very long latency when the table was empty
or nearly empty.

Switch to an exponential retry mechanism, where the number of subranges
queried doubles each time, dropping the latency from O(number of subranges)
to O(log(number of subranges)).

If, during an iteration of a retry, we read at most one range
from each shard, then partial results are merged by concatenation.  This
optimizes for the dense(r) case, where few partial results are required.

If, during an iteration of a retry, we need more than one range per
shard, then we collapse all of a shard's ranges into just one range,
and merge partial results by sorting decorated keys.  This reduces
the number of sstable read creations we need to make, and optimizes for
the sparse table case, where we need many partial results, most of which
are empty.

We don't merge subranges that come from different partition ranges,
because those need to be sorted in request order, not decorated key order.

[tgrabiec: trivial conflicts]

Message-Id: <20161220170532.25173-1-avi@scylladb.com>
2016-12-20 18:32:29 +01:00
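The exponential retry from a1cafed370 can be sketched as a loop whose batch size doubles each round. This is a simplified model under stated assumptions: `scan_with_doubling` and the `query` callback are hypothetical names, and the sketch ignores the per-shard range collapsing and the sorted merge the commit also describes.

```python
def scan_with_doubling(subranges, query, limit):
    """Query subranges in batches whose size doubles every round.

    query(batch) is a caller-supplied function returning the rows found
    in that batch of subranges. Returns (rows, rounds).
    """
    rows = []
    pos = 0
    batch = 1
    rounds = 0
    while pos < len(subranges) and len(rows) < limit:
        chunk = subranges[pos:pos + batch]
        rows.extend(query(chunk))
        pos += len(chunk)
        batch *= 2
        rounds += 1
    return rows[:limit], rounds
```

For an empty table split into 4096 subranges, this loop finishes in 13 rounds (batches of 1, 2, 4, ... subranges) instead of 4096 sequential queries, which is the O(log(number of subranges)) behaviour the commit message refers to.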
Tomasz Grabiec
dc94bd0642 Merge branch 'materialized-views/cql/v4' from git@github.com:duarten/scylla.git
This patchset implements the multiple CQL3 statements relating to
materialized views, as well as ensuring other statements now take
materialized views into account. It also adds the necessary internal
data structures to hold materialized view metadata.
2016-12-20 14:21:18 +01:00
Duarte Nunes
8ac4d7b2e8 tests: Add view_schema_test
This patch adds a set of tests for materialized view schema
handling, complementing the dtests for the same feature.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
eb25a8f3cd cql_test_env: Add do_with_cql_env_thread function
This patch introduces the do_with_cql_env_thread() function, which
behaves like do_with_cql_env() except that it executes the
user-specified function in the context of a Seastar thread.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
124802e196 cql3: Add function to build view's select statement
This patch adds a utility function that creates a raw select
statement from a set of columns and a where clause. It is intended to
be used to create the prepared select statement used by the view
class.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
088dfdb108 select_statement: Consider materialized views
This patch considers materialized views in
select_statement::check_access().

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
5511dab914 cql3: Add drop view statement
This patch adds the drop_view_statement, which enables users to drop a
given materialized view.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
5c51a24217 cql3: Parse drop view statement
This patch adds the necessary grammar to Cql.g to parse drop view
statements.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
3025ea63fc cql3: Add alter view statement
This patch adds the alter_view_statement, which enables users to
change the properties of a materialized view.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
71b1e7c056 cql3: Parse alter view statement
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
8792fed651 create_view_statement: Complete implementation
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
02bc0d2ab3 create_view_statement: Require MV feature
This patch adds the MATERIALIZED_VIEWS_FEATURE to the set of cluster
features and requires its presence to allow creating a view. This
ensures view schemas can be safely propagated across nodes.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
59682c95a1 create_view_statement: Require experimental switch
Creating a materialized view requires running Scylla with the
experimental switch.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
c626c983f4 create_view_statement: Reuse validation code
This replaces some validation logic with a call to
validation::validate_column_family.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
5bd74abee8 create_view_statement: Implement check_access
This patch implements check_access according to Cassandra's
implementation.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
a9c17b0a52 select_statement: Propagate for_view argument
This patch propagates the for_view argument, used by
statement_restrictions to ensure IS NOT NULL can be used when creating
a materialized view.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
65535b3444 modification_statement: Check access for tables with views
This patch checks for additional permissions when modifying a table
with views, since that update will require reading from the table and
writing into its views.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
5187fdbb3a modification_statement: Views aren't updated directly
This patch ensures that views cannot be modified directly through an
insert or update statement.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
21e34c5054 alter_type_statement: Consider materialized views
This patch ensures we also update materialized views where the type
being updated occurs.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
a5b7b0464b migration_manager: Only drop table without views
This patch forbids dropping a column family if there are still views
associated with it, and also forbids dropping a view through the drop
table statement.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
76276f1a53 alter_table_statement: Update materialized view
This patch ensures that changes to a base table's schema
are reflected in that table's materialized views.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
44a1f2d836 query_processor: Use cql3::util::do_with_parser()
To minimize code duplication, have query_processor use
do_with_parser() instead of manually creating the CqlParser.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
bd1e66f411 cql3: Allow renaming a column in a where clause
This patch adds a utility function to rename a column occurring in a
textual where clause. It is intended to change a view's where clause
when users alter the underlying base table.

To do this, we rely on functions that transform a textual where clause
into a set of relations, which allows us to reliably rename the column.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
ced4b6e4ff cql3: Allow renaming an identifier in a relation
This patch adds a utility function to rename an identifier
occurring in a cql3 relation. This function will be used when renaming
an identifier in a view's where clause.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
282c023524 migration_manager: Announce view drop
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
99aa8eb4b8 migration_manager: Announce view update
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
6ef3358321 migration_manager: Announce new view creation
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
8ce21a9c01 schema_tables: Make drop view mutations
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
61a5a74ea2 schema_tables: Make update view mutations
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00
Duarte Nunes
2098c336d9 schema_tables: Make create view mutations
This patch builds the mutations to announce a new view. Aside from
including the view schema, we include the base table mutations so
that a node is resilient against receiving create view mutations
before the base table create mutations.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-20 13:06:11 +00:00