Commit Graph

27 Commits

Author SHA1 Message Date
Glauber Costa
e0bfd1c40a allow Cassandra SSTables with counters to be imported if they are new enough
Right now Cassandra SSTables with counters cannot be imported into
Scylla.  The reason for that is that Cassandra changed their counter
representation in their 2.1 version and kept transparently supporting
both representations.  We do not support their old representation, nor
there is a sane way to figure out by looking at the data which one is in
use.

For safety, we had made the decision long ago to not import any
tables with counters: if a counter was generated in older Cassandra, we
would misrepresent them.

In this patch, I propose we offer a non-default way to import SSTables
with counters: we can gate it with a flag, and trust that the user knows
what they are doing when flipping it (at their own peril). Cassandra 2.1
is by now pretty old. many users can safely say they've never used
anything older.

While there are tools like sstableloader that can be used to import
those counters, there are often situations in which directly importing
SSTables is either better, faster, or worse: the only option left.  I
argue that having a flag that allow us to import them when we are sure
it is safe is better than having no option at all.

With this patch I was able to successfully import Cassandra tables with
counters that were generated in Cassandra 2.1, reshard and compact their
SSTables, and read the data back to get the same values in Scylla as in
Cassandra.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190210154028.12472-1-glauber@scylladb.com>
2019-02-10 17:50:48 +02:00
Rafael Ávila de Espíndola
625080b414 Rename large_partition_handler
Now that it also handles large rows, rename it to large_data_handler.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-01-28 15:03:14 -08:00
Piotr Sarna
5dec6dc6c6 table: make populate_views not allow hints
View building uses populate_views to generate and send view updates.
This procedure will now not allow hints to be used to acknowledge
the write. Instead, the whole building step will be retried on failure.

Fixes #3857
Fixes #4039
2019-01-28 09:38:42 +01:00
Avi Kivity
fae4c6c0b6 database: merge for_all_partitions and for_all_partitions_slow
for_all_partitions is only used in the implementation of for_all_partitions_slow,
so merge them and get rid of a template.
2019-01-20 15:55:20 +02:00
Avi Kivity
6e6372e8d2 Revert "Merge "Type-eaese gratuitous templates with functions" from Avi"
This reverts commit 31c6a794e9, reversing
changes made to 4537ec7426. It causes bad_function_calls
in some situations:

INFO  2019-01-20 01:41:12,164 [shard 0] database - Keyspace system: Reading CF sstable_activity id=5a1ff267-ace0-3f12-8563-cfae6103c65e version=d69820df-9d03-3cd0-91b0-c078c030b708
INFO  2019-01-20 01:41:13,952 [shard 0] legacy_schema_migrator - Moving 0 keyspaces from legacy schema tables to the new schema keyspace (system_schema)
INFO  2019-01-20 01:41:13,958 [shard 0] legacy_schema_migrator - Dropping legacy schema tables
INFO  2019-01-20 01:41:14,702 [shard 0] legacy_schema_migrator - Completed migration of legacy schema tables
ERROR 2019-01-20 01:41:14,999 [shard 0] seastar - Exiting on unhandled exception: std::bad_function_call (bad_function_call)
2019-01-20 11:32:14 +02:00
Avi Kivity
f61dbc9855 database: merge for_all_partitions and for_all_partitions_slow
for_all_partitions is only used in the implementation of for_all_partitions_slow,
so merge them and get rid of a template.
2019-01-17 18:50:36 +02:00
Duarte Nunes
04a14b27e4 Merge 'Add handling staging sstables to /upload dir' from Piotr
"
This series adds generating view updates from sstables added through
/upload directory if their tables have accompanying materialized views.
Said sstables are left in /upload directory until updates are generated
from them and are treated just like staging sstables from /staging dir.
If there are no views for a given tables, sstables are simply moved
from /upload dir to datadir without any changes.

Tests: unit (release)
"

* 'add_handling_staging_sstables_to_upload_dir_5' of https://github.com/psarna/scylla:
  all: rename view_update_from_staging_generator
  distributed_loader: fix indentation
  service: add generating view updates from uploaded sstables
  init: pass view update generator to storage service
  sstables: treat sstables in upload dir as needing view build
  sstables,table: rename is_staging to requires_view_building
  distributed_loader: use proper directory for opening SSTable
  db,view: make throttling optional for view_update_generator
2019-01-15 18:19:27 +00:00
Piotr Sarna
09401e0e71 sstables,table: rename is_staging to requires_view_building
A generalized name will be more fitting once we treat uploaded sstables
as requiring view building too.
2019-01-15 16:47:01 +01:00
Piotr Sarna
d3a8fb378c table: wait for pending streams on table::stop
Stream sessions are now phased, so it's possible to wait for existing
streams to finish gently before stopping a table.
2019-01-15 09:36:55 +01:00
Benny Halevy
238866228f memtable: rename get_stats to get_encoding_stats
For symmetry reasons to similar sstable and compaction methods.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190113105155.29118-2-bhalevy@scylladb.com>
2019-01-14 14:58:43 +02:00
Avi Kivity
391d1e0fe0 table: const correctness for table::get_sstables() and related
Do not allow write access to the sstable list via this accessor. Luckily
there are no violations, and now we enforce it.
Message-Id: <20190111151049.16953-1-avi@scylladb.com>
2019-01-11 17:39:17 +01:00
Raphael S. Carvalho
1b7cad3531 database: Fix race condition in sstable snapshot
Race condition takes place when one of the sstables selected by snapshot
is deleted by compaction. Snapshot fails because it tries to link a
sstable that was previously unlinked by compaction's sstable deletion.

Fixes #4051.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190110194048.26051-1-raphaelsc@scylladb.com>
2019-01-11 07:53:14 +02:00
Avi Kivity
b247ce01c3 table: restore indentation after changes to table::make_sstable_reader
Message-Id: <20190109175804.9352-2-avi@scylladb.com>
2019-01-10 13:00:53 +01:00
Avi Kivity
3d6be2f822 table: reduce duplication in table::make_sstable_reader
make_sstable_reader needs to deal with single-key and scanning reads, and
with restricting and non-restricting (in terms of read concurrency) readers.
Right now it does this combinatorically - there are separate cases for
restricting single-key reads, non-restricting single-key reads, restricing
scans, and non-restricting scans.

This makes further changes more complicated, so separate the two concepts.
The patch splits the code into two stages; the first selects between a single-key
and a scan, and the second selects between a restricting and non-restricting read.

This slightly pessimizes non-restricting reads (a mutation_source is created and
immediately destroyed), but that's not the common case.

Tests: unit(release)
Message-Id: <20190109175804.9352-1-avi@scylladb.com>
2019-01-10 13:00:40 +01:00
Raphael S. Carvalho
f5301990fc compaction: release reference of cleaned sstable in compaction manager
Compaction manager holds reference to all cleaning sstables till the very
end, and that becomes a problem because disk space of cleaned sstables
cannot be reclaimed due to respective file descriptors opened.

Fixes #3735.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20181221000941.15024-1-raphaelsc@scylladb.com>
2019-01-08 14:14:01 +02:00
Piotr Sarna
c5346cdf9b database, table: split table-related code to table.cc
All table:: related code is moved to table.cc source file,
which splits database.cc size in half and thus allows
faster compilation on multiple cores.

Refs #1

Message-Id: <28e67f7793ff2147ffce18df5e0b077e14d3b8bd.1546940360.git.sarna@scylladb.com>
2019-01-08 12:02:42 +02:00
Duarte Nunes
86198060e5 database: generate_and_propagate_view_updates no longer needs a timeout
We no longer wait on the semaphore and instead over-subscribe it, so
there's not reason to pass a timeout.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-12-19 22:38:29 +00:00
Duarte Nunes
39eda68094 database: Don't generate view updates when node is overloaded
We arrive at an overloaded state when we fail to acquire semaphore
units in the base replica. This can mean clients are working in
interactive mode, we fail to throttle them and consequently should
start shedding load. We want to avoid impacting base table
availability by running out of memory, so we could offload the memory
queue to disk by writing the view updates as hints without attempting
to send them. However, the disk is also a limited resource and in
extreme cases we won’t be able to write hints. A tension exists
between forgetting the view updates, thereby opening up a window for
inconsistencies between base and view, or failing the base replica
write. The latter can fail the whole user write, or if the
coordinator was able to achieve CL, can instead cause inconsistencies
between base tables (we wouldn't want to store a hint, because if the
base replica is still overloaded, we would redo the whole dance).

Between the devil and the deep blue sea, we chose to forget view
updates. As a further simplification, we don't even write hints,
assuming that if clients can’t be throttled (as we'll attempt to do in
future patches), it will only be a matter of time before view updates
can’t be offloaded. We also start acquiring the semaphore units using
consume(), which is non-blocking, but allows for underflow of the
available semaphore units. This is okay, and we expect not to underflow
by much, as we stop generating new view updates.

Refs #2538

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-12-19 22:38:29 +00:00
Paweł Dziepak
9024187222 partition_slice: use small_vector for column_ids 2018-12-06 14:21:04 +00:00
Piotr Sarna
ed05d91adc db/view: add view updating consumer
This consumer is used to generate and push view replica updates
from read mutations.
2018-11-13 14:54:39 +01:00
Piotr Sarna
348fa3b092 table: add stream_view_replica_updates
Generating view replica updates during streaming ignores
the staging sstable that is used to generate them.
2018-11-13 14:52:22 +01:00
Piotr Sarna
fed9c59eb8 table: split push_view_replica_updates
push_view_replica_updates is split in order to allow different
mutation source to be provided.
2018-11-13 14:52:22 +01:00
Piotr Sarna
466d780445 table: add as_mutation_source_excluding
A variant of table::as_mutation_source that allows excluding
a single sstable is added.
2018-11-13 14:52:22 +01:00
Piotr Sarna
c825a17b9d table: move push_view_replica_updates to table.cc 2018-11-13 14:52:22 +01:00
Piotr Sarna
e88b85134c database: add sstable-excluding reader
When generating view updates from a staging sstable, this sstable
should not be used in the process. Hence, a reader that skips a single
sstable is added.
2018-11-13 14:52:22 +01:00
Piotr Sarna
160a6d58d2 table: add move_sstable_from_staging_in_thread function
After materialized view updates are generated, the sstable
should be moved from staging/ to a regular directory.
It's expected to be called from seastar::async thread context.
2018-11-13 11:45:30 +01:00
Piotr Sarna
788e03433c table: init table.cc file
This file will be used to move table-related functions to it.
2018-11-13 11:45:30 +01:00