Commit Graph

31 Commits

Author SHA1 Message Date
Pavel Solodovnikov
219ac2bab5 large_data_handler: fix segmentation fault when constructing data_value from a nullptr
It turns out that `cql_table_large_data_handler::record_large_rows`
and `cql_table_large_data_handler::record_large_cells` were broken
for reporting static cells and static rows from the very beginning:

In case a large static cell or a large static row is encountered,
it tries to execute `db::try_record` with `nullptr` additional values,
denoting that there is no clustering key to be recorded.

These values are next passed to `qctx.execute_cql()`, which
creates `data_value` instances for each statement parameter,
hence invoking `data_value(nullptr)`.

This uses `const char*` overload which delegates to
`std::string_view` ctor overload. It is UB to pass `nullptr`
pointer to `std::string_view` ctor. Hence leading to
segmentation faults in the aforementioned large data reporting
code.

What we want here is to make a null `data_value` instead, so
just add an overload specifically for `std::nullptr_t`, which
will create a null `data_value` with `text` type.

A regression test is provided for the issue (written in
`cql-pytest` framework).

Tests: test/cql-pytest/test_large_cells_rows.py

Fixes: #6780

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20201223204552.61081-1-pa.solodovnikov@scylladb.com>
2020-12-24 11:37:43 +02:00
Benny Halevy
64a4ffc579 large_data_handler: do not delete records in the absence of large_data_stats
The previous way of deleting records based on the whole
sstatble data_size causes overzealous deletions (#7668)
and inefficiency in the rows cache due to the large number
of range tombstones created.

Therefore we'd be better of by juts letting the
records expire using he 30 days TTL.

Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201206083725.1386249-1-bhalevy@scylladb.com>
2020-12-06 11:34:37 +02:00
Benny Halevy
4406a2514e large_data_handler: maybe_delete_large_data_entries: use sstable large data stats
If the sstable has scylla_metadata::large_data_stats use them
to determine whether to delete the corresponding large data records.

Otherwise, defer to the current method of comparing the sstable
data_size to the respective thresholds.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-01 15:19:42 +02:00
Benny Halevy
8cebe7776f large_data_handler: maybe_delete_large_data_entries: accept shared_sstable
Since the actual deletion if the large data entries
is done in the background, and we don't captures the shared_sstable,
we can safely pass it to maybe_delete_large_data_entries when
deleting the sstable in sstable::unlink and it will be release
as soon as maybe_delete_large_data_entries returns.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-01 15:19:42 +02:00
Benny Halevy
f7d0ae3d10 large_data_handler: maybe_delete_large_data_entries: move out of line
It is called on the cold path, when the sstable is deleted.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-01 15:19:42 +02:00
Benny Halevy
dd7422a713 large_data_handler: indicate recording of large data entries
Return true from the maybe_{record,log}_* methods if
a large data record or log entry were emitted.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-01 15:18:14 +02:00
Benny Halevy
873107821b large_data_handler: move constructor out of line
No need for it to be inlined.

Also, add debug logging to the large data handler options.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-12-01 15:18:14 +02:00
Pavel Emelyanov
303ebe4a36 code: Use qctx::evecute_cql methods, not global ones
There are global db::execute_cql() helpers that just forward
the args into qctx::execute_cql(). The former are going away,
so patch all callers to use qctx themselves.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-11-19 18:39:05 +03:00
Rafael Ávila de Espíndola
5d4671526c db: Replace large_data_handler::_stopped with _running
This is not just a direct flip to a variable with the negated Boolean
value. When created, a large_data_handler is not considered to be
running, the user has to call start() before it can be used.

The advantaged of doing this is that if initialization fails and a
database is destructed before the large_data_handler is started, the
assert

database::stop() {
    assert(!_large_data_handler->running());

is not triggered.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-04 21:15:44 -08:00
Rafael Ávila de Espíndola
33dfe34f78 db: Move nop_large_data_handler constructor out-of-line
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-04 21:12:01 -08:00
Rafael Ávila de Espíndola
e99a225f25 db: Move large_data_handler::stop out-of-line
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-02-04 21:11:49 -08:00
Rafael Ávila de Espíndola
bd560e5520 types: Fix dynamic types of some data_value objects
I found these mismatched types while converting some member functions
to standalone functions, since they have to use the public API that
has more type checks.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20191120181213.111758-4-espindola@scylladb.com>
2019-11-21 12:08:46 +02:00
Juliana Oliveira
fd83f61556 Add a warning for partitions with too many rows
This patch adds a warning option to the user for situations where
rows count may get bigger than initially designed. Through the
warning, users can be aware of possible data modeling problems.

The threshold is initially set to '100,000'.

Tests: unit (dev)

Message-Id: <20190528075612.GA24671@shenzou.localdomain>
2019-06-06 19:48:57 +03:00
Rafael Ávila de Espíndola
8d9baf9843 large_data_handler: Make a variable non static
The value computed is not static since
f254664fe6, but unfortunately that was
missed in that commit.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-20 09:31:21 -07:00
Rafael Ávila de Espíndola
e7749e7aee large_data_handler: Fix a use after destruction
The path leading to the issue was:

The sstable name is allocated and passed to maybe_delete_large_data_entries by reference

   auto name = sst->get_filename();
   return large_data_handler.maybe_delete_large_data_entries(*sst->get_schema(), name, sst->data_size());

A future is created with a reference to it

  large_partitions = with_sem([&s, &filename, this] {
     return delete_large_data_entries(s, filename, db::system_keyspace::LARGE_PARTITIONS);
  });

The semaphore blocks.

The filename is destroyed.

delete_large_data_entries is called with a destroyed filename.

The reason this did not reproduce trivially in a debug build was that
the sstable itself was in the stack and the destructed value was read
as an internal value, and so asan had nothing to complain about.

Unfortunately we also had no tests that the entry in
system.large_rows was actually deleted.

This patch passes the name by value. It might create up to 3 copies of
it. If that is too inefficient it can probably be avoided with a
do_with in maybe_delete_large_data_entries.

Fixes #4335

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-20 09:30:42 -07:00
Rafael Ávila de Espíndola
63251b66c1 db: Record large cells
Fixes #4234.

Large cells are now recorded in system.large_cells.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
8b4ae95168 large_data_handler: Run large data recording in parallel
With this changes the futures returned by large_data_handler will not
normally wait for entries to be written to system.large_rows or
system.large_partitions.

We use a semaphore to bound how behind system.large_* table updates
can get.

This should avoid delaying sstables writes in the common case, which
is more relevant once we warn of large cells since the the default
threshold will be just 1MB.

Note that there is no ordering between the various maybe_record_* and
maybe_delete_large_data_entries requests. This means that we can end
up with a stale entry that is only removed once the TTL expires.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
989ab33507 large_data_handler: Remove const from a few functions
These will use a member semaphore variable in a followup patch, so they
cannot be const.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
5fcb3ff2d7 db: don't use _stopped directly
This gives flexibility in how it is implemented.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:04 -07:00
Rafael Ávila de Espíndola
f3089bf3d1 db: refactor a try_record helper
We had almost identical error handling for large_partitions and
large_rows. Refactor in preparation for large_cells.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:19:02 -07:00
Rafael Ávila de Espíndola
d7f263d334 db: Rename (maybe_)?update_large_partitions
This renames it to record_large_partitions, which matches
record_large_rows. It also changes the signature to be closer to
record_large_rows.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:16:04 -07:00
Rafael Ávila de Espíndola
f254664fe6 db: refactor large data deletion code
The code for deleting entries from system.large_partitions was almost
a duplicate from the code for deleting entries from system.large_rows.

This patch unifies the two, which also improves the error message when
we fail to delete entries from system.large_partitions.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-12 13:16:04 -07:00
Rafael Ávila de Espíndola
16ed9a2574 db: stop the commit log after the tables during shutdown
This allows for system.large_partitions to be updated if a large
partition is found while writing the last sstables.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-03-05 18:04:51 -08:00
Rafael Ávila de Espíndola
25f81cf3e3 Populate system.large_rows.
It now records large rows when they are first written to an sstable
and removes them when the sstable is deleted.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-26 15:56:42 -08:00
Rafael Ávila de Espíndola
da4c0da78a Extract a key_to_str helper
It will be used in more places in a followup patch.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-26 15:46:21 -08:00
Rafael Ávila de Espíndola
0c401f56f8 Add a delete_large_rows_entries method to large_data_handler
This will be responsible for removing large rows from
system.large_rows.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-26 15:46:21 -08:00
Rafael Ávila de Espíndola
81a21ea425 db::large_data_handler::(maybe_)?record_large_rows: Return future<> instead of void
These functions will record into tables in a followup patch, so they
will need to return a future.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-26 15:46:21 -08:00
Rafael Ávila de Espíndola
e9a13aff90 Rename log_large_row to record_large_rows
It will also record into a table in a followup patch.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-26 15:46:21 -08:00
Benny Halevy
13ffda5c31 database: maybe_delete_large_partitions_entry: do not access sstable and do not mask exceptions
1. We would like to be able to call maybe_delete_large_partitions_entry
from the sstable destructor path in the future so the sstable might go away
while the large data entries are being deleted.

2. We would like the caller to handle any exception on this path,
especially in the prepatation part, before calling delete_large_partitions_entry().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 10:44:02 +02:00
Rafael Ávila de Espíndola
9cd14f2602 Don't write to system.large_partition during shutdown
The included testcase used to crash because during database::stop() we
would try to update system.large_partition.

There doesn't seem to be an order we can stop the existing services in
cql_test_env that makes this possible.

This patch then adds another step when shutting down a database: first
stop updating system.large_partition.

This means that during shutdown any memtable flush, compaction or
sstable deletion will not be reflected in system.large_partition. This
is hopefully not too bad since the data in the table is TTLed.

This seems to impact only tests, since main.cc calls _exit directly.

Tests: unit (release,debug)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190213194851.117692-1-espindola@scylladb.com>
2019-02-15 10:49:10 +01:00
Rafael Ávila de Espíndola
625080b414 Rename large_partition_handler
Now that it also handles large rows, rename it to large_data_handler.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-01-28 15:03:14 -08:00