18077 Commits

Author SHA1 Message Date
Piotr Sarna
09eb0429ce tests: add test for hiding virtual columns from WRITETIME
Visibility checks for virtual columns' WRITETIME and TTL
are added.
2019-02-27 15:08:16 +01:00
Piotr Sarna
af39787bf0 cql3: hide virtual columns from WRITETIME() and TTL()
Virtual columns should not be visible to the user,
so they are now hidden not only from directly selecting them,
but also via WRITETIME() and TTL() keywords.

Fixes #4288
2019-02-27 15:08:15 +01:00
Piotr Sarna
b0ab4c28cf schema: add column_definition::is_hidden_from_cql
Right now the only columns hidden from CQL are view virtual columns,
but in case of expanding this set, a helper function is provided.
2019-02-27 15:07:54 +01:00
Pekka Enberg
ca288189a9 dist/ami: Support different products for the AMI
Let's add a PRODUCT variable, similar to build_rpm.sh, for example, so
that we can override package names for enterprise AMIs.

Message-Id: <20190225063319.19516-1-penberg@scylladb.com>
2019-02-25 11:17:44 +02:00
Avi Kivity
a0b0db7915 Merge "Fix regression in perf_fast_forward results" from Paweł
"
After adcb3ec20c ("row_cache: read is not
single-partition if inter-partition forwarding is enabled") we have
noticed a regression in the results of some perf_fast_forward tests.
This was caused by those tests not disabling partition-level
fast-forwarding even though it was not needed and the commit in question
fixed an incorrect optimisation in such cases.

However, after solving that issue it has also become apparent that
mutation_reader_merger performs worse when the fast-forwarding is
disabled. This was attributed to logic responsible for dropping readers
as soon as they have reached the end of stream (which cannot be done if
fast-forwarding is enabled). This problem was mitigated by avoiding a
scan of the list and removing readers in small batches.

Fixes #4246.
Fixes #4254.

Tests: unit(dev)
"

* tag 'perf_fast_forward-fix-regression/v1' of https://github.com/pdziepak/scylla:
  mutation_reader_merger: drop unneded readers in small batches
  mutation_reader_merger: track readers by iterators and not pointers
  tests/perf_fast_forward: disable partition-level fast-forwarding if not needed
2019-02-24 19:24:00 +02:00
Avi Kivity
e3c53ff3ff Update seastar submodule
* seastar 2313dec...ab54765 (10):
  > Fix C++-17-only uses of static_assert() with a single parameter.
  > README.md: fix out-of-date explanation of C++ dialect
  > net: fix tcp load balancer accounting leak while moving socket to other shard
  > Revert "deleter: prevent early memory free caused by deleter append."
  > deleter: prevent early memory free caused by deleter append.
  > Solve seastar.unit.thread failure in debug mode
  > Fix iovec-based read_dma: use make_readv_iocb instead of make_read_iocb
  > build: Fix the required version of `fmt`
  > app_template: fix use after move in app constructor
  > build: Rename CMake variable for private flags

Fixes #4269.
2019-02-24 16:06:23 +02:00
Avi Kivity
a3a7bea12f Merge "Clean up preprocessor definitions" from Jesse
* 'jhk/define_debug/v1' of https://github.com/hakuch/scylla:
  build: Remove the `DEBUG_SHARED_PTR` pp variable
  build: Prefer the Seastar version of a pp variable
2019-02-23 14:04:08 +02:00
Jesse Haber-Kucharsky
f9297895c1 auth: Change the log level for async. retries
The log message is benign, but it has caused some users of Scylla to
think that an error has occurred.

Fixes #3850

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <ba49c38266c0e77c3ed23cfca3c1a082b3060f17.1550777586.git.jhaberku@scylladb.com>
2019-02-23 14:03:16 +02:00
Tomasz Grabiec
3f698701c2 gdb: Drop incorrect throw of StopIteration
It is converted into a RuntimeError by python3:

  https://docs.python.org/3/library/exceptions.html#StopIteration

We should just return.

Message-Id: <20190221144321.18093-1-tgrabiec@scylladb.com>
2019-02-23 14:02:47 +02:00
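The behaviour behind 3f698701c2 can be reproduced outside gdb. The following is a minimal sketch, assuming Python 3.7 or later, where raising StopIteration inside a generator body is always converted to RuntimeError. The function names are hypothetical and are not taken from the Scylla gdb script.

```python
def broken_walk(items):
    """Generator that signals completion by raising StopIteration."""
    for item in items:
        yield item
    raise StopIteration  # becomes RuntimeError when raised inside a generator


def fixed_walk(items):
    """Generator that signals completion by simply returning."""
    for item in items:
        yield item
    return


def drain(gen):
    """Consume a generator and report which exception, if any, ended it."""
    out = []
    try:
        for value in gen:
            out.append(value)
    except RuntimeError as exc:
        return out, type(exc).__name__
    return out, None
```

Calling drain(broken_walk([1, 2])) yields both values and then reports a RuntimeError, while drain(fixed_walk([1, 2])) ends cleanly.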
Nadav Har'El
0eddf19432 main: add INFO log messages at start, initialization end, and end.
Scylla currently prints a welcome message when it starts, with the
Scylla version, but this is not printed to the regular log so in some
cases (e.g., Jenkins runs) we do not see it in the log. So let's add
a regular INFO-level log message with the same information.

Also, Scylla currently doesn't print any specific log message when it
normally completes its shutdown. In some cases, users may end up
wondering whether Scylla hung in the middle of the shutdown, or in
fact exited normally. Refs #4238. So in this patch we add a "shutdown
complete" message as the very last message in a successfull shutdown.
We print Scylla's version also in the shutdown message, which may be
useful to see in the logs when shutting down one version of Scylla
and starting a different version.

Finally, we also add a log message when initialization is complete,
which may also be useful to understand whether Scylla hung during
initialization.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190217140659.19512-1-nyh@scylladb.com>
2019-02-22 16:52:31 +01:00
Tomasz Grabiec
b90cb91468 gdb: Introduce 'scylla cache'
Prints contents of the row cache for each table on current shard.
Message-Id: <20190222144420.19677-1-tgrabiec@scylladb.com>
2019-02-22 14:58:58 +00:00
Paweł Dziepak
b524f96a74 mutation_reader_merger: drop unneded readers in small batches
It was observed that destroying readers as soon as they are not needed
negatively affects performance of relatively small reads. We don't want
to keep them alive for too long either, since they may own a lot of
memory, but deferring the destruction slightly and removing them in
batches of 4 seems to solve the problem for the small reads.
2019-02-22 14:43:38 +00:00
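The batching idea in b524f96a74 can be sketched as follows. This is an illustration only, written in Python rather than the project's C++; the class and attribute names are hypothetical, and only the batch size of 4 comes from the commit message.

```python
class BatchedDropper:
    """Defers releasing finished readers until a small batch has built up."""

    def __init__(self, batch_size=4):
        self.batch_size = batch_size
        self.live = set()     # readers still producing data
        self.finished = []    # readers at end of stream, not yet released
        self.released = 0     # how many readers have been released so far

    def add(self, reader):
        self.live.add(reader)

    def mark_finished(self, reader):
        # Move the reader out of the live set but keep it alive for now.
        self.live.discard(reader)
        self.finished.append(reader)
        if len(self.finished) >= self.batch_size:
            # Release the whole batch at once instead of one by one.
            self.released += len(self.finished)
            self.finished.clear()
```

The trade-off is the one the message describes: finished readers hold their memory slightly longer, but small reads no longer pay for a destruction on every end of stream.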
Paweł Dziepak
435e24f509 mutation_reader_merger: track readers by iterators and not pointers
mutation_reader_merger uses a std::list of mutation_reader to keep them
alive while the rest of the logic operates on non-owning pointers.

This means that when it is time to drop some of the readers that are
no longer needed, the merger needs to scan the list looking for them.
That's not ideal.

The solution is to make the logic use iterators to elements in that
list, which allows for O(1) removal of an unneeded reader. Iterators to
list are just pointers to the node and are not invalidated by unrelated
additions and removals.
2019-02-22 14:33:10 +00:00
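The point of 435e24f509 is that a handle to the list node allows removal without a scan. A rough Python analogue of that idea, assuming a hand-written doubly linked list in place of std::list (all names here are hypothetical):

```python
class Node:
    """List node; a reference to it plays the role of a list iterator."""
    __slots__ = ("value", "prev", "next")

    def __init__(self, value):
        self.value = value
        self.prev = self.next = None


class ReaderList:
    """Circular doubly linked list with a sentinel head node."""

    def __init__(self):
        self.head = Node(None)
        self.head.prev = self.head.next = self.head

    def push_back(self, value):
        """Append a value and return its node handle."""
        node = Node(value)
        last = self.head.prev
        last.next = node
        node.prev = last
        node.next = self.head
        self.head.prev = node
        return node

    def erase(self, node):
        """Unlink a node in O(1); other handles stay valid."""
        node.prev.next = node.next
        node.next.prev = node.prev
        node.prev = node.next = None

    def values(self):
        out = []
        cur = self.head.next
        while cur is not self.head:
            out.append(cur.value)
            cur = cur.next
        return out
```

Holding the handle returned by push_back means erase never has to search, and unrelated insertions or removals do not disturb the handles that remain.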
Paweł Dziepak
5d5777f85e tests/perf_fast_forward: disable partition-level fast-forwarding if not needed
Several of the test cases in perf_fast_forward do not need
partition-level fast-forwarding. However, since the defaults are used to
construct most of the readers the fast-forwarding is enabled regardless.

This showed an apparent regression in the perf_fast_forward results
after adcb3ec20c ("row_cache: read is not
single-partition if inter-partition forwarding is enabled") which
disabled an optimisation that was invalid when partition-level
fast-forwarding was requested.

This patch ensures that all single-partition reads that do not need
partition-level fast-forwarding keep it disabled.
2019-02-22 14:28:02 +00:00
Avi Kivity
fdefee696e Merge "sstables: mc: writer: Avoid large allocations for keeping promoted index entries" from Tomasz
"
Currently we keep the entries in a circular_buffer, which uses
a contiguous storage. For large partitions with many promoted index
entries this can cause OOM and sstable compaction failure.

A similar problem exists for the offset vector built
in write_promoted_index().

This change solves the problem by serializing promoted index entries
and the offset vector on the fly directly into a bytes_ostream, which
uses fragmented storage.

The serialization of the first entry is deferred, so that
serialization is avoided if there will be less than 2
entries. Promoted index is not added for such partitions.

There still remains a problem that a large enough promoted index can cause OOM.

Refs #4217

Tests:
  - unit (release)
  - scylla-bench write

Branches: 3.0
"

* tag 'fix-large-alloc-for-promoted-index-v3' of github.com:tgrabiec/scylla:
  sstables: mc: writer: Avoid large allocations for maintaining promoted index
  sstables: mc: writer: Avoid double-serialization of the promoted index
2019-02-22 15:44:51 +02:00
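The deferred serialization of the first entry described in fdefee696e can be illustrated with a small sketch. It assumes a hypothetical length-prefixed entry encoding and uses io.BytesIO as the output, which is contiguous and therefore only a stand-in for the fragmented bytes_ostream; none of the names below come from the Scylla source.

```python
import io


class PromotedIndexWriter:
    """Serializes entries on the fly, but holds back the first one."""

    def __init__(self):
        self._first = None          # first entry, not yet serialized
        self._out = io.BytesIO()    # stand-in for a fragmented buffer
        self._count = 0

    def _write(self, entry):
        # Hypothetical encoding: 4-byte big-endian length, then the payload.
        self._out.write(len(entry).to_bytes(4, "big"))
        self._out.write(entry)

    def add(self, entry):
        self._count += 1
        if self._count == 1:
            self._first = entry     # defer: there may never be a second entry
            return
        if self._count == 2:
            self._write(self._first)
            self._first = None
        self._write(entry)

    def finish(self):
        """Return the serialized index, or b'' when there are fewer than 2 entries."""
        return self._out.getvalue() if self._count >= 2 else b""
```

A partition with a single entry therefore costs no serialization work at all, which matches the statement that no promoted index is added for such partitions.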
Avi Kivity
177159da75 Merge "delete_atomically recovery" from Benny
"
The delete_atomically function is required to delete a set of sstables
atomically, i.e. either delete all or none of them.  Deleting only
some sstables in the set might result in data resurrection if
sstable A, holding a tombstone that covers a mutation in sstable B, is
deleted while sstable B remains.

This patchset introduces a log file holding a list of SSTable TOC files
to delete for recovering a partial delete_atomically operation.

A new subdirectory is created in the sstables dir called `pending_delete`
holding in-flight logs.

The logs are created with a temporary name (using a .tmp suffix)
and renamed to the final .log name once ready.  This indicates
the commit point for the operation.

When populating the column family, all files in the pending_delete
sub-directory are examined.  Temporary log files are just removed,
and committed log files are read, replayed, and deleted.

Fixes #4082

Tests: unit (dev), database_test (debug)
"

* 'projects/delete_atomically_recovery/v5' of https://github.com/bhalevy/scylla:
  tests: database_test: add test_distributed_loader_with_pending_delete
  distributed_loader: replay and cleanup pending_delete log files
  distributed_loader: populated_column_family: separate temp sst dirs cleanup phase
  docs: add sstables-directory-structure.md
  sstables: commit sstables to delete_atomically into a pending_delete log file
  sstables: delete_atomically: delete sstables in a thread
  sstables: component_basename: reuse with sstring component
  sstables: introduce component_basename
  database: maybe_delete_large_partitions_entry: do not access sstable and do not mask exceptions
  sstables: add delete_sstable_and_maybe_large_data_entries
  sstables: call remove_by_toc_name in dtor if marked_for_deletion
2019-02-22 15:37:17 +02:00
Benny Halevy
1ba88b709f tests: database_test: add test_distributed_loader_with_pending_delete
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:08:22 +02:00
Benny Halevy
043673b236 distributed_loader: replay and cleanup pending_delete log files
Scan the table's pending_delete sub-directory if it exists.
Remove any temporary pending_delete log files to roll back the respective
delete_atomically operation.
Replay completed pending_delete log files to roll forward the respective
delete_atomically operation, and finally delete the log files.

Cleanup of temporary sstable directories and pending_delete
sstables is done in a preliminary scan phase when populating the column family
so that we won't attempt to load the to-be-deleted sstables.

Fixes #4082

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:08:22 +02:00
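The roll-back and roll-forward rules in 043673b236 can be sketched as below. This is a simplified Python illustration, not the distributed_loader code. It assumes the log holds one TOC file name per line, and it removes only the named file, whereas deleting an sstable by TOC name would remove all of its components. The function name is hypothetical; the pending_delete directory and the .tmp and .log suffixes come from the commit messages.

```python
import os


def replay_pending_delete(table_dir):
    """Roll back temporary logs and roll forward committed ones."""
    pending = os.path.join(table_dir, "pending_delete")
    if not os.path.isdir(pending):
        return
    for name in sorted(os.listdir(pending)):
        path = os.path.join(pending, name)
        if name.endswith(".tmp"):
            # Never committed: discard the log, the sstables stay.
            os.remove(path)
        elif name.endswith(".log"):
            # Committed: finish deleting everything the log lists.
            with open(path) as f:
                toc_names = [line.strip() for line in f if line.strip()]
            for toc in toc_names:
                try:
                    os.remove(os.path.join(table_dir, toc))
                except FileNotFoundError:
                    pass  # already deleted before the crash
            os.remove(path)
```

Running this before loading any sstables is what keeps a half-deleted set from being loaded again.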
Benny Halevy
ee3ad75492 distributed_loader: populated_column_family: separate temp sst dirs cleanup phase
In preparation for replaying pending_delete log files,
we would like to first remove any temporary sst dirs
and later handle pending_delete log files, and only
then populate the column family.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:08:22 +02:00
Benny Halevy
f35e4cbac7 docs: add sstables-directory-structure.md
Refs #4184

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:08:22 +02:00
Benny Halevy
024d0a6d49 sstables: commit sstables to delete_atomically into a pending_delete log file
To facilitate recovery of a delete_atomically operation that crashed mid
way, add a replayable log file holding the committed sstables to delete.

It will be used by populate_column_family to replay the atomic deletion.

1. Write the toc names of sstables to be deleted into a temporary file.
2. Once flushed and closed, rename the temp log file into the final name
   and flush the pending_delete directory.
3. delete the sstables.
4. Remove the pending_delete log file
   and flush the pending_delete directory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:05:37 +02:00
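The four steps listed in 024d0a6d49 can be sketched as follows. This is a minimal Python illustration assuming a POSIX system, where a directory can be opened and fsynced and rename is atomic within a filesystem. The function names, the log_name parameter and the one-name-per-line log format are assumptions made for the example.

```python
import os


def _fsync_dir(path):
    """Flush directory metadata so renames and removals are durable."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def delete_atomically(table_dir, toc_names, log_name="sstables-to-delete"):
    pending = os.path.join(table_dir, "pending_delete")
    os.makedirs(pending, exist_ok=True)
    tmp_path = os.path.join(pending, log_name + ".log.tmp")
    log_path = os.path.join(pending, log_name + ".log")

    # 1. Write the TOC names into a temporary file and flush it.
    with open(tmp_path, "w") as f:
        f.write("\n".join(toc_names) + "\n")
        f.flush()
        os.fsync(f.fileno())

    # 2. Rename to the final name (the commit point) and flush the directory.
    os.rename(tmp_path, log_path)
    _fsync_dir(pending)

    # 3. Delete the sstables (only the TOC file in this sketch).
    for toc in toc_names:
        os.remove(os.path.join(table_dir, toc))

    # 4. Remove the log and flush the directory again.
    os.remove(log_path)
    _fsync_dir(pending)
```

A crash before step 2 leaves only a temporary log, so nothing is deleted; a crash after it leaves a committed log that can be replayed.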
Benny Halevy
70fda0eda0 sstables: delete_atomically: delete sstables in a thread
In preparation for implementing a pending_delete log file.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:05:37 +02:00
Benny Halevy
9ac04850a0 sstables: component_basename: reuse with sstring component
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 11:05:10 +02:00
Benny Halevy
a2a9750074 sstables: introduce component_basename
component_basename returns just the basename for the component filename
without the leading sstdir path.

To be used for delete_atomically's pending_delete log file.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 10:44:02 +02:00
Benny Halevy
13ffda5c31 database: maybe_delete_large_partitions_entry: do not access sstable and do not mask exceptions
1. We would like to be able to call maybe_delete_large_partitions_entry
from the sstable destructor path in the future so the sstable might go away
while the large data entries are being deleted.

2. We would like the caller to handle any exception on this path,
especially in the preparation part, before calling delete_large_partitions_entry().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 10:44:02 +02:00
Benny Halevy
ae29db8db6 sstables: add delete_sstable_and_maybe_large_data_entries
To be called by delete_atomically,
rather than passing a vector to delete_sstables.

This way, no need to build `sstables_to_delete_atomically` vector

To be replaced in the future with a sstable method once we
provide the large_data_handler upon construction.

Handle exceptions from remove_by_toc_name or maybe_delete_large_partitions_entry
by merely logging an error.  There is nothing else we can do at this point.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 10:44:02 +02:00
Benny Halevy
387f14a874 sstables: call remove_by_toc_name in dtor if marked_for_deletion
No need to call delete_sstables which works on a list of sstables
(by toc name).

Also, add FIXME comment about not calling
large_data_handler.maybe_delete_large_partitions_entry
on this path.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-02-22 10:44:02 +02:00
Avi Kivity
34b254381f sstables: checksummed_file_writer: fix dma alignment
checksummed_file_writer does not override allocate_buffer(), so it inherits
data_source_impl's default allocate_buffer, which does not care about alignment.
The buffer is then passed to the real file_data_sink_impl, and thence to the file
itself, which cannot complete the write since it is not properly aligned.

This doesn't fail in release mode, since the Seastar allocator will supply a
properly aligned buffer even if not asked to do so. The ASAN allocator usually
does supply an aligned buffer, but not always, which causes the test to fail.

Fix by forwarding the allocate_buffer() function to the underlying data_source.

Fixes #4262.
Branches: branch-3.0
Message-Id: <20190221184115.6695-1-avi@scylladb.com>
2019-02-21 21:26:56 +01:00
Jesse Haber-Kucharsky
b7b50392ed build: Remove the DEBUG_SHARED_PTR pp variable
This definition is exported by Seastar as `SEASTAR_DEBUG_SHARED_PTR` and
no code in Scylla uses this definition either way.
2019-02-21 10:45:09 -05:00
Jesse Haber-Kucharsky
f4883a1aea build: Prefer the Seastar version of a pp variable
Seastar defines `SEASTAR_DEFAULT_ALLOCATOR`, and everywhere else in
Scylla we use this variable too.
2019-02-21 10:41:42 -05:00
Piotr Sarna
c743617236 cql3: unify max value for row limit and per-partition limit
Limits are stored as uint32_t everywhere, but in some places
int32_t was used, which created inconsistencies when comparing
the value to std::numeric_limits<Type>::max().
In order to solve inconsistencies, the types are unified to uint32_t,
and instead of explicitly calling numeric limit max,
an already existing constant value query::max_rows is utilized.

Fixes #4253

Message-Id: <4234712ff61a0391821acaba63455a34844e489b.1550683120.git.sarna@scylladb.com>
2019-02-21 13:56:02 +02:00
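The kind of inconsistency c743617236 refers to can be shown with plain integers. This sketch only illustrates that the two type maxima differ; the helper names are hypothetical, and the assumption that "no limit" is represented by a type's maximum value is made for the example rather than taken from the Scylla source.

```python
INT32_MAX = 2**31 - 1     # value of std::numeric_limits<int32_t>::max()
UINT32_MAX = 2**32 - 1    # value of std::numeric_limits<uint32_t>::max()


def is_unlimited_mixed(limit):
    """Inconsistent check: compares an unsigned limit with the signed maximum."""
    return limit == INT32_MAX


def is_unlimited_unified(limit, max_rows=UINT32_MAX):
    """Consistent check: one shared constant, one type."""
    return limit == max_rows
```

A limit stored as the unsigned maximum is recognised by the unified check but not by the mixed one, which is why a single shared constant removes the ambiguity.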
Tomasz Grabiec
ecff716f40 query-result-set: Give more context on failure
We've seen schema application failing with marshal_exception
here. That's not enough information to figure out what the
problem is. Knowing which table and column are affected would make
diagnosis much easier in certain cases.

This patch wraps errors in query::deserialization_error with more
information.

Example output:

  query::deserialization_error (failed on column system_schema.tables#bloom_filter_fp_chance \
  (version: c179c1d7-9503-3f66-a5b3-70e72af3392a, id: 0, index: 0, type: org.apache.cassandra.db.marshal.DoubleType):\
  seastar::internal::backtraced<marshal_exception> (marshaling error: read_simple - not enough bytes (expected 8, got 3)
Message-Id: <20190221113219.13018-1-tgrabiec@scylladb.com>
2019-02-21 11:35:27 +00:00
Nadav Har'El
f55bdea364 compaction manager: avoid spurious "asked to stop" message at the end of the log
This patch removes the log message about "compaction_manager - Asked to stop"
at the very end of Scylla runs. This log message is confusing because it
only has the "asked to stop" part, without finally a "stopped", and may
lead a user to incorrectly fear that the shutdown hung - when it in fact
finished just fine.

The database object holds a compaction_manager and stop()s it when the
database is stop()ed - and that is the very last thing our shutdown does.
However, much earlier, as the *first* shutdown operation (i.e., the last
at_exit() in main.cc), we already stop() the compaction manager.

The second stop() call does nothing, but unfortunately prints the log
message just before checking if it has anything to stop. So this patch
just moves the log message to after the check.

Fixes #4238.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190217142657.19963-1-nyh@scylladb.com>
2019-02-21 12:32:47 +01:00
Rafael Ávila de Espíndola
5a7bff36ca Simplify sstable::filename
No functionality change, but avoids a std::unordered_map.

Tests: unit (dev)

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190221014630.15476-1-espindola@scylladb.com>
2019-02-21 12:40:01 +02:00
Avi Kivity
5520fc37ba Merge " Fix INSERT JSON with null values" from Piotr
"
Fixes #4256

This miniseries fixes a problem with inserting NULL values through
INSERT JSON interface.

Tests: unit (dev)
"

* 'fix_insert_json_with_null' of https://github.com/psarna/scylla:
  tests: add test for INSERT JSON with null values
  cql3: add missing value erasing to json parser
2019-02-21 12:36:09 +02:00
Piotr Sarna
4d211690f9 tests: add test for INSERT JSON with null values
2019-02-21 11:25:14 +01:00
Piotr Sarna
6618191e49 cql3: add missing value erasing to json parser
When inserting a null value through INSERT JSON, the column
was erroneously not removed from the 'not used' list of columns.

Fixes #4256
2019-02-21 11:23:44 +01:00
Tomasz Grabiec
8687666169 schema_tables: Add trace-level logging of schema mutations
Can be useful in diagnosing problems with application of schema
mutations.

do_merge_schema() is called on every change of schema of the local
node.

create_table_from_mutations() is called on schema merge when a table
was altered or created using mutations read from local schema tables
after applying the change, or when loading schema on boot.

Message-Id: <20190221093929.8929-2-tgrabiec@scylladb.com>
2019-02-21 12:16:38 +02:00
Tomasz Grabiec
f65d1e649d schema_mutations: Make printable
Message-Id: <20190221093929.8929-1-tgrabiec@scylladb.com>
2019-02-21 12:16:32 +02:00
Avi Kivity
9adfd11374 Merge "Avoid including cryptopp headers" from Rafael
"
cryptopp's config.h has the following pragma:

 #pragma GCC diagnostic ignored "-Wunused-function"

It is not wrapped in a push/pop. Because of that, including cryptopp
headers disables that warning on scylla code too.

This patch series introduces a single .cc file that has to include
cryptopp headers.
"

* 'avoid-cryptopp-v3' of https://github.com/espindola/scylla:
  Avoid including cryptopp headers
  Delete dead code
2019-02-21 10:31:20 +02:00
Rafael Ávila de Espíndola
fd5ea2df5a Avoid including cryptopp headers
cryptopp's config.h has the following pragma:

 #pragma GCC diagnostic ignored "-Wunused-function"

It is not wrapped in a push/pop. Because of that, including cryptopp
headers disables that warning on scylla code too.

The issue has been reported as
https://github.com/weidai11/cryptopp/issues/793

To work around it, this patch uses a pimpl to have a single .cc file
that has to include cryptopp headers.

While at it, it also reduces the differences and code duplication
between the md5 and sha1 hashers.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-20 08:03:46 -08:00
Rafael Ávila de Espíndola
a309f952d2 Delete dead code
This code would have to be refactored by the next patch. Since it is
commented out, just delete it.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-20 08:03:46 -08:00
Duarte Nunes
4354479985 Merge 'Minimize generated view updates for unselected column updates' from Piotr
"
This series addresses the issue of redundant view updates,
generated for columns that were not selected for given materialized view.
Cases covered (quote:)
* If a base row has a live row marker, then we can avoid generating
  view updates if only unselected columns change;
* If a base row has no live row marker, then we can avoid generating
  view updates if unselected columns are updated, unless they are newly
  created, deleted, or they have a TTL.

Additionally, this series includes caching selected columns and is_index information
to avoid unnecessary CPU cycles spent on recomputing these two.

Fixes #3819
"

* 'send_less_view_updates_if_not_necessary_4' of https://github.com/psarna/scylla:
  tests: add cases for view update generation optimizations
  view: minimize generated view updates for unselected columns
  view: cache is_index for view pointer
  index: make non-pointer overload of is_index function
  index: avoid copying when checking for is_index
2019-02-20 13:24:44 +00:00
Piotr Sarna
563456e3ac tests: add cases for view update generation optimizations
Test cases that cover avoiding generating view updates
when not necessary (e.g. when a column not selected by the view
is modified) are added.
2019-02-20 14:05:29 +01:00
Piotr Sarna
bd52e05ae2 view: minimize generated view updates for unselected columns
In some cases generating view updates for columns that were not
selected in CREATE VIEW statement is redundant - it is the case
when the update will not influence row liveness in any way.
Currently, these cases are optimized out:
 - row marker is live and only unselected columns were updated;
 - row marker is not live and only unselected columns were updated,
   and in the process nothing was created or deleted and there was
   no TTL involved;
2019-02-20 14:05:27 +01:00
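The two cases listed in bd52e05ae2 amount to a small decision rule. A sketch of that rule as a Python predicate, with hypothetical parameter names and no claim to mirror the actual view update code:

```python
def can_skip_view_update(marker_live, only_unselected_changed,
                         created_or_deleted=False, ttl_involved=False):
    """Return True when a base-table update needs no view update."""
    if not only_unselected_changed:
        # A selected column changed, so the view must be updated.
        return False
    if marker_live:
        # Live row marker: unselected columns cannot affect row liveness.
        return True
    # No live row marker: skip only if nothing was created or deleted
    # and no TTL was involved.
    return not created_or_deleted and not ttl_involved
```

For example, can_skip_view_update(False, True, ttl_involved=True) is False, because without a live row marker an unselected column with a TTL can still change whether the view row is alive.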
Piotr Sarna
dbe8491655 view: cache is_index for view pointer
It's detrimental to keep querying index manager whether a view
is backing a secondary index every time, so this value is cached
at construct time.
At the same time, this value is not simply passed to view_info
when being created in secondary index manager, in order to
decouple materialized view logic from secondary indexes as much as
possible (the sole existence of is_index() is bad enough).
2019-02-20 12:52:32 +01:00
Piotr Sarna
cb20fc2e4f index: make non-pointer overload of is_index function
Previous interface enforced passing a shared pointer, which
might result in calling unneeded shared_from_this().
2019-02-20 12:52:32 +01:00
Piotr Sarna
94db098d39 index: avoid copying when checking for is_index
Previously is_index implementation used list_indexes() helper function,
which copies data.
2019-02-20 12:52:32 +01:00
Tomasz Grabiec
a8c74bc7ab gdb: Print LSA/Cache/Memtable memory usage from "scylla memory"
Example output:

LSA:
  allocated:     181010432
  used:          177209344
  free:            3801088

Cache:
  total:          97255424
  used:           60700600
  free:           36554824

Memtables:
 total:            83755008
 Regular:
  real dirty:      79429632
  virt dirty:      35168426
 System:
  real dirty:        524288
  virt dirty:        466764
 Streaming:
  real dirty:             0
  virt dirty:             0

Message-Id: <1550598424-23428-1-git-send-email-tgrabiec@scylladb.com>
2019-02-20 12:53:53 +02:00
Tomasz Grabiec
dafe22dd83 lsa: Fix spurios abort with --enable-abort-on-lsa-bad-alloc
allocate_segment() can fail even though we're not out of memory, when
it's invoked inside an allocating section with the cache region
locked. That section may later succeed when retried after memory
reclamation.

We should ignore bad_alloc thrown inside allocating section body and
fail only when the whole section fails.

Fixes #2924

Message-Id: <1550597493-22500-1-git-send-email-tgrabiec@scylladb.com>
2019-02-20 12:53:49 +02:00