test_fast_forwarding_across_partitions_to_empty_range uses an uninitialized
string to populate an sstable, but this can be invalid utf-8 so that sstable
cannot be sstabledumped.
Make it valid by using make_random_string().
Fixes#4040.
Message-Id: <20190107193240.14409-1-avi@scylladb.com>
In c++17 there are standard ways of requesting aligned memory, so
seastar doesn't need to provide one.
This patch is in preparation for removing with_alignment from seastar.
Tests: unit (debug)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190107191019.22295-1-espindola@scylladb.com>
The current timeout is way too small for debug builds. Currently
jenkins runs avoid the problem by increasing the timeout by 100x. This
patch increases it by 10x, with seems to be sufficient to run the
tests in most desktop machines.
Message-Id: <20190107191413.22531-1-espindola@scylladb.com>
Non-privileged user may not belongs to "wheel" group, for example Debian
variants uses "sudo" group instead of "wheel".
To make sudo able to work on all environment we should allow sudo for
"ALL" instead of "wheel".
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190107173410.23140-1-syuu@scylladb.com>
When building something other than Scylla (like scylla-tools-java or scylla-jmx)
it is convenient to run it from some other directory. To do that, allow running
dbuild from any directory (so we locate tools/toolchain/image relative to the
dbuild script rather than use a fixed path) and mount the current directory
since it's likely the user will want to access files there.
Message-Id: <20190107165824.25164-1-avi@scylladb.com>
Start a new document with an overview of isolation in Scylla, i.e.,
scheduling groups, I/O priority classes, controllers, etc.
As all documents in docs/, this is a document for developers (not users!)
who need to understand how isolation between different pieces of Scylla
(e.g., queries, compaction, repair, etc.) works, which scheduling groups
and I/O classes we have and why, etc.
The document is still very partial and includes a lot of TODOs on
places where the explanation needs to be expanded. In particular it
needs an accurate explanation (and not just a name) of what kind of
work is done under each of the groups and classes, and an explanation
of how we set up RPC to use which scheduling groups for the code it
executes.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190103183232.21348-1-nyh@scylladb.com>
While we keep ordinary hints in a directory parallel to the data directory,
we decided to keep the materialized view hints in a subdirectory of the data
directory, named "view_pending_updates". But during boot, we expect all
subdirectories of data/ to be keyspace names, and when we notice this one,
we print a warning:
WARN: database - Skipping undefined keyspace: view_pending_updates
This spurious warning annoyed users. But moreover, we could have bigger
problems if the user actually tries to create a keyspace with that name.
So in this patch, we move the view hints to a separate top-level directory,
which defaults to /var/lib/scylla/view_hints, but as usual can be configured.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190107142257.16342-1-nyh@scylladb.com>
"Add features that are useful for continuous integration pipelines (and
also ordinary developers):
- sudo support, with and without a tty, as our packaging scripts require it
- install ccache package to allow reducing incremental build times
- dependencies needed to build scylla-jmx and scylla-tools-java"
* tag 'toolchain-ci/v1' of https://github.com/avikivity/scylla:
tools: toolchain: update image for ant, maven, ccache, sudo
tools: toolchain: dbuild: pass-through supplementary groups
tools: toolchain: defeat PAM
tools: toolchain: improve sudo support
tools: toolchain: break long line in dbuild
tools: toolchain: prepare sudoers file
tools: toolchain: install ccache
install-dependencies.sh: add maven and ant
"This patchset reduces inclusions of database.hh, particularly in header
files. It reduces the number of objects depending on database.hh from 166
to 116.
Tests: unit(release), playing a little with tracing"
* tag 'database.hh/v1' of https://github.com/avikivity/scylla:
streaming: stream_session: remove include of db/view/view_update_from_staging_generator.hh
sstables: writer.hh: add some forward declarations
table_helper: remove database.hh include
table_helper: de-inline insert() and setup_keyspace()
table_helper: de-template setup_keyspace()
table_helper: simplify template body of table_helper::insert()
schema_tables: remove #include of database.hh
cql_type_parser: remove dependency on user_types_metadata
thrift: add missing include of sleep.hh
cql3: ks_prop_defs: remove #include "database.hh"
To be able to verify the golden version with sstabledump.
These files were generated by running sstable_3_x_test and keeping its
generated output files.
Refs #4043
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190103112511.23488-2-bhalevy@scylladb.com>
Various small improvements to docs/logging.md:
1. Describe the options to log to stdout or syslog and their defaults.
2. Mention the possibility of using nodetool instead of REST API.
3. Additional small tweaks to formatting.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190106111851.26700-1-nyh@scylladb.com>
Add a new document about logging in Scylla, and how to change the log levels
when running Scylla and during the run.
It needs more developer-oriented information (e.g., how to create new logger
subsystems in the code) but I think it's a good start.
Some of the text is based on Glauber's writeup for the Scylla website on
changing log levels at runtime.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190106103606.26032-1-nyh@scylladb.com>
This header, which is easily replaced with a forward declaration,
introduces a dependency on database.hh everywhere. Remove it and scatter
includes of database.hh in source files that really need it.
Move most of the body into a non-template overload to reduce dependencies
in the header (and template bloat). The function is not on any fast path,
and noncopyable_function will likely not even allocate anything.
A default parameter of type T (or lw_shared_ptr<T>) requires that T be
defined. Remove the depndency by redefining the default parameter
as an overload, for T = user_types_metadata.
The workload in #3844 has these characteristics:
- very small data set size (a few gigabytes per shard)
- large working set size (all the data, enough for high cache miss rate)
- high overwrite rate (so a compaction results in 12X data reduction)
As a result, the compaction backlog controller assigns very few shares to
compaction (low data set size -> low backlog), so compaction proceeds very slowly.
Meanwhile, we have tons of cache misses, and each cache miss needs to read from a
large number of sstables (since compaction isn't progressing). The end result is
a high read amplification, and in this test, timeouts.
While we could declare that the scenario is very artificial, there are other
real-world scenarios that could trigger it. Consider a 100% write load
(population phase) followed by 100% read. Towards the end of the last compaction,
the backlog will drop more and more until compaction slows to a crawl, and until
it completes, all the data (for that compaction) will have to be read from its
input sstables, resulting in read amplification.
We should probably have read amplification affect the backlog, but for now the
simpler solution is to increase the minimum shares to 50 so that compaction
always makes forward progress. This will result in higher-than-needed compaction
bandwidth in some low write rate scenarios so we will see fluctuations in request
rate (what the controller was designed to avoid), but these fluctioations will be
limited to 5%.
Since the base class backlog_controller has a fixed (0, 0) point, remove it
and add it to derived classes (setting it to (0, 50) for compaction).
Fixes#3844 (or at least improves it).
Message-Id: <20181231162710.29410-1-avi@scylladb.com>
Prevent PAM from enforcing security and preventing sudo from working. This is
done by replacing the default configuration (designed for workstations) to
one that uses pam_permit for everything.
Add tools needed to build scylla-jmx and scylla-tools-java. While
not requirements of this repository, it's nicer if a single setup
can be used to build and run everything.
We also install pystache as it's used by packaging scripts.
The current implementation breaks the invariant that
_size_bytes = reduce(_fragments, &temporary_buffer::size)
In particular, this breaks algorithms that check the individual
segment size.
Correctly implement remove_suffix() by destroying superfluous
temporary_buffer's and by trimming the last one, if needed.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190103133523.34937-1-duarte@scylladb.com>
Simplify the fix for memory based eviction, introduced by 918d255 so
there is no need to massage the counters.
Also add a check to `test_memory_based_cache_eviction` which checks for
the bug fixed. While at it also add a check to
`test_time_based_cache_eviction` for the fix to time based eviction
(e5a0ea3).
Tests: tests/querier_cache:debug
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <c89e2788a88c2a701a2c39f377328e77ac01e3ef.1546515465.git.bdenes@scylladb.com>
"
This series adds staging SSTables support to row level repair.
It was introduced for streaming sessions before, but since row level
repair doesn't leverage sessions at all, it's added separately.
Tests:
unit (release)
dtest (repair_additional_test.py:RepairAdditionalTest,
excluding repair_abort_test, which fails for me locally on master)
"
* 'add_staging_sstables_generation_to_row_level_repair_2' of https://github.com/psarna/scylla:
repair: add staging sstables support to row level repair
main,repair: add params to row level repair init
streaming,view: move view update checks to separate file
When a node is bootstrapping, it will receive data from other nodes
via streaming, including materialized views. Regardless whether these
views are built on other nodes or not, building them on newly
bootstrapped nodes has no effect - updates were either already streamed
completely (if view building have finished) or will be propagated
via view building, if the process is still ongoing.
So, marking all views as 'built' for the bootstrapped node prevents it
from spawning superfluous view building processes.
Fixes#4001
Message-Id: <fd53692c38d944122d1b1013fdb0aedf517fa409.1546498861.git.sarna@scylladb.com>
Currently queriers evicted due to their TTL expiring are not
unregistered from the `reader_concurrency_semaphore`. This can cause a
use-after-free when the semaphore tries to evict the same querier at
some later point in time, as the querier entry it has a pointer to is
now invalid.
Fix by unregistering the querier from the semaphore before destroying
the entry.
Refs: #4018
Refs: #4031
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <4adfd09f5af8a12d73c29d59407a789324cd3d01.1546504034.git.bdenes@scylladb.com>