Since we cannot use dh --with=systemd, because we don't want to
enable systemd units automatically and instead manage them from our
setup scripts, we have to run 'systemctl daemon-reload' manually.
(With dh --with=systemd, the systemd helper automatically provides such
scripts.)
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190618000210.28972-1-syuu@scylladb.com>
Running tests in debug mode takes 25:22.08 on my machine. Using
sanitize instead takes that down to 10:46.39.
The mode is opt-in, in that it must be explicitly selected with
"configure.py --mode=sanitize" or "ninja sanitize". It must also be
explicitly passed to test.py.
Unfortunately building with asan, optimizations and debug info is
very slow and there is nothing like -gline-tables-only in gcc.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190617170007.44117-1-espindola@scylladb.com>
This patch set fixes repair nodes using different schema versions and
optimizes the hashing, thanks to the fact that all nodes now use the same
schema version.
Fixes: #4549
* seastar-dev.git asias/repair_use_same_schema.v3:
repair: Use the same schema version for repair master and followers
repair: Hash column kind and id instead of column name and type name
Repair nodes are now guaranteed to use the same schema, so it is faster
to hash column kind and id.
Changing the hashing of mutation fragments causes incompatibility in
mixed clusters. Let's backport to the 3.1 release, which includes row
level repair for the first time and is not released yet.
Refs: #4549
Backports: 3.1
Before this patch, the repair master and followers each use their own
schema version as of the point repair starts. The schemas can be
different due to a schema change. Repair uses the schema to serialize
mutation_fragment and deserialize the mutation_fragment received from
peer nodes. Using different schema versions to serialize and deserialize
causes undefined behaviour.
To fix, we use the schema the repair master decides on for all the repair
nodes involved.
On top of this patch, we could take another step to make sure all nodes
have the latest schema. But let's do it in a separate patch.
Fixes: #4549
Backports: 3.1
The intention is just to document what is currently done. If someone
wants to propose changes, that can be done after the current practices
have been documented.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190524135109.29436-1-espindola@scylladb.com>
The code that decides whether a query should use indexing was buggy - a partition key index might have influenced the decision even if the whole partition key was passed in the query (which effectively means that indexing it is not necessary).
Fixes #4539
Closes https://github.com/scylladb/scylla/pull/4544
Merged from branch 'fix_deciding_whether_a_query_uses_indexing' of git://github.com/psarna/scylla
tests: add case for partition key index and filtering
cql3: fix deciding if a query uses indexing
Currently the NIC selection prompt in scylla_setup just proceeds with
setup when the user presses only the Enter key at the prompt.
The prompt should ask for the NIC name again until the user inputs a
correct NIC name.
Fixes #4517
Message-Id: <20190617124925.11559-1-syuu@scylladb.com>
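The re-prompting behaviour described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual scylla_setup code: the function name prompt_for_nic, the prompt text, and the injectable read_line parameter are all hypothetical, and the list of valid NIC names is passed in rather than discovered from the system.

```python
def prompt_for_nic(valid_nics, read_line=input):
    """Keep asking until the user enters one of valid_nics.

    read_line is injectable (hypothetical parameter) so the loop can be
    exercised without a terminal.
    """
    while True:
        answer = read_line("Please select a NIC from the list: ").strip()
        if answer in valid_nics:
            return answer
        # Empty input (just Enter) or an unknown name: ask again
        # instead of silently proceeding with setup.
        print("'%s' is not a valid NIC name, please try again." % answer)
```

With this shape, pressing Enter on an empty line simply causes another prompt rather than continuing.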
"
These patches fix remaining issues with gcc9 build, that involve a gcc9 bug, a gcc9 bug, and a stricter warning.
Tests: unit(debug, dev, release).
"
* 'fix-gcc9-build' of https://github.com/pdziepak/scylla:
dht/ring_position: silence complaints about uninitialised _token_bound
xx_hasher: disable -Warray-bounds
api/column_family: work around gcc9 bug in seastar::future<std::any>
Currently, calling unfreeze() using the wrong version of the schema
results in undefined behavior. That can cause hard-to-debug
problems. Better to throw in such cases.
Refs #4549.
Tests:
- unit (dev)
Message-Id: <1560459022-23786-1-git-send-email-tgrabiec@scylladb.com>
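As a rough illustration of "throw instead of undefined behavior", the sketch below tags a frozen value with the schema version used to freeze it and refuses to unfreeze with any other version. It is written in Python for brevity and is not Scylla's C++ implementation; Schema, FrozenMutation, SchemaMismatchError and the field names are hypothetical.

```python
from dataclasses import dataclass


class SchemaMismatchError(Exception):
    """Raised when unfreeze() is given a schema other than the one used to freeze."""


@dataclass(frozen=True)
class Schema:
    version: str
    columns: tuple


@dataclass(frozen=True)
class FrozenMutation:
    schema_version: str
    values: tuple


def freeze(schema, row):
    # Values are stored positionally, so they only make sense together
    # with the exact schema version that produced them.
    return FrozenMutation(schema.version, tuple(row[c] for c in schema.columns))


def unfreeze(frozen, schema):
    if frozen.schema_version != schema.version:
        # Fail loudly rather than misinterpreting the stored values.
        raise SchemaMismatchError(
            "frozen with %s, unfreeze attempted with %s"
            % (frozen.schema_version, schema.version))
    return dict(zip(schema.columns, frozen.values))
```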
In release mode gcc9 has a false positive warning about out-of-bounds
access in the xxhash implementation:
./xxHash/xxhash.c:799:27: error: array subscript -3 is outside array bounds of ‘long unsigned int [1]’ [-Werror=array-bounds]
This is solved by disabling -Warray-bounds in the xxhash code.
There is a gcc9 bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90415
that makes it impossible to pass std::any through a seastar::future<T>.
Fortunately, there is only one user of seastar::future<std::any> in
Scylla and it is not performance-critical. This patch avoids the gcc9
bug by using seastar::future<std::unique_ptr<std::any>>.
We saw a node crashing today with nodetool clearsnapshot being called.
After investigation, the reason is that nodetool clearsnapshot was called
at the same time a new snapshot was created with the same tag. nodetool
clearsnapshot can't delete all files in the directory, because new files
had by then been created in that directory, and crashes on I/O error.
There are many problems with allowing those operations to proceed in
parallel. Even if we fix the code not to crash and return an error on
directory non-empty, the moment they do any amount of work in parallel
the result of the operation becomes undefined. Some files in the
snapshot may have been deleted by clear, for example, and a user may
then not be able to properly restore from the backup if this snapshot
was used to generate a backup.
Moreover, although we could lock at the granularity of a keyspace or
column family, I think we should use a big hammer here and lock the
entire snapshot creation/deletion to avoid surprises (for example, if a
user requests creation of a snapshot for all keyspaces, and another
process requests clear of a single keyspace)
Fixes #4554
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190614174438.9002-1-glauber@scylladb.com>
In practice, we always want to use the same sanitizer flags with
seastar and scylla. Seastar was already marking its sanitizer flags
public, so what was missing was exporting the link flags via pkgconfig
and dropping the duplicates from scylla.
I am doing this after wasting some time editing the wrong file.
This depends on the seastar patch to export the sanitizer flags in
pkgconfig.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
We used to use /opt/scylladb just for the Scylla build toolchain and
dependency libraries, not for the Scylla main package.
But since we merged the relocatable package, the Scylla main binary and
dependency libraries are all located under /opt/scylladb; only the
setup scripts remained in /usr/lib/scylla.
It is strange to keep using both /usr/lib/<app name> and /opt/<app name>;
we should merge them into a single place.
Message-Id: <20190614011038.17827-1-syuu@scylladb.com>
Before this patch, the mc sstables writer was ignoring
empty cellpaths. This is wrong behaviour because
it is possible to have an empty key in a map. In such a case,
our writer creates a wrong sstable that we can't read back.
This is because a complex cell expects a cellpath for each
simple cell it has. When the writer ignores an empty cellpath
it writes nothing, whereas it should write a length
of zero to the file so that we know there's an empty cellpath.
Fixes #4533
Tests: unit(release)
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <46242906c691a56a915ca5994b36baf87ee633b7.1560532790.git.piotr@scylladb.com>
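The point about always writing the length can be shown with a small length-prefixed encoding. This is only a sketch in Python: the fixed 2-byte big-endian length prefix and the helper names are assumptions made for illustration and do not reproduce the actual mc sstable encoding.

```python
import struct


def write_cell_path(out, path):
    # Always emit the length prefix, even when the path is empty;
    # skipping it would leave the reader with no cellpath to pair
    # with the simple cell that follows.
    out += struct.pack(">H", len(path))
    out += path


def read_cell_path(buf, offset):
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    return buf[start:start + length], start + length


def write_cell_paths(paths):
    out = bytearray()
    for path in paths:
        write_cell_path(out, path)
    return bytes(out)


def read_cell_paths(buf, count):
    paths, offset = [], 0
    for _ in range(count):
        path, offset = read_cell_path(buf, offset)
        paths.append(path)
    return paths
```

An empty map key then round-trips as a zero length followed by no bytes, instead of disappearing from the file.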
Consider
master: row(pk=1, ck=1, col=10)
follower1: row(pk=1, ck=1, col=20)
follower2: row(pk=1, ck=1, col=30)
When repair runs, master fetches row(pk=1, ck=1, col=20) and row(pk=1,
ck=1, col=30) from follower1 and follower2.
Then the repair master sends row(pk=1, ck=1, col=10) and row(pk=1, ck=1,
col=30) to follower1. follower1 will write the row with the same
pk=1, ck=1 twice, which violates uniqueness constraints.
To fix, we apply the row with the same pk and ck into the previous row.
We only need this on the repair follower because the rows can come from
multiple nodes. On the repair master, by contrast, we have an sstable
writer per follower, so the rows fed into an sstable writer can come
from only a single node.
Tests: repair_additional_test.py:RepairAdditionalTest.repair_same_row_diff_value_3nodes_test
Fixes: #4510
Message-Id: <cb4fbba1e10fb0018116ffe5649c0870cda34575.1560405722.git.asias@scylladb.com>
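The idea of applying a row into the previous row when pk and ck match can be sketched as below. This is a simplified Python model, not the repair code itself: rows are plain tuples, and the per-cell timestamps with a "newest timestamp wins" rule are an assumption added here so the merge has a well-defined result.

```python
def merge_cells(previous, incoming):
    """Merge two {column: (timestamp, value)} dicts, newest timestamp wins (assumed rule)."""
    merged = dict(previous)
    for column, (timestamp, value) in incoming.items():
        if column not in merged or timestamp > merged[column][0]:
            merged[column] = (timestamp, value)
    return merged


def apply_rows(rows):
    """rows: iterable of (pk, ck, cells) with equal (pk, ck) pairs adjacent.

    Returns a list in which every (pk, ck) appears at most once.
    """
    out = []
    for pk, ck, cells in rows:
        if out and out[-1][0] == pk and out[-1][1] == ck:
            # Same key as the previous row: fold it in instead of
            # writing a second row with the same key.
            out[-1] = (pk, ck, merge_cells(out[-1][2], cells))
        else:
            out.append((pk, ck, dict(cells)))
    return out
```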
On the repair follower node, only decorated_key_with_hash and the
mutation_fragment inside repair_row are used in apply_rows() to apply
the rows to disk. Allow repair_row to be partially initialized and, to be
safe, throw if an uninitialized member is accessed.
Message-Id: <b4e5cc050c11b1bafcf997076a3e32f20d059045.1560405722.git.asias@scylladb.com>
To provide test reproducibility use the seastar local_random_engine.
To reproduce a run, use the --random-seed command line option
with the seed printed accordingly.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190613114307.31038-1-bhalevy@scylladb.com>
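The reproducibility pattern (print the seed, accept it back on a later run) looks roughly like this in Python. It is an analogy using the standard random module, not the Seastar testing API; make_engine and the printed format are hypothetical.

```python
import random


def make_engine(seed=None):
    """Return (engine, seed); pick and print a fresh seed when none is given."""
    if seed is None:
        seed = random.SystemRandom().getrandbits(32)
    # Printing the seed is what makes a failing run reproducible later.
    print("random-seed=%d" % seed)
    return random.Random(seed), seed
```

Passing the printed seed back into make_engine yields the same sequence of random values as the original run.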
When a column is not present in the select clause, but used for
filtering, it usually needs to be fetched from replicas.
Sometimes it can be avoided, e.g. if primary key columns form a valid
prefix - then, they will be optimized out before filtering itself.
However, clustering key prefix can only be qualified for this
optimization if the whole partition key is restricted - otherwise
the clustering columns still need to be present for filtering.
This commit also fixes tests in cql_query_test suite, because they now
expect more values - columns fetched for filtering will be present as
well (only internally, the clients receive only data they asked for).
Fixes #4541
Message-Id: <f08ebae5562d570ece2bb7ee6c84e647345dfe48.1560410018.git.sarna@scylladb.com>
The relocation of python scripts mentions scylla-server in paths explicitly.
It should use {{product}} instead. The current build fails when
{{product}} is different from scylla-server.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190613012518.28784-1-glauber@scylladb.com>
On branch-3.1 / master, we are getting the following error:
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/data: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/commitlog: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/view_hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
It seems that owner verification of the data directory fails because
the scylla-server process is running as root but the data directory is
owned by scylla, so we should run services as the scylla user.
Fixes #4536
Message-Id: <20190611113142.23599-1-syuu@scylladb.com>
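The kind of check that produces the errors above can be approximated as follows. This is a Python sketch, not Scylla's C++ verification code; verify_owner and its euid and stat parameters are hypothetical and exist only so the check can be tried without changing users.

```python
import os


def verify_owner(paths, euid=None, stat=os.stat):
    """Raise RuntimeError if any path is not owned by the effective uid."""
    if euid is None:
        euid = os.geteuid()
    for path in paths:
        owner = stat(path).st_uid
        if owner != euid:
            # Mirrors the wording of the log lines quoted above.
            raise RuntimeError(
                "File not owned by current euid: %d. Owner is: %d" % (euid, owner))
```

Running the server as root (euid 0) against directories owned by uid 999 fails this check, which is why the services need to run as the scylla user.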
To avoid a 'Bad permissions' error when the user has changed the default
umask, we need to verify that the system umask is acceptable for scylla-server.
Fixes #4157
Message-Id: <20190612130343.6043-1-syuu@scylladb.com>
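A umask check of this kind might look like the sketch below. The commit message does not say which umask values are considered acceptable, so the policy here (the owner must keep read, write and execute) is purely an assumption for illustration, as are the function names.

```python
import os


def current_umask():
    # os.umask() sets a new mask and returns the old one, so the only
    # way to read the mask is to set it and immediately restore it.
    old = os.umask(0)
    os.umask(old)
    return old


def umask_is_acceptable(mask):
    # Hypothetical policy: none of the owner permission bits may be
    # masked out, otherwise newly created files would be unusable
    # by the process that created them.
    return (mask & 0o700) == 0
```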
* seastar 253d6cb...ded50bd (14):
> Only export sanitizer flags if used
> perftune.py: use pyudev.Devices methods instead of deprecated pyudev.Device ones
> Add a Sanitize build mode
> Merge "perftune.py : new tuning modes" from Vlad
> reactor: clarify how submit_to() destroys the function object
> Export the sanitizer flags via pkgconfig
> smp: Delete unprocessed work items
> iotune: fixed finding mountpoint infinite loop
> net: Fix dereferencing moved object
> Always enable the exception scalability hack
> Merge "Simple cleanups in future.hh" from Rafael
> tests: introduce testing::local_random_engine
> core/deleter: Fix abort when append() is called twice with a shared deleter
> rpc stream: do not crash if a stream is used after eos
Currently, the REPAIR_GET_COMBINED_ROW_HASH RPC verb returns only the
repair_hash object. In the future, we will use a set reconciliation
algorithm to decode the full row hashes in the working row buf. It is
useful to return the number of rows inside the working row buf in
addition to the combined row hashes to make sure the decode is successful.
It is also better to use a wrapper class for the verb response so we can
extend the return values later more easily with IDL.
Fixes #4526
Message-Id: <93be47920b523f07179ee17e418760015a142990.1559771344.git.asias@scylladb.com>
With this patch, when using asan, we poison segment memory that has
been allocated from the system but should not be accessible to user
code.
This should help with debugging use-after-free bugs.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190607140313.5988-1-espindola@scylladb.com>
The code that decides whether a query should use indexing
was buggy - a partition key index might have influenced the decision
even if the whole partition key was passed in the query (which
effectively means that indexing it is not necessary).
Fixes #4539
Introduced in 513d01d53e
The script is trying to determine the branch to shallow clone
when an rpm is missing and has to be built.
This functionality in the current implementation assumes it is being run inside
a git repository, but that need not be the case if the script is triggered after
local rpms were placed in the local directory.
This happens when putting all necessary rpm files in: dist/ami/files
And then running: dist/ami/build_ami.sh --localrpm
The dist/ami/ and dist/ami/files are the only ones required for this action so
querying the git repository in that situation makes no sense.
Fixes #4535
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190611112455.13862-1-bhalevy@scylladb.com>
Build progress virtual reader uses Scylla-specific
scylla_views_builds_in_progress table in order to represent legacy
views_builds_in_progress rows. The Scylla-specific table contains
additional cpu_id clustering key part, which is trimmed before
returning it to the user. That may cause duplicated clustering row
fragments to be emitted by the reader, which may cause undefined
behaviour in consumers. The solution is to keep track of previous
clustering keys for each partition and drop fragments that would cause
duplication. That way if any shard is still building a view, its
progress will be returned, and if many shards are still building, the
returned value will indicate the progress of a single arbitrary shard.
Fixes #4524
Tests:
unit(dev) + custom monotonicity checks from tgrabiec@scylladb.com
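The "remember the previous key and drop duplicates" approach can be sketched as follows. This is a simplified Python model of the idea, not the virtual reader itself: rows are tuples whose clustering key ends with cpu_id, the input is assumed to be sorted by partition and full clustering key, and the names used are hypothetical.

```python
def trim_and_dedup(rows):
    """rows: iterable of (partition_key, clustering_key, value), where
    clustering_key is a tuple whose last component is cpu_id and the
    input is sorted by (partition_key, clustering_key).

    Yields rows with cpu_id trimmed, keeping only the first row for
    each resulting (partition_key, clustering_key).
    """
    previous = None
    for partition_key, clustering_key, value in rows:
        legacy_key = clustering_key[:-1]  # drop the trailing cpu_id
        current = (partition_key, legacy_key)
        if current == previous:
            # Another shard's row for the same legacy key: emitting it
            # would produce a duplicated clustering row.
            continue
        previous = current
        yield partition_key, legacy_key, value
```

Because the input is sorted, rows that collapse to the same legacy key are adjacent, so tracking a single previous key is enough in this model.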
All Scylla code is written with "using namespace seastar", i.e., no
"seastar::" prefix for Seastar symbols. Document this in the coding style.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190610203948.18075-1-nyh@scylladb.com>
Fixes #4525
req_param uses boost::lexical_cast to convert text->var.
However, lexical_cast does not handle textual booleans,
thus param=true causes not only wrong values, but
exceptions.
Message-Id: <20190610140511.15478-1-calle@scylladb.com>
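Handling textual booleans explicitly, rather than relying on a generic text-to-value conversion, could look like this. The sketch is in Python and is not the req_param code; parse_bool and the exact set of accepted spellings are assumptions.

```python
def parse_bool(text):
    """Convert a request parameter such as 'true' or '0' to a bool."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    # Reject anything else explicitly so that callers get a clear error.
    raise ValueError("not a boolean: %r" % text)
```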
Currently, each shard protects itself by not reading from rpc and the native
transport if in-flight requests consume too much memory for that shard. However,
if all shards then forward their requests to some other shard, then that shard
can easily run out of memory since its load can be multiplied by the number of
shards that send it requests.
To protect against this, use the new Seastar smp_service_group infrastructure.
We create three groups: read, write, and write ack (the latter is needed to
avoid ABBA deadlocks if shard A exhausts all its resources sending writes to shard B,
and shard B simultaneously does the same; neither will be able to send
acknowledgements, so if the writes are throttled, they will never be unthrottled
until a timeout occurs).
Range scans are not addressed by this patch since they are handled by
multishard_mutation_query, which has its own complex cross-shard communication
scheme, but a similar solution could apply there.
Ref #1105 (missing range scan protection)
Tests: unit (dev)
Message-Id: <20190512142243.17795-1-avi@scylladb.com>
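The reason for giving acknowledgements their own group can be shown with a loose analogy in Python asyncio. This is not Seastar's smp_service_group API: ServiceGroup, its limit of one in-flight call, and the demo scenario are all hypothetical and only model "each class of work has its own budget".

```python
import asyncio


class ServiceGroup:
    """Caps the number of in-flight calls for one class of work."""

    def __init__(self, name, max_in_flight):
        self.name = name
        self._slots = asyncio.Semaphore(max_in_flight)

    async def submit(self, func, *args):
        async with self._slots:
            return await func(*args)


async def demo():
    write_group = ServiceGroup("write", max_in_flight=1)
    ack_group = ServiceGroup("write_ack", max_in_flight=1)
    acked = asyncio.Event()

    async def send_ack():
        acked.set()

    async def write():
        # The write holds its slot until it has been acknowledged.
        # The ack goes through its own group, so it is not blocked
        # by the exhausted write budget.
        ack_task = asyncio.create_task(ack_group.submit(send_ack))
        await asyncio.wait_for(acked.wait(), timeout=1.0)
        await ack_task
        return "written"

    return await write_group.submit(write)
```

If send_ack were submitted through write_group instead, it would wait for the slot that the write itself is holding, and in this model the write would only finish by timing out.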