The relocation of Python scripts mentions scylla-server in paths explicitly.
It should use {{product}} instead. The current build fails when
{{product}} is anything other than scylla-server.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190613012518.28784-1-glauber@scylladb.com>
On branch-3.1 / master, we are getting the following error:
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/data: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/commitlog: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/view_hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
It seems that owner verification of the data directories fails because
the scylla-server process is running as root while the data directories
are owned by scylla, so we should run the services as the scylla user.
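The failing check can be illustrated with a minimal Python sketch. This is not the server's actual implementation; the function name `verify_owner` and its `expected_uid` parameter are hypothetical, and only the wording of the error message is taken from the log above.

```python
import os


def verify_owner(path, expected_uid=None):
    """Raise RuntimeError if `path` is not owned by `expected_uid`.

    `expected_uid` defaults to the effective uid of the current process,
    which is the comparison the log messages above describe.
    """
    if expected_uid is None:
        expected_uid = os.geteuid()
    owner = os.stat(path).st_uid
    if owner != expected_uid:
        raise RuntimeError(
            "File not owned by current euid: {}. Owner is: {}".format(
                expected_uid, owner))
```

With the process running as root (euid 0) and the directory owned by uid 999, such a check raises; running the service as the owning user makes both values equal.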
Fixes #4536
Message-Id: <20190611113142.23599-1-syuu@scylladb.com>
To avoid a 'Bad permissions' error when the user has changed the default
umask, we need to verify that the system umask is acceptable for scylla-server.
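A umask check of this kind could look roughly like the sketch below. The policy in `umask_is_acceptable` (the umask must not strip any owner permission bits) is an assumption made for illustration; the commit message does not state which umask values are accepted.

```python
import os
import stat


def current_umask():
    """Return the process umask without changing it permanently."""
    mask = os.umask(0)  # os.umask() sets a new mask and returns the old one
    os.umask(mask)      # restore the original mask immediately
    return mask


def umask_is_acceptable(mask):
    """Assumed policy: the umask must not remove owner read/write/execute."""
    return (mask & stat.S_IRWXU) == 0
```

Under this assumed policy the common default `0o022` passes, while a mask such as `0o277` is rejected because it removes the owner's write and execute bits.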
Fixes #4157
Message-Id: <20190612130343.6043-1-syuu@scylladb.com>
* seastar 253d6cb...ded50bd (14):
> Only export sanitizer flags if used
> perftune.py: use pyudev.Devices methods instead of deprecated pyudev.Device ones
> Add a Sanitize build mode
> Merge "perftune.py : new tuning modes" from Vlad
> reactor: clarify how submit_to() destroys the function object
> Export the sanitizer flags via pkgconfig
> smp: Delete unprocessed work items
> iotune: fixed finding mountpoint infinite loop
> net: Fix dereferencing moved object
> Always enable the exception scalability hack
> Merge "Simple cleanups in future.hh" from Rafael
> tests: introduce testing::local_random_engine
> core/deleter: Fix abort when append() is called twice with a shared deleter
> rpc stream: do not crash if a stream is used after eos
Currently, the REPAIR_GET_COMBINED_ROW_HASH RPC verb returns only the
repair_hash object. In the future, we will use a set reconciliation
algorithm to decode the full row hashes in the working row buf. It is useful
to return the number of rows inside the working row buf in addition to the
combined row hashes to make sure the decode is successful.
It is also better to use a wrapper class for the verb response so we can
extend the return values later more easily with IDL.
Fixes #4526
Message-Id: <93be47920b523f07179ee17e418760015a142990.1559771344.git.asias@scylladb.com>
With this patch, when using asan, we poison segment memory that has
been allocated from the system but should not be accessible to user
code.
Should help with debugging use-after-free bugs.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190607140313.5988-1-espindola@scylladb.com>
Introduced in 513d01d53e
The script is trying to determine the branch to shallow clone
when an rpm is missing and has to be built.
This functionality in the current implementation assumes it is being run inside
a git repository, but that need not be the case if the script is triggered after
local rpms were placed in the local directory.
This happens when putting all necessary rpm files in: dist/ami/files
And then running: dist/ami/build_ami.sh --localrpm
The dist/ami/ and dist/ami/files are the only ones required for this action so
querying the git repository in that situation makes no sense.
Fixes #4535
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190611112455.13862-1-bhalevy@scylladb.com>
Build progress virtual reader uses Scylla-specific
scylla_views_builds_in_progress table in order to represent legacy
views_builds_in_progress rows. The Scylla-specific table contains
additional cpu_id clustering key part, which is trimmed before
returning it to the user. That may cause duplicated clustering row
fragments to be emitted by the reader, which may cause undefined
behaviour in consumers. The solution is to keep track of previous
clustering keys for each partition and drop fragments that would cause
duplication. That way if any shard is still building a view, its
progress will be returned, and if many shards are still building, the
returned value will indicate the progress of a single arbitrary shard.
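The deduplication idea can be sketched in Python as follows. This is only an illustration of the approach described above, not the reader's C++ code; the tuple layout of the rows and the name `trim_and_deduplicate` are assumptions.

```python
def trim_and_deduplicate(rows):
    """Drop the trailing cpu_id key component and skip resulting duplicates.

    `rows` is an iterable of (partition, clustering_key, value) tuples in
    reader order, where clustering_key is a tuple ending in cpu_id.
    Rows of one partition are assumed to arrive sorted by clustering key,
    so equal trimmed keys are adjacent.
    """
    previous = {}  # partition -> last trimmed clustering key emitted
    for partition, clustering_key, value in rows:
        trimmed = clustering_key[:-1]  # remove the cpu_id component
        if previous.get(partition) == trimmed:
            continue  # emitting this row would duplicate the previous key
        previous[partition] = trimmed
        yield partition, trimmed, value
```

Only the first row for each trimmed key survives, which matches the stated behaviour that the progress of a single arbitrary shard is returned.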
Fixes #4524
Tests:
unit(dev) + custom monotonicity checks from tgrabiec@scylladb.com
All Scylla code is written with "using namespace seastar", i.e., no
"seastar::" prefix for Seastar symbols. Document this in the coding style.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190610203948.18075-1-nyh@scylladb.com>
Fixes #4525
req_param uses boost::lexical_cast to convert text to a value.
However, lexical_cast does not handle textual booleans,
so param=true causes not just wrong values but
exceptions.
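The kind of conversion that is needed can be sketched in Python. The function name `parse_bool_param` and the exact set of accepted spellings are assumptions for illustration; the actual C++ change may accept a different set.

```python
def parse_bool_param(text):
    """Convert a request parameter such as 'true' or '0' to a bool.

    Accepts textual and numeric spellings, so that param=true does not
    fail the way a purely numeric conversion would.
    """
    lowered = text.strip().lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    raise ValueError("not a boolean: {!r}".format(text))
```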
Message-Id: <20190610140511.15478-1-calle@scylladb.com>
Currently, each shard protects itself by not reading from rpc and the native
transport if in-flight requests consume too much memory for that shard. However,
if all shards then forward their requests to some other shard, then that shard
can easily run out of memory since its load can be multiplied by the number of
shards that send it requests.
To protect against this, use the new Seastar smp_service_group infrastructure.
We create three groups: read, write, and write ack (the latter is needed to
avoid ABBA deadlocks: if shard A exhausts all its resources sending writes to
shard B, and shard B simultaneously does the same, neither will be able to send
acknowledgements, so if the writes are throttled, they will never be unthrottled
until a timeout occurs).
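Why a separate write ack group avoids the deadlock can be shown with a toy model. This is not the Seastar API: `ServiceGroup`, `make_groups`, and the idea of refusing (rather than delaying) a request at the limit are simplifications invented for this sketch.

```python
class ServiceGroup:
    """Toy stand-in for a per-group limit on in-flight cross-shard requests."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.in_flight = 0

    def try_acquire(self):
        """Reserve one slot; return False if the group is exhausted."""
        if self.in_flight >= self.limit:
            return False
        self.in_flight += 1
        return True

    def release(self):
        """Give back a slot reserved by a successful try_acquire()."""
        if self.in_flight == 0:
            raise RuntimeError("release() without a matching acquire")
        self.in_flight -= 1


def make_groups(limit):
    """Create independent read, write and write-ack groups."""
    names = ("read", "write", "write_ack")
    return {name: ServiceGroup(name, limit) for name in names}
```

Because each group has its own budget, a shard that has used up its write slots can still send acknowledgements, so the peer's writes eventually complete and release their slots.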
Range scans are not addressed by this patch since they are handled by
multishard_mutation_query, which has its own complex cross-shard communication
scheme, but it should be a similar solution.
Ref #1105 (missing range scan protection)
Tests: unit (dev)
Message-Id: <20190512142243.17795-1-avi@scylladb.com>
If the port value is passed as a string, cluster.connect() fails
with Python 3.4.
Let's fix this by explicitly declaring the 'port' argument as 'int'.
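Assuming the port comes from a command-line option parsed with argparse, the declaration could look like the sketch below. The option name `--port`, the default of 9042, and the helper `build_parser` are assumptions for illustration and are not taken from the patch.

```python
import argparse


def build_parser():
    """Build a parser whose --port value is converted to int by argparse."""
    parser = argparse.ArgumentParser(description="example connection options")
    parser.add_argument(
        "--port",
        type=int,      # without type=int the value would stay a string
        default=9042,  # assumed default for this example only
        help="port to connect to")
    return parser
```

With `type=int`, a value given on the command line arrives as an integer, and a non-numeric value is rejected by argparse before any connection is attempted.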
Fixes #4527
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20190606133321.28225-1-vladz@scylladb.com>
We want to use the same branch on the other repos build-ami needs
as the one we're building for. Automatically find the current branch
using the `git branch` command.
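The branch detection can be sketched in Python, although the build script itself is a shell script. The helper names `parse_current_branch` and `current_branch` are hypothetical; the sketch relies only on `git branch` marking the checked-out branch with a leading asterisk.

```python
import subprocess


def parse_current_branch(git_branch_output):
    """Return the branch marked with '* ' in `git branch` output, or None.

    On a detached HEAD, git prints a parenthesised description instead of
    a branch name; this sketch returns that text unchanged.
    """
    for line in git_branch_output.splitlines():
        if line.startswith("* "):
            return line[2:].strip()
    return None


def current_branch(repo_dir="."):
    """Run `git branch` in `repo_dir` and return the current branch name."""
    result = subprocess.run(
        ["git", "branch"], cwd=repo_dir, check=True,
        stdout=subprocess.PIPE, universal_newlines=True)
    return parse_current_branch(result.stdout)
```

Note that `current_branch` fails when `repo_dir` is not inside a git repository, since `git branch` then exits with an error.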
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190606133648.15877-1-bhalevy@scylladb.com>
This patch adds a configurable warning for situations where the row
count grows larger than initially designed for. The warning makes
users aware of possible data modeling problems.
The threshold is initially set to 100,000.
Tests: unit (dev)
Message-Id: <20190528075612.GA24671@shenzou.localdomain>
A lot of code in scylla is only reachable if SEASTAR_DEFAULT_ALLOCATOR
is not defined. In particular, refill_emergency_reserve in the default
allocator case is empty, but in the seastar allocator case it compacts
segments.
I am trying to debug a crash that seems to involve memory corruption
around the lsa allocator, and being able to use a debug build for that
would be awesome.
This patch reduces the differences between the two cases by having a
common segment_pool that defers only a few operations to different
segment_store implementations.
Tests: unit (debug, dev)
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20190606020937.118205-1-espindola@scylladb.com>
Prometheus needs to remember which "instance" (node) each measurement
came from. But it doesn't actually need Scylla to tell it the instance
name - it knows which node it got each measurement from.
After Seastar commit 79281ef287
which fixed Seastar issue https://github.com/scylladb/seastar/issues/477,
the "instance" label on measurements no longer comes from Scylla but rather
is added by Prometheus. This patch corrects the documentation to explain the
current situation, instead of incorrectly saying that Scylla adds the
"instance" label itself.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190602074629.14336-1-nyh@scylladb.com>
The relocatable Python is built from Fedora packages. Unfortunately TLS
certificates are in a different location on Debian variants, which
causes "node_exporter_install" to fail as follows:
Traceback (most recent call last):
  File "/usr/lib/scylla/libexec/node_exporter_install", line 58, in <module>
    data = curl('https://github.com/prometheus/node_exporter/releases/download/v{version}/node_exporter-{version}.linux-amd64.tar.gz'.format(version=VERSION), byte=True)
  File "/usr/lib/scylla/scylla_util.py", line 40, in curl
    with urllib.request.urlopen(req) as res:
  File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>
Unable to retrieve version information
node exporter setup failed.
Fix the problem by overriding the SSL_CERT_FILE environment variable to
point to the correct location of the TLS bundle.
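A minimal sketch of such an override, assuming the script is launched with an explicitly constructed environment. The bundle path below is the usual Debian/Ubuntu location and is an assumption here, as are the names `DEBIAN_CA_BUNDLE` and `env_with_cert_file`; the patch may set the variable elsewhere (for example in a wrapper script).

```python
# Assumed location of the CA bundle on Debian variants.
DEBIAN_CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"


def env_with_cert_file(base_env, bundle_path=DEBIAN_CA_BUNDLE):
    """Return a copy of `base_env` with SSL_CERT_FILE overridden.

    The input mapping is left untouched so the caller's environment is
    not modified as a side effect.
    """
    env = dict(base_env)
    env["SSL_CERT_FILE"] = bundle_path
    return env
```

The resulting mapping can be passed as the `env` argument of `subprocess.run()` when starting the relocatable Python interpreter, so that its TLS verification reads the distribution's bundle.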
Message-Id: <20190604175434.24534-1-penberg@scylladb.com>
The to_json_string utility implementation was based on const references
instead of views, which can be a source of unnecessary memory copying.
This patch migrates all to_json_string to use bytes_view and leaves
the const reference version as a thin wrapper.
Message-Id: <2bf9f1951b862f8e8a2211cb4e83852e7ac70c67.1559654014.git.sarna@scylladb.com>
"
Technically queue_reader already exists, however so far it was a
private utility in `multishard_writer.cc`. This mini-series makes it
public and generally useful. The interface is made safer and simpler and
the implementation is improved so it doesn't have two separate buffers.
Also, unit tests are added.
Tests: mutation_reader_test:debug/test_queue_reader, multishard_writer_test:debug
"
* 'queue_reader/v2' of https://github.com/denesb/scylla:
queue_reader: use the reader's buffer as the queue
Make queue_reader public
The queue reader currently uses two buffers: a `_queue` that the
producer pushes fragments into, and its internal `_buffer`, from which these
fragments are eventually served to the consumer.
This double buffering is not necessary. Change the reader to allow the
producer to push fragments directly into the internal `_buffer`. This
complicates the code a little bit, as the producer logic of
`seastar::queue` has to be folded into the queue reader. On the other
hand this introduces proper memory consumption management, reduces
the amount of consumed memory, and eliminates the possibility of
outside code tampering with the queue. Another big advantage of the
change is that there is now an explicit way to communicate the EOS
condition, with no need to push a disengaged `mutation_fragment_opt`.
The producer of the queue reader now pushes the fragments into the
reader via an opaque `queue_reader_handle` object, which has the
producer methods of `seastar::queue`.
Existing users of queue readers are refactored to use the new interface.
Since the code is more complex now, unit tests are added as well.
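The single-buffer design with an opaque producer handle can be modelled with a small synchronous Python sketch. It is a toy: the real reader is asynchronous C++ and makes the producer wait for space, whereas here `push()` simply reports failure when the buffer is full. All class and method names are invented for illustration.

```python
from collections import deque


class QueueReader:
    """Consumer side: serves fragments straight from its only buffer."""

    def __init__(self, max_buffer_size):
        self._buffer = deque()
        self._max_buffer_size = max_buffer_size
        self._end_of_stream = False

    def pop(self):
        """Return the next fragment, or None if the buffer is empty."""
        return self._buffer.popleft() if self._buffer else None

    def is_end_of_stream(self):
        """True once the producer signalled EOS and the buffer is drained."""
        return self._end_of_stream and not self._buffer


class QueueReaderHandle:
    """Producer side: the only way to put fragments into the reader."""

    def __init__(self, reader):
        self._reader = reader

    def push(self, fragment):
        """Append a fragment; return False if the buffer is full."""
        reader = self._reader
        if reader._end_of_stream:
            raise RuntimeError("push after end of stream")
        if len(reader._buffer) >= reader._max_buffer_size:
            return False
        reader._buffer.append(fragment)
        return True

    def push_end_of_stream(self):
        """Explicitly signal EOS instead of pushing an empty fragment."""
        self._reader._end_of_stream = True
```

Keeping the producer methods on a separate handle means outside code never touches the buffer directly, and the buffer size bounds memory use in one place.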
We have a script in tree that fixes the schema for distributed system
tables, like tracing, should they change their schema. We use it all the
time but unfortunately it is not distributed with the scylla package,
which makes using it harder (we want to do this in the server, but
consistent updates will take a while).
One of the problems with the script today that makes distributing it
harder is that it uses the python3 cassandra driver, which we don't want
to have as a server dependency. But now with the relocatable packages in
place there is no reason not to just add it.
[avi: adjust tools/toolchain/image to point to a new image with
python3-cassandra-driver]
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190603162447.24215-1-glauber@scylladb.com>
Unlike CentOS, Debian variants have a python3 package in their official
repositories, so we don't strictly have to use relocatable python3 on these
distributions. However, the official python3 version differs between
distributions, which may cause issues.
Also, our scripts and packaging increasingly presuppose the existence of
relocatable python3, which is causing issues on Debian variants.
Switching to relocatable python3 on Debian variants avoids these issues
and makes it easier to manage Scylla python3 environments across multiple
distributions.
Fixes #4495
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190531112707.20082-1-syuu@scylladb.com>
Building on Ubuntu 18 or 19 following the current build instructions
doesn't work. Add information about a few pitfalls. Switch README.md
to recommending dbuild and move the details to HACKING.md.
Message-Id: <20190520152738.GA15198@atlas>
Unlike CentOS, Debian variants have a python3 package in their official
repositories, so we don't strictly have to use relocatable python3 on these
distributions. However, the official python3 version differs between
distributions, which may cause issues.
Also, our scripts and packaging increasingly presuppose the existence of
relocatable python3, which is causing issues on Debian variants.
Switching to relocatable python3 on Debian variants avoids these issues
and makes it easier to manage Scylla python3 environments across multiple
distributions.
Fixes #4495
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190526105138.677-1-syuu@scylladb.com>
Apparently we are having some issues running iotune on the i3en instances,
as the values do not always make sense. We believe it is something that XFS
is doing, and running fio directly on the device (no filesystem) provides
more meaningful results (closer to the expected values published by AWS).
For now, let's use fio instead. In this patch I have run fio for our 4
dimensions on each of the three types of disks (large, xlarge, 3xlarge).
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190524111454.27956-1-glauber@scylladb.com>
"
Before this patchset empty counters were incorrectly persisted for
MC format. No value was written to disk for them. The correct way
is to still write a header that informs the counter is empty.
We also need to make sure that reading wrongly persisted empty
counters works because customers may have sstables with wrongly
persisted empty counters.
Fixes #4363
"
* 'haaawk/4363/v3' of github.com:scylladb/seastar-dev:
sstables: add test for empty counters
docs: add CorrectEmptyCounters to sstable-scylla-format
sstables: Add a feature for empty counters in Scylla.db.
sstables: Write header for empty counters
sstables: Remove unused variables in make_counter_cell
sstables: Handle empty counter value in read path
"
One local utility function in cql_query_test.cc duplicates an existing
exception_predicate member. Another can be generalized for wider use
in the future. This patch accomplishes both, retiring a to-do item.
Tests: unit (dev)
"
* 'use-utils-predicate-in-cql_test' of https://github.com/dekimir/scylla:
tests/cql: Replace equery() with cquery_nofail()
tests: Add cquery_nofail() utility
tests: Drop redundant function