This reverts commit b1226fb15a. When the
data volume is mounted from the host (as is usual in container
deployments), we can't expect that the files will be owned by the
in-container scylla user. So that commit didn't really fix#4536.
A follow-up patch will relax the check so it passes in a container
environment.
Don't let perftune.py print false alarm error message when we calculate
a compute CPU set for tuning modes.
This may happen when we calculate a CPU set for non-MQ tuning modes on
small systems on which these modes are forbidden because they would
result in a zero CPU set, e.g. sq_split on a system with a single
physical core.
We are going to utilize a newly introduced --get-cpu-mask-quiet execution
mode introduced to the seastar/script/perftune.py by the
"perftune.py: introduce --get-cpu-mask-quiet" series which would return
a zero CPU set if that's what it turns out to be instead of exiting with
an error what --get-cpu-mask would do in such a case.
The rest of scylla_util.py logic is going to handle a zero CPU set
returned by get_mode_cpuset() correctly.
Fixes#4211Fixes#4443
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20190731212901.9510-1-vladz@scylladb.com>
On previous commit ac9b115a8f, install.sh requires to specify single package using --pkg, there is no way to select all.
It should be select all packages when running install.sh without --pkg.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190731013245.5857-1-syuu@scylladb.com>
Currently, install.sh just used for building .rpm, we have similar build script
under dist/debian, sometimes it become inconsistent with install.sh.
Since most of package build process are same, we should share install.sh on both
.rpm and .deb package build process.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190725123207.2326-1-syuu@scylladb.com>
Currently scylla-server.service uses DefaultTimeoutStopSec = 90, if Scylla
does not able to clean-shutdown in 90sec we may have data corruption on the node.
Since we already set TimeoutStartSec = 900, we can use TimeoutSec to set both
TimeoutStartSec and TimeoutStopSec to 900.
See #4700
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190717095416.10652-1-syuu@scylladb.com>
We still has "{{^jessie}}" tag on scylla-server systemd unit file to
skip using AmbientCapabilities on Debian 8, but it does not able to work
anymore since we moved to single binary .deb package for all debian variants,
we must share same systemd unit file across all Debian variants.
To do so we need to have separated file on /etc/systemd to define
AmbientCapabilities, create the file while running postinst script only
if distribution is not Debian 8, just like we do in .rpm.
See #3344
See #3486
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190619064224.23035-1-syuu@scylladb.com>
Since we cannot use dh --with=systemd because we don't want to
automatically enabling systemd units, manage them by our setup scripts,
we have to do 'systemctl daemon-reload' manually.
(On dh --with=systemd, systemd helper automatically provides such
scirpts)
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190618000210.28972-1-syuu@scylladb.com>
Currently NIC selection prompt on scylla_setup just proceed setup when
user just pressed Enter key on the prompt.
The prompt should ask NIC name again until user input correct NIC name.
Fixes#4517
Message-Id: <20190617124925.11559-1-syuu@scylladb.com>
We used to use /opt/scylladb just for Scylla build toolchain and
dependency libraries, not for Scylla main package.
But since we merged relocatable package, Scylla main binary and
dependency libraries are all located under /opt/scylladb, only
setup scripts remained on /usr/lib/scylla.
It strange to keep using both /usr/lib/<app name> and /opt/<app name>,
we should merge them into single place.
Message-Id: <20190614011038.17827-1-syuu@scylladb.com>
Relocation of python scripts mentions scylla-server in paths explicitly.
It should use {{product}} instead. The current build is failing when
{{product}} is different than scylla-server
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190613012518.28784-1-glauber@scylladb.com>
On branch-3.1 / master, we are getting following error:
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/data: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/commitlog: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/view_hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
It seems like owner verification of data directory fails because
scylla-server process is running in root but data directory owned by
scylla, so we should run services as scylla user.
Fixes#4536
Message-Id: <20190611113142.23599-1-syuu@scylladb.com>
To avoid 'Bad permmisons' error when user changed default umask, we need
to verify system umask is acceptable for scylla-server.
Fixes#4157
Message-Id: <20190612130343.6043-1-syuu@scylladb.com>
Introduced in 513d01d53e
The script is trying to determine the branch to shallow clone
when an rpm is missing and has to be built.
This functionality in the current implementation assumes it is being run inside
a git repository, but that must not be the case if the script is triggered after
local rpms were placed on the local directory.
This happens when putting all necessary rpm files in: dist/ami/files
And then running: dist/ami/build_ami.sh --localrpm
The dist/ami/ and dist/ami/files are the only ones required for this action so
querying the git repository in that situation makes no sense.
Fixes#4535
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190611112455.13862-1-bhalevy@scylladb.com>
We want to use the same branch on the other repos build-ami needs
as the one we're building for. Automatically find the current branch
using the `git branch` command.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190606133648.15877-1-bhalevy@scylladb.com>
Unlike CentOS, Debian variants has python3 package on official repository,
so we don't have to use relocatable python3 on these distributions.
However, official python3 version is different on each distribution, we may
have issue because of that.
Also, our scripts and packaging implementation are becoming presuppose
existence of relocatable python3, it is causing issue on Debian
variants.
Switching to relocatable python3 on Debian variants avoid these issues,
it will easier to manage Scylla python3 environments accross multiple
distributions.
Fixes#4495
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190531112707.20082-1-syuu@scylladb.com>
Unlike CentOS, Debian variants has python3 package on official repository,
so we don't have to use relocatable python3 on these distributions.
However, official python3 version is different on each distribution, we may
have issue because of that.
Also, our scripts and packaging implementation are becoming presuppose
existence of relocatable python3, it is causing issue on Debian
variants.
Switching to relocatable python3 on Debian variants avoid these issues,
it will easier to manage Scylla python3 environments accross multiple
distributions.
Fixes#4495
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190526105138.677-1-syuu@scylladb.com>
Apparently we are having some issues running iotune in the i3en instances,
as the values not always make sense. We believe it is something that XFS
is doing, and running fio directly on the device (no filesystem) provides
more meaningful results (more according to AWS published expected values).
For now, let's use fio instead. In this patch I have ran fio for our 4
dimensions in each of the three types of disks (large, xlarge, 3xlarge).
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190524111454.27956-1-glauber@scylladb.com>
Users may want to know which version of packages are used for the AMI,
it's good to have it on AMI tags and description.
To do this, we need to download .rpm from specified .repo, extract
version information from .rpm.
Fixes#4499
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190520123924.14060-2-syuu@scylladb.com>
To make product name templatization works correctly, we cannot use
"debian/scylla-server" as package contents directory path,
need to use template like "debian/{{product}}-server" instead.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190517121946.18248-1-syuu@scylladb.com>
Right now if the user tries to execute this in an unrecognized OS, the
following will be thrown:
Traceback (most recent call last):
File "/usr/lib/scylla/libexec/scylla_setup", line 214, in <module>
do_verify_package('scylla-enterprise-jmx')
File "/usr/lib/scylla/libexec/scylla_setup", line 73, in do_verify_package
if res != 0:
UnboundLocalError: local variable 'res' referenced before assignment
It would be a lot nicer to exit gracefully and print a messge saying what
is going on. This was caught when running on OEL, which the previous patch
fixed. Still, there are other unknown OS out there the users may try to run
on.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Oracle Linux is a RHEL-like distribution and we support it just fine, but our
new incarnation of scylla_setup is failing to recognize it.
os-release for OEL is a bit different. It doesn't have an ID_LIKE string, and
only shows an ID string, which is set to 'ol'. So let's recognize this.
Fixes: #4493
Branches: 3.1
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Since other build_*.sh are for running inside extracted relocatable
package, they have SCYLLA-PRODUCT-FILE on top of the directory,
but build_ami.sh is not running in such condition, we need to run
SCYLLA-VERSION-GEN first, then refer to build/SCYLLA-PRODUCT-FILE.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190509110621.27468-1-syuu@scylladb.com>
AWS just released their new instances, the i3en instances. The instance
is verified already to work well with scylla, the only adjustments that
we need is advertise that we support it, and pre-fill the disk
information according to the performance numbers obtained by running the
instance.
Fixes#4486
Branches: 3.1
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190508170831.6003-1-glauber@scylladb.com>
Whenever the iotune_args array uses "--smp", it needs cpudata.smp()
which returns an integer instead of a string. So when iotune_args is
passed to subprocess.check_call(), it actually throws "TypeError:
expected str, bytes or os.PathLike object, not int" but
"%s did not pass validation tests, it may not be on XFS..." is shown as
the exception.
Even though the user inputs correct arguments, it might still throw an
error and confuse the user that he/she has not passed the right
arguments.
One simple fix is to use str(cpudata.smp()) instead of cpudata.smp().
Signed-off-by: JP-Reddy <guthijp.reddy@gmail.com>
Message-Id: <20190406070118.48477-1-guthijp.reddy@gmail.com>
scylla-housekeeping always wants to run in the installation to check if
we are running the latest version. This happens regardless of whether or
not we said yes or no to the housekeeping scylla_setup question - as
that question only deals with whether or not we want to do this through
a timer.
It is fine to try to run scylla-housekeeping, as long as we time it out.
The current code doesn't.
The naive solution is to add a timeout parameter to urllib.request.open.
However, that timeout is not respected and in my tests I saw real
timeouts up to four times higher the timeout we set. For a reasonable 5s
timeout, this mean a 20s real timeout which can lead to a very bad user
experience. This seems to be a known problem with this module according
to a quick Google search.
This patch then takes a slightly more complex solution and uses
multiprocess to enforce a well-defined user-visible timeout.
Fixes#3980
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190506122335.5707-1-glauber@scylladb.com>
The setup script asks the user whether or not housekeeping should
be called, and in the first time the script is executed this decision
is respected.
However if the script is invoked again, that decision is not respected.
This is because the check has the form:
if (housekeeping_cfg_file_exists) {
version_check = ask_user();
}
if (version_check) { do_version_check() } else { dont_do_it() }
When it should have the form:
if (housekeeping_cfg_file_exists) {
version_check = ask_user();
if (version_check) { do_version_check() } else { dont_do_it() }
}
(Thanks python)
This is problematic in systems that are not connected to the internet, since
housekeeping will fail to run and crash the setup script.
Fixes#4462
Branches: master, branch-3.1
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190502034211.18435-1-glauber@scylladb.com>
Newer versions of RHEL ship the os-release file with newlines in the
end, which our script was not prepared to handle. As such, scylla_setup
would fail.
This patch makes our OS detection robust against that.
Fixes#4473
Branches: master, branch-3.1
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190502152224.31307-1-glauber@scylladb.com>
This patch add the node_exporter to the docker image.
It install it create and run a service with it.
After this patch node_exporter will run and will be part of scylla
Docker image.
Fixes#4300
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20190421130643.6837-1-amnon@scylladb.com>
To prevent running entrypoint script in another python3 package like
python36 in EPEL, move /opt/scylladb/python3/bin to top of $PATH.
It won't happen on this container image, but may occurs when user tries to
extend the image.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190417165806.12212-1-syuu@scylladb.com>
perftune.py executes hwloc-calc, the command is now provided as
relocatable binary, placed under /opt/scylladb/bin.
So we need to add the directory to PATH when calling
subprocess.check_output(), but our utility function already do that,
switch to it.
Fixes#4443
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190418124345.24973-1-syuu@scylladb.com>
When we add product name customization, we mistakenly defined the
parameter on each package build script.
Number of script is increasing since we recently added relocatable
python3 package, we should merge it in single place.
Also we should save the parameter on relocatable package, just like
version-release parameters.
So move the definition to SCYLLA-VERSION-GEN, save it to
build/SCYLLA-PRODUCT-FILE then archive it to relocatable package.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190417163335.10191-1-syuu@scylladb.com>
Since we moved relocatable .rpm now Scylla able to run on Amazon Linux
2.
However, is_redhat_variant() on scylla_util.py does not works on Amazon
Linux 2, since it does not have /etc/redhat-release.
So we need to switch to /etc/os-release, use ID_LIKE to detect Redhat
variants/Debian variants.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190417115634.9635-1-syuu@scylladb.com>
Switch to relocatable python3 instead of EPEL's python3 on docker-entrypoint.py.
Also drop uneeded dependencies, since we switched to relocatable scylla
image.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190417111024.6604-1-syuu@scylladb.com>
We have to run this script in python2, since we dropped EPEL from
dependencies, and the script is installer for rpms so we cannot use
relocatable python3 for it.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190411151858.2292-1-syuu@scylladb.com>
Although curl is widely available, there is no reason to depend on it.
There are mainly two users, as indicated by grep:
1) scylla-housekeeping
2) scripts within the AMI
3) docker image
The AMI has its own RPM and it already depends on curl. While we could
get rid of the curl dependency there too, we can do that later. Docker
is its own thing and it only needs it at build time anyway.
For the main scylla repo, this patch changes scylla-housekeeping so as
not to depend on the curl binary and use urllib directly instead. We can
then remove curl from our dependency list.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190411125642.9754-1-glauber@scylladb.com>
"
Calculation of IO properties is slightly wrong for i3.metal, because we get
the number of disks wrong. The reason for that is our check for ephemeral nvme
disks, that pre-date the time in which root devices were exposed as nvme devices
(nitro and metal instances).
"
toolchain updated with python3-psutil
* 'ec2fixes' of github.com:glommer/scylla:
scylla_util.py: do not include root disks in ephemeral list
scylla-python3: include the psutil module
fix typo in scylla_ec2_check
Nitro instances (and metal ones) put their root device in nvme (as a
protocol. it is still EBS). Our algorithm so far has relied on parsing
the nvme devices to figure out which ones are ephemeral but it will
break for those instances.
Out of our supported instances so far, the i3.metal is the only one
in which this breaks.
Signed-off-by: Glauber Costa <glauber@scylladb.com>