I was playing with the python3 interpreter trying to get pip to work,
just to see how far we can go. We don't really need pip, but I figured
it would be a good stress test to make sure that the process is working
and robust.
And it didn't really work, because although pip will correctly install
things into $relocatable_root/local/lib, sys.path will still refer to a
hardcoded /usr/local. While this should not affect Scylla, since we
expect to have all our modules in our path anyway (and that path is
searched before /usr/local), it is still dangerous to make an absolute
reference like this.
Unfortunately, /usr/local is included unconditionally by site.py,
which is executed when the interpreter is started, and I found no
environment variable to change that (the help string refers to
PYTHONNOUSERSITE, but I found no mention of that in site.py whatsoever).
There is a way to tell site.py not to bother to add user sites, by
passing the -s flag, which this patch does.
Aside from doing that, we also enhance PYTHONPATH to include a reference
to ./local/{lib,lib64}/python<version>/site-packages.
After applying this patch, I was able to build an interpreter containing
only python3-pip and python3-setuptools, and build the relocatable
environment from there.
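The extra PYTHONPATH entries described above could be computed along these
lines. This is only a sketch: the function names and the `root` argument are
hypothetical, and the actual patch may build the path differently (for
instance, from a shell thunk).

```python
import os
import sys


def relocatable_site_packages(root, version=None):
    """Return <root>/local/{lib,lib64}/python<version>/site-packages."""
    if version is None:
        # Default to the version of the running interpreter, e.g. "3.6".
        version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
    return [
        os.path.join(root, "local", libdir, "python" + version, "site-packages")
        for libdir in ("lib", "lib64")
    ]


def build_pythonpath(root, existing="", version=None):
    """Put the relocatable site-packages directories ahead of an existing PYTHONPATH."""
    entries = relocatable_site_packages(root, version)
    if existing:
        entries.append(existing)
    return os.pathsep.join(entries)
```

The interpreter would then be started with the -s flag and with PYTHONPATH set
to the value returned by build_pythonpath().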
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190206052104.25927-1-glauber@scylladb.com>
We would like to deploy Scylla in constrained environments where
internet access is not permitted. In those environments it is not
possible to acquire the dependencies of Scylla from external repos and
the packages have to be shipped along with their dependencies.
In older distributions, like CentOS7, there isn't a python3 interpreter
available. And while we can package one from EPEL, this tends to break in
practice when installing the software on older patchlevels (for
instance, installing on RHEL7.3 when the latest is RHEL7.5).
The reason for that, as we saw in practice, is that EPEL may
not respect RHEL patchlevels, so its python interpreter can depend
on newer versions of some system libraries.
virtualenv can be used to create isolated python environments, but it is
not designed for full isolation and I hit at least two roadblocks in
practice:
1) It doesn't copy the files, linking some instead. There is an
--always-copy option but it is broken (for years) in some
distributions.
2) Even when the above works, it still doesn't copy some files, relying
on the system files instead (one sad example was the subprocess
module that was just kept in the system and not moved to the
virtualenv)
This patch solves that problem by creating a python3 environment in a
directory with the modules that Scylla uses, and nothing else. It is
essentially doing what virtualenv should do but doesn't. Once this
environment is assembled, the binaries are then made relocatable the same
way the Scylla binary is.
One difference (for now) between the Scylla binary relocation process
and ours is that we steer away from LD_LIBRARY_PATH: the environment
variable is inherited by any child process stemming from the caller,
which means that we are unable to use the subprocess module to call
system binaries like mkfs (which our scripts do a lot). Instead, we rely
on RUNPATH to tell the binary where to search for its libraries.
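The inheritance problem can be illustrated with a small sketch. The helper
names below are hypothetical and not part of the patch; the code only shows
that a variable present in the caller's environment reaches every child
started through the subprocess module unless it is stripped explicitly.

```python
import os
import subprocess
import sys


def child_sees(var, env):
    """Start a child interpreter with `env` and return the value it sees for `var`."""
    code = "import os; print(os.environ.get({!r}, ''))".format(var)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


def without(env, var):
    """Return a copy of `env` with `var` removed."""
    return {key: value for key, value in env.items() if key != var}
```

With LD_LIBRARY_PATH set in the caller, child_sees("LD_LIBRARY_PATH",
dict(os.environ)) returns the caller's value, which is what a system binary
such as mkfs would also inherit. Relying on RUNPATH avoids having to call
something like without() at every subprocess call site.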
In terms of the python interpreter, PYTHONPATH does not need to be set
for this to work, as the python interpreter will include the lib
directory in its sys.path. To confirm this, we executed the following
code:
bin/python3 -c "import sys; print('\n'.join(sys.path))"
with the interpreter unpacked to both /home/centos/glaubertmp/test/ and
/tmp. It yields respectively:
/home/centos/glaubertmp/test/lib64/python36.zip
/home/centos/glaubertmp/test/lib64/python3.6
/home/centos/glaubertmp/test/lib64/python3.6/lib-dynload
/home/centos/glaubertmp/test/lib64/python3.6/site-packages
and
/tmp/python/lib64/python36.zip
/tmp/python/lib64/python3.6
/tmp/python/lib64/python3.6/lib-dynload
/tmp/python/lib64/python3.6/site-packages
This was tested by moving the .tar.gz generated on my Fedora28 laptop to
a CentOS machine without python3 installed. I could then invoke
./scylla_python_env/python3 and use the interpreter to call 'ls' through
the subprocess module.
I have also tested that we can successfully import all the modules we listed
for installation and that we can read a sample yaml file (since PyYAML depends
on the system's libyaml, we know that this works)
Time to build:
real 0m15.935s
user 0m15.198s
sys 0m0.382s
Final archive size (uncompressed): 81MB
Final archive size (compressed) : 25MB
Signed-off-by: Glauber Costa <glauber@scylladb.com>
--
v3:
- rewrite in python3
- do not use temporary directories, add directly to the archive. Only the python binary
has to be materialized
- Use --cacheonly for repoquery, and also repoquery --list in a second step to grab the file list
v2:
- do not use yum, resolve dependencies from installed packages instead
- move to scripts as Avi wants this not only for old offline CentOS
Before installing python files to their final location in install.sh,
replace them with a thunk so that they can work with our python3
interpreter. The way the thunk works, they will also work without our
python3 interpreter so unconditionally fixing them up is always safe.
In this patch I opt to fix them up only at install time, which simplifies
life for developers: they won't have to worry about this at all.
Note about the rpm .spec file: since we are relying on a specific format
for the shebangs, we shouldn't let rpmbuild mess with them. Therefore,
we need to disable a global variable that controls that behavior (by
default, Fedora rpmbuild will rewrite all shebangs to /usr/bin/python3).
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Given a python script at $DIR/script.py, this copies the script to
$DIR/libexec/script.py.bin, fixes its shebang to use /usr/bin/env instead
of an absolute path for the interpreter and replaces the original script
with a thunk that calls into that script.
PYTHONPATH is adjusted so that the original directory containing the script
can also serve as a source of modules, as would be originally intended.
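The steps above can be sketched as follows. This is not the code from the
patch: the thunk body, THUNK_TEMPLATE, fix_shebang() and install_thunk() are
all hypothetical names and contents chosen for illustration, and the real
thunk may set up more than PYTHONPATH.

```python
import os
import stat

# Hypothetical thunk body: it adds the script's own directory to PYTHONPATH
# and executes the relocated copy under libexec/.
THUNK_TEMPLATE = """#!/bin/bash
d="$(dirname "$(readlink -f "$0")")"
PYTHONPATH="$d${{PYTHONPATH:+:$PYTHONPATH}}" exec "$d/libexec/{name}.bin" "$@"
"""


def fix_shebang(source):
    """Rewrite a python3 shebang so the interpreter is found via /usr/bin/env."""
    lines = source.split("\n")
    if lines and lines[0].startswith("#!") and "python3" in lines[0]:
        lines[0] = "#!/usr/bin/env python3"
    return "\n".join(lines)


def install_thunk(script_path):
    """Move script.py to libexec/script.py.bin and leave a thunk in its place."""
    directory, name = os.path.split(script_path)
    libexec = os.path.join(directory, "libexec")
    os.makedirs(libexec, exist_ok=True)
    target = os.path.join(libexec, name + ".bin")
    with open(script_path) as f:
        source = f.read()
    with open(target, "w") as f:
        f.write(fix_shebang(source))
    with open(script_path, "w") as f:
        f.write(THUNK_TEMPLATE.format(name=name))
    # Both the relocated script and the thunk must be executable.
    for path in (target, script_path):
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target
```

Because the relocated script resolves python3 through /usr/bin/env, the same
files keep working when only the system interpreter is on PATH.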
Signed-off-by: Glauber Costa <glauber@scylladb.com>
All of our python scripts are there and they are all installed
automatically into /usr/lib/scylla. By keeping scylla-housekeeping
separately we are just complicating our build process.
This would be just a minor annoyance but this broke the new relocatable
process for python3 that I am trying to put together because I forgot to
add the new location as a source for the scripts.
Therefore, I propose we start being more diligent with this and keep
all scripts together in the future.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190123191732.32126-2-glauber@scylladb.com>
The Scylla 3rdparty tools add /opt/scylladb/lib to LD_LIBRARY_PATH.
We use the same directory for relocatable binaries, including libc.so.6.
Once both the scylla-env package and the relocatable version of the
scylla-server package are installed, the loader tries to load libc from
/opt/scylladb/lib and the entire distribution becomes unusable.
We may be able to use the Obsoletes or Conflicts tag in the .rpm/.deb to
avoid installing the new Scylla package alongside scylla-env, but it is
better and safer not to share the same directory for different purposes.
Fixes #3943
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190128023757.25676-1-syuu@scylladb.com>
Implementation of the nodetool toppartitions query, which samples the most frequent PKs in read/write
operations over a period of time.
Content:
- data_listener classes: mechanism that interfaces with mutation readers in database and table classes,
- toppartition_query and toppartition_data_listener classes to implement toppartition-specific query (this
interfaces with data_listeners and the REST api),
- REST api for toppartitions query.
Uses Top-k structure for handling stream summary statistics (based on implementation in C*, see #2811).
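To illustrate what a stream summary does, here is a minimal sketch of a
Space-Saving style counter. It assumes that the Top-k structure follows the
Space-Saving idea; the class and method names are hypothetical, and the real
implementation is C++ and more efficient than this linear-scan version.

```python
class SpaceSaving:
    """Approximate top-k counter that tracks at most `capacity` distinct items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def append(self, item, inc=1):
        if item in self.counts:
            self.counts[item] += inc
        elif len(self.counts) < self.capacity:
            self.counts[item] = inc
        else:
            # Evict the item with the smallest count; the newcomer inherits
            # that count, so counts may overestimate but never underestimate.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            self.counts[item] = floor + inc

    def top(self, k):
        """Return up to k (item, count) pairs, highest count first."""
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
        return ranked[:k]
```

In this sketch, `capacity` bounds memory use and `k` selects how many results
are reported, which is the kind of trade-off a list size and a summary
capacity express.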
What's still missing:
- JMX interface to nodetool (interface customization may be required),
- Querying #rows and #bytes (currently, only #partitions is supported).
Fixes #2811
* https://github.com/avikivity/scylla rafie_toppartitions_v7.1:
top_k: whitespace and minor fixes
top_k: map template arguments
top_k: std::list -> chunked_vector
top_k: support for appending top_k results
nodetool toppartitions: refactor table::config constructor
nodetool toppartitions: data listeners
nodetool toppartitions: add data_listeners to database/table
nodetool toppartitions: fully_qualified_cf_name
nodetool toppartitions: Toppartitions query implementation
nodetool toppartitions: Toppartitions query REST API
nodetool toppartitions: nodetool-toppartitions script
A Python script mimicking the nodetool toppartitions utility, utilizing Scylla REST API.
Examples:
$ ./nodetool-toppartitions --help
usage: nodetool-toppartitions [-h] [-k LIST_SIZE] [-s CAPACITY]
keyspace table duration
Samples database reads and writes and reports the most active partitions in a
specified table
positional arguments:
keyspace Name of keyspace
table Name of column family
duration Query duration in milliseconds
optional arguments:
-h, --help show this help message and exit
-k LIST_SIZE The number of the top partitions to list (default: 10)
-s CAPACITY The capacity of stream summary (default: 256)
$ ./nodetool-toppartitions ks test1 10000
READ
Partition Count
30 2
20 2
10 2
WRITE
Partition Count
30 1
20 1
10 1
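A command line matching the help text above can be declared with argparse
roughly as follows. This is a sketch reconstructed from the usage output, not
the script itself; build_parser() and the dest names are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the usage text of nodetool-toppartitions."""
    parser = argparse.ArgumentParser(
        prog="nodetool-toppartitions",
        description="Samples database reads and writes and reports the most "
                    "active partitions in a specified table",
    )
    parser.add_argument(
        "-k", dest="list_size", type=int, default=10,
        help="The number of the top partitions to list (default: %(default)s)",
    )
    parser.add_argument(
        "-s", dest="capacity", type=int, default=256,
        help="The capacity of stream summary (default: %(default)s)",
    )
    parser.add_argument("keyspace", help="Name of keyspace")
    parser.add_argument("table", help="Name of column family")
    parser.add_argument("duration", type=int, help="Query duration in milliseconds")
    return parser
```

For example, build_parser().parse_args(["ks", "test1", "10000"]) yields the
keyspace, table and duration from the second invocation shown above, with the
default list size and capacity.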
Signed-off-by: Rafi Einstein <rafie@scylladb.com>
The Debian packaging system requires the source package tar file to be
compressed, so let's use .gz compression.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
scripts/create-relocatable-package.py:24:1: F401 'shutil' imported but unused
scripts/create-relocatable-package.py:24:1: F401 'tempfile' imported but unused
scripts/create-relocatable-package.py:24:16: E401 multiple imports on one line
scripts/create-relocatable-package.py:26:1: E302 expected 2 blank lines, found 1
scripts/create-relocatable-package.py:47:1: E305 expected 2 blank lines after class or function definition, found 1
scripts/create-relocatable-package.py:93:6: E225 missing whitespace around operator
Signed-off-by: Alexys Jacob <ultrabug@gentoo.org>
Message-Id: <20180917152520.5032-1-ultrabug@gentoo.org>
A relocatable package contains the Scylla (and iotune)
executables (in a bin/ directory), any libraries they may need (lib/),
the configuration file defaults (conf/), and supporting scripts (dist/).
The libraries are picked up from the host, including libc and the dynamic
linker (ld.so).
We also provide a thunk script that forces the library path
(LD_LIBRARY_PATH) to point at our libraries, and overrides the
interpreter to point at our ld.so.
With these files, it is possible to run a fully functional Scylla
instance on any Linux distribution. This is similar to chroot or
containers, except that we run in the same namespace as the host.
The packages are created by running
ninja build/release/scylla-package.tar
or
ninja --mode debug build/debug/scylla-package.tar
Message-Id: <20180828065352.30730-1-avi@scylladb.com>
This will allow continuous integration to use the optimal number
of compiler jobs, without having to resort to complex calculations
from its scripting environment.
Message-Id: <20180722172050.13148-1-avi@scylladb.com>
scylla_install_pkg was initially written for the one-liner installer, but now
it is only used for creating the AMI, and it is just a few lines of code, so it
should be merged into the scylla_install_ami script.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180612150106.26573-2-syuu@scylladb.com>
This patch adds a scripts/find-maintainer script, similar to
scripts/get_maintainer.pl in Linux, which looks up maintainers and
reviewers for a specific file from a MAINTAINERS file.
Example usage looks as follows:
$ ./scripts/find-maintainer cql3/statements/create_view_statement.cc
CQL QUERY LANGUAGE
Tomasz Grabiec <tgrabiec@scylladb.com> [maintainer]
Pekka Enberg <penberg@scylladb.com> [maintainer]
MATERIALIZED VIEWS
Duarte Nunes <duarte@scylladb.com> [maintainer]
Pekka Enberg <penberg@scylladb.com> [maintainer]
Nadav Har'El <nyh@scylladb.com> [reviewer]
Duarte Nunes <duarte@scylladb.com> [reviewer]
The main objective of this script is to make it easier for people to
find reviewers and maintainers for their patches.
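The lookup can be sketched as follows. The MAINTAINERS layout assumed here
(a title line followed by "M:", "R:" and "F:" entries, with sections separated
by blank lines) is borrowed from the Linux convention and is an assumption,
as are the function names; the actual script may parse the file differently.

```python
import fnmatch


def parse_maintainers(text):
    """Parse sections made of a title line followed by M:, R: and F: entries."""
    sections = []
    current = None
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            # A blank line ends the current section.
            current = None
            continue
        tag, sep, value = line.partition(":")
        if sep and tag in ("M", "R", "F"):
            if current is not None:
                current[tag].append(value.strip())
        else:
            current = {"title": line, "M": [], "R": [], "F": []}
            sections.append(current)
    return sections


def find_maintainers(sections, path):
    """Yield (title, person, role) for each section with an F: pattern matching path."""
    roles = {"M": "maintainer", "R": "reviewer"}
    for section in sections:
        if any(fnmatch.fnmatch(path, pattern) for pattern in section["F"]):
            for tag in ("M", "R"):
                for person in section[tag]:
                    yield section["title"], person, roles[tag]
```

A file can match several sections, which is why the example output above
lists both a CQL section and a materialized views section for one path.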
Message-Id: <20180119075556.31441-1-penberg@scylladb.com>
Now that we can cross build our .rpm/.deb packages, let's extend the AMI build
script to support cross build, too.
Also add Ubuntu 16.04 support, since it is the latest Ubuntu LTS release.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1510247204-2899-1-git-send-email-syuu@scylladb.com>
This fix splits build_ami.sh --repo into three different options:
--repo-for-install is for Scylla package installation, only valid
during AMI construction.
--repo-for-update will be stored at /etc/yum.repos.d/scylla.repo, to
receive package updates on the AMI.
--repo is both, for installation and update.
Fixes #1872
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1480438858-6007-1-git-send-email-syuu@scylladb.com>
This patch adds an `update-version` script for updating the Scylla
version number in `SCYLLA-VERSION-GEN` file and committing the change to
git.
Example use:
$ ./scripts/update-version 1.4.0
which results in the following git commit:
commit 4599c16d9292d8d9299b40a3e44ef7ee80e3c3cf
Author: Pekka Enberg <penberg@scylladb.com>
Date: Fri Oct 28 10:24:52 2016 +0300
release: prepare for 1.4.0
diff --git a/SCYLLA-VERSION-GEN b/SCYLLA-VERSION-GEN
index 753c982..eba2da4 100755
--- a/SCYLLA-VERSION-GEN
+++ b/SCYLLA-VERSION-GEN
@@ -1,6 +1,6 @@
#!/bin/sh
-VERSION=666.development
+VERSION=1.4.0
if test -f version
then
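The edit shown in the diff above amounts to rewriting a single line. A
minimal sketch, assuming the file keeps one top-level VERSION= assignment;
set_version() and commit_subject() are hypothetical names and the real script
may be written differently:

```python
import re


def set_version(text, new_version):
    """Replace the first VERSION=... line in the SCYLLA-VERSION-GEN text."""
    updated, count = re.subn(
        r"^VERSION=.*$",
        lambda match: "VERSION=" + new_version,
        text,
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("no VERSION= line found")
    return updated


def commit_subject(new_version):
    """Build the commit subject used in the example commit."""
    return "release: prepare for " + new_version
```

The result would then be written back to the file and committed with the
subject returned by commit_subject().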
Message-Id: <1477639560-10896-1-git-send-email-penberg@scylladb.com>
There was no way to set up the correct repo when the AMI is built with the --localrpm option, since the AMI does not have access to the 'version' file and we did not pass the repo URL to the AMI.
So detect the optimal repo path when starting to build the AMI, pass the repo URL to the AMI, and set it up correctly.
Note: this changes the behavior of the --repo option of build_ami.sh/scylla_install_pkg.
It used to be a repository URL, but is now a .repo/.list file URL.
This is optimal for distributions which require 3rdparty packages to install scylla, like CentOS7.
Existing shell scripts which invoke build_ami.sh, such as our Jenkins jobs, need to be changed accordingly.
Fixes #1414
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1469636377-17828-1-git-send-email-syuu@scylladb.com>
Since we added the scylla-conf package, we cannot install scylla-server/-tools without it, and because of this --localrpm is failing.
So copy the scylla-conf package to the AMI and install it to fix the problem.
We chose #!/bin/sh, not bash, for the shebang when we started to implement the installer scripts.
After we started to work on Ubuntu, we found that we had mistakenly used bash syntax in the AMI script, which caused errors since /bin/sh is dash on Ubuntu.
So we changed the shebang to /bin/bash for that script, and since then we have had both sh scripts and bash scripts.
(2f39e2e269)
If we use bash syntax in sh scripts, they won't work on Ubuntu but will work on Fedora/CentOS, which can be very confusing.
So switch all scripts to #!/bin/bash. It is much safer.
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1460594643-30666-1-git-send-email-syuu@scylladb.com>
scylla-ami.sh moved some AMI-specific files. This part was
dropped when converging scylla-ami into scylla_install. Fix that.
Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
$NAME is the full name of the distribution, which is too long for use in scripts.
$ID is the shortened one, which is more useful.
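The difference is easy to see when reading the two fields side by side. The
sketch below assumes the variables come from an os-release style file, which
the message does not state; parse_os_release() is a hypothetical helper and
the sample values are only an illustration.

```python
import shlex


def parse_os_release(text):
    """Parse KEY=value lines, removing shell-style quoting from the values."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parts = shlex.split(value)
        info[key] = parts[0] if parts else ""
    return info


# Illustrative sample, not taken from a real system.
SAMPLE = 'NAME="CentOS Linux"\nID="centos"\nVERSION_ID="7"\n'
```

With this sample, the NAME field holds a display string containing a space,
while the ID field is a short lowercase token that is convenient to compare
against in a script.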
Signed-off-by: Takuya ASADA <syuu@scylladb.com>