Compare commits

...

58 Commits

Author SHA1 Message Date
yaronkaikov
bebfd7b26c release: prepare for 3.1.0.rc3 2019-07-25 12:15:55 +03:00
Tomasz Grabiec
03b48b2caf database: Add missing partition slicing on streaming reader recreation
streaming_reader_lifecycle_policy::create_reader() was ignoring the
partition_slice passed to it and always creating the reader for the
full slice.

That's wrong because create_reader() is called when recreating a
reader after it's evicted. If the reader stopped in the middle of
partition we need to start from that point. Otherwise, fragments in
the mutation stream will appear duplicated or out of order, violating
assumptions of the consumers.

This was observed to result in repair writing incorrect sstables with
duplicated clustering rows, which results in
malformed_sstable_exception on read from those sstables.

Fixes #4659.
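A minimal sketch of the idea, with hypothetical names (not the actual streaming_reader_lifecycle_policy code): the recreate path must forward the caller's slice so that a reader evicted mid-partition resumes where it stopped, while a new overload keeps non-slicing callers unchanged.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: the recreate path forwards the caller's
// partition slice instead of always using the full slice.
struct partition_slice {
    std::string resume_from;  // empty means "full slice"
};

struct reader {
    partition_slice slice;
};

// A reader evicted mid-partition is recreated with the slice that
// describes where it stopped, so it resumes from that point.
reader create_reader(const partition_slice& slice) {
    return reader{slice};
}

// v2 overload for existing callers that never slice: full slice.
reader create_reader() {
    return create_reader(partition_slice{});
}
```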

In v2:

  - Added an overload without partition_slice to avoid changing existing users which never slice

Tests:

  - unit (dev)
  - manual (3 node ccm + repair)

Backport: 3.1
Reviewed-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <1563451506-8871-1-git-send-email-tgrabiec@scylladb.com>
(cherry picked from commit 7604980d63)
2019-07-22 15:08:09 +03:00
Avi Kivity
95362624bc Merge "Fix disable_sstable_write synchronization with on_compaction_completion" from Benny
"
disable_sstable_write needs to acquire _sstable_deletion_sem to properly synchronize
with background deletions done by on_compaction_completion to ensure no sstables will
be created or deleted during reshuffle_sstables after
storage_service::load_new_sstables disables sstable writes.

Fixes #4622

Test: unit(dev), nodetool_additional_test.py migration_test.py
"

* 'scylla-4622-fix-disable-sstable-write' of https://github.com/bhalevy/scylla:
  table: document _sstables_lock/_sstable_deletion_sem locking order
  table: disable_sstable_write: acquire _sstable_deletion_sem
  table: uninline enable_sstable_write
  table: reshuffle_sstables: add log message

(cherry picked from commit 43690ecbdf)
2019-07-22 13:47:25 +03:00
Asias He
7865c314a5 repair: Avoid deadlock in remove_repair_meta
Start n1, n2
Create ks with rf = 2
Run repair on n2
Stop n2 in the middle of repair
n1 will notice n2 is DOWN, gossip handler will remove repair instance
with n2 which calls remove_repair_meta().

Inside remove_repair_meta(), we have:

```
1        return parallel_for_each(*repair_metas, [repair_metas] (auto& rm) {
2            return rm->stop();
3        }).then([repair_metas, from] {
4            rlogger.debug("Removed all repair_meta for single node {}", from);
5        });
```

Since 3.1, we start 16 repair instances in parallel, which will create
16 readers. The reader semaphore limit is 10.

At line 2, it calls

```
6    future<> stop() {
7       auto gate_future = _gate.close();
8       auto writer_future = _repair_writer.wait_for_writer_done();
9       return when_all_succeed(std::move(gate_future), std::move(writer_future));
10    }
```

The gate protects the reader while it reads data from disk:

```
11 with_gate(_gate, [] {
12   read_rows_from_disk
13        return _repair_reader.read_mutation_fragment() --> calls reader() to read data
14 })
```

So line 7 won't return until all 16 readers return from the call to
reader().

The problem is, the reader won't release the reader semaphore until the
reader is destroyed!
So even if 10 of the 16 readers have finished reading, they won't
release the semaphore. As a result, stop() hangs forever.

To fix in the short term, we can delete the reader, i.e., drop the
repair_meta object once it is stopped.
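A minimal sketch of the deadlock mechanics, with hypothetical names: the reader holds its semaphore units via RAII and returns them only on destruction, so stopping a repair_meta is not enough — it must be dropped.

```cpp
#include <cassert>
#include <memory>

// Hypothetical sketch: units come back only when the reader is destroyed.
struct semaphore {
    int available;
};

struct reader {
    semaphore* sem;
    explicit reader(semaphore& s) : sem(&s) { --sem->available; }
    ~reader() { ++sem->available; }  // released only on destruction
};

struct repair_meta {
    std::unique_ptr<reader> rd;
    void stop() {
        // close gates, wait for the writer; the reader object stays alive
    }
};
```

Dropping the repair_meta (and hence its reader) is what actually frees a unit for the waiting readers.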

Refs: #4693
(cherry picked from commit 8774adb9d0)
2019-07-21 13:31:08 +03:00
Asias He
0e6b62244c streaming: Do not open rpc stream connection if ranges are not relevant to a shard
Given a list of ranges to stream, stream_transfer_task will create a
reader over the ranges and open an rpc stream connection on all the shards.

When the user provides ranges to repair with the -st and -et options,
e.g. using scylla-manager, such ranges can belong to only one shard;
repair will pass such ranges to streaming.

As a result, only one shard will have data to send while the rpc stream
connections are created on all the shards, which can cause the kernel to
run out of ports on some systems.

To mitigate the problem, do not open the connection if the ranges do not
belong to the shard at all.
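A minimal sketch of the check, with a toy shard mapping (the real mapping uses the token metadata, not a modulo):

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: before opening an rpc stream on a shard, check
// whether any requested token range maps to that shard at all.
struct token_range { long start, end; };

// Toy shard mapping: owner shard of a range's start token.
unsigned shard_of(const token_range& r, unsigned shard_count) {
    return static_cast<unsigned>(r.start % shard_count);
}

bool shard_has_ranges(const std::vector<token_range>& ranges,
                      unsigned shard, unsigned shard_count) {
    for (const auto& r : ranges) {
        if (shard_of(r, shard_count) == shard) {
            return true;  // this shard has data to stream
        }
    }
    return false;  // skip opening the connection entirely
}
```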

Refs: #4708
(cherry picked from commit 64a4c0ede2)
2019-07-21 10:23:49 +03:00
Kamil Braun
9d722a56b3 Fix timestamp_type_impl::timestamp_from_string.
Now it accepts the 'z' or 'Z' timezone, denoting UTC+00:00.
Fixes #4641.
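A minimal sketch of the accepted change, assuming (hypothetically) that normalization happens before the usual numeric-offset parsing:

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: a trailing 'z' or 'Z' denotes UTC+00:00 and is
// rewritten to the explicit "+0000" offset before parsing continues.
std::string normalize_timezone(std::string ts) {
    if (!ts.empty() && (ts.back() == 'z' || ts.back() == 'Z')) {
        ts.pop_back();
        ts += "+0000";
    }
    return ts;
}
```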

Signed-off-by: Kamil Braun <kbraun@scylladb.com>
(cherry picked from commit 4417e78125)
2019-07-17 21:54:47 +03:00
Eliran Sinvani
7009d5fb23 auth: Prevent race between role_manager and password_authenticator
When Scylla is started for the first time with PasswordAuthenticator
enabled, a record for the default superuser may be created in the
table with can_login and is_superuser set to null. This happens
because the module in charge of creating the row is the role manager,
while the module in charge of setting the default password's salted
hash is the password authenticator. The two modules are started
together; if the password authenticator finishes its initialization
first, then until the role manager completes its own initialization
the row contains those null columns, and any login attempt during
this window causes a memory access violation, since those columns are
never expected to be null. This patch removes the race by starting
the password authenticator and authorizer only after the role manager
has finished its initialization.
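A minimal sketch of the ordering fix, with hypothetical names: the authenticator refuses to start until the role manager has fully populated the superuser row.

```cpp
#include <cassert>

// Hypothetical sketch: start order guarantees the row is complete.
struct role_manager {
    bool row_complete = false;
    void start() { row_complete = true; }  // creates the superuser row
};

struct password_authenticator {
    bool started = false;
    // Starting now requires a fully initialized role manager.
    bool start(const role_manager& rm) {
        if (!rm.row_complete) {
            return false;  // would have raced before the fix
        }
        started = true;
        return true;
    }
};
```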

Tests:
  1. Unit tests (release)
  2. Auth and cqlsh auth related dtests.

Fixes #4226

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <20190714124839.8392-1-eliransin@scylladb.com>
(cherry picked from commit 997a146c7f)
2019-07-15 21:18:05 +03:00
Takuya ASADA
eb49fae020 reloc: provide libthread_db.so.1 to debug thread on gdb
In scylla-debuginfo package, we have /usr/lib/debug/opt/scylladb/libreloc/libthread_db-1.0.so-666.development-0.20190711.73a1978fb.el7.x86_64.debug
but we actually do not have libthread_db.so.1 in /opt/scylladb/libreloc,
since it does not appear in the ldd output for the scylla binary.

To debug threads, we need to add the library to the relocatable package manually.

Fixes #4673

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190711111058.7454-1-syuu@scylladb.com>
(cherry picked from commit 842f75d066)
2019-07-15 14:50:56 +03:00
Asias He
92bf928170 repair: Allow repair when a replica is down
Since commit bb56653 (repair: Sync schema from follower nodes before
repair), the behaviour of handling down node during repair has been
changed.  That is, if a repair follower is down, it will fail to sync
schema with it and the repair of the range will be skipped. This means
a range cannot be repaired unless all of its replica nodes are up.

To fix, we filter out the nodes that are down, mark the repair as
partial, and repair with the nodes that are still up.
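A minimal sketch of the filtering step, with hypothetical names (the real code consults the gossiper for liveness):

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch: keep only live replicas and flag the repair as
// partial when anything was dropped.
struct repair_plan {
    std::vector<std::string> nodes;
    bool partial = false;
};

repair_plan plan_repair(const std::vector<std::string>& replicas,
                        const std::set<std::string>& live) {
    repair_plan plan;
    for (const auto& n : replicas) {
        if (live.count(n)) {
            plan.nodes.push_back(n);
        } else {
            plan.partial = true;  // at least one replica is down
        }
    }
    return plan;
}
```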

Tests: repair_additional_test:RepairAdditionalTest.repair_with_down_nodes_2b_test
Fixes: #4616
Backports: 3.1

Message-Id: <621572af40335cf5ad222c149345281e669f7116.1562568434.git.asias@scylladb.com>
(cherry picked from commit 39ca044dab)
2019-07-11 11:44:49 +03:00
Rafael Ávila de Espíndola
deac0b0e94 mc writer: Fix exception safety when closing _index_writer
This fixes a possible cause of #4614.

From the backtrace in that issue, it looks like a file is being closed
twice. The first point in the backtrace where that seems likely is in
the MC writer.

My first idea was to add a writer::close and make it the responsibility
of the code using the writer to call it. That way we would move work
out of the destructor.

That is a bit hard since the writer is destroyed from
flat_mutation_reader::impl::~consumer_adapter and that would need to
get a close function too.

This patch instead just fixes an exception safety issue. If
_index_writer->close() throws, _index_writer is still valid and
~writer will try to close it again.

If the exception was thrown after _completed.set_value(), that would
explain the assert about _completed.set_value() being called twice.

With this patch the path outside of the destructor now moves the
writer to a local variable before trying to close it.
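A minimal sketch of the exception-safety pattern, with hypothetical names: moving the member out before closing means a throwing close() leaves nothing for the destructor to close a second time.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical sketch of the double-close hazard and its fix.
struct index_writer {
    int close_calls = 0;
    bool fail_close = false;
    void close() {
        ++close_calls;
        if (fail_close) throw std::runtime_error("close failed");
    }
};

struct writer {
    std::shared_ptr<index_writer> _index_writer;

    void finish() {
        auto local = std::move(_index_writer);  // member is now empty
        local->close();                         // may throw; that's fine
    }

    ~writer() {
        if (_index_writer) {  // only close if finish() never ran
            _index_writer->close();
        }
    }
};
```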

Fixes #4614
Message-Id: <20190710171747.27337-1-espindola@scylladb.com>

(cherry picked from commit 281f3a69f8)
2019-07-11 11:42:48 +03:00
kbr-
c294000113 Implement tuple_type_impl::to_string_impl. (#4645)
Resolves #4633.

Signed-off-by: Kamil Braun <kbraun@scylladb.com>
(cherry picked from commit 8995945052)
2019-07-08 11:08:10 +03:00
Avi Kivity
18bb2045aa Update seastar submodule
* seastar 4cdccae53b...82637dcab4 (1):
  > perftune.py: fix the i3 metal detection pattern

Ref #4057.
2019-07-02 13:49:21 +03:00
Avi Kivity
5e3276d08f Update seastar submodule to point to scylla-seastar.git
This allows us to add 3.1 specific patches to Seastar.
2019-07-02 13:47:50 +03:00
Piotr Sarna
acff367ea8 main: stop view builder conditionally
The view builder is started only if it's enabled in config,
via the view_building=true variable. Unfortunately, stopping
the builder was unconditional, which may result in failed
assertions during shutdown. To remedy this, view building
is stopped only if it was previously started.

Fixes #4589

(cherry picked from commit efa7951ea5)
2019-06-26 10:45:50 +03:00
Tomasz Grabiec
e39724a343 Merge "Sync schema before repair" from Asias
This series makes sure new schema is propagated to repair master and
follower nodes before repair.

Fixes #4575

* dev.git asias/repair_pull_schema_v2:
  migration_manager: Add sync_schema
  repair: Sync schema from follower nodes before repair

(cherry picked from commit 269e65a8db)
2019-06-26 09:35:42 +02:00
Asias He
31c4db83d8 repair: Avoid searching all the rows in to_repair_rows_on_wire
The repair_rows in row_list are sorted. The current repair_row can only
share a partition key with the last repair_row inserted into
repair_rows_on_wire. So there is no need to search from the beginning
of repair_rows_on_wire, which had quadratic complexity. To fix, look
only at the last item in repair_rows_on_wire.
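A minimal sketch of the optimization, with hypothetical names: since input is sorted, only back() can share the partition key with the incoming row.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical sketch: rows arrive sorted, so checking only the last
// appended entry replaces the linear search.
struct wire_partition {
    int pk;
    std::vector<int> rows;  // clustering rows for this partition
};

void append_row(std::list<wire_partition>& on_wire, int pk, int row) {
    if (!on_wire.empty() && on_wire.back().pk == pk) {
        on_wire.back().rows.push_back(row);  // same partition as last
    } else {
        on_wire.push_back({pk, {row}});      // new partition entry
    }
}
```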

Fixes #4580
Message-Id: <08a8bfe90d1a6cf16b67c210151245879418c042.1561001271.git.asias@scylladb.com>

(cherry picked from commit b99c75429a)
2019-06-25 12:48:37 +02:00
Tomasz Grabiec
433cb93f7a Merge "Use same schema version for repair nodes" from Asias
This patch set fixes repair nodes using different schema versions, and
optimizes the hashing by exploiting the fact that all nodes now use the
same schema version.

Fixes: #4549

* seastar-dev.git asias/repair_use_same_schema.v3:
  repair: Use the same schema version for repair master and followers
  repair: Hash column kind and id instead of column name and type name

(cherry picked from commit cd1ff1fe02)
2019-06-23 20:57:16 +03:00
Avi Kivity
f553819919 Merge "Fix infinite paging for indexed queries" from Piotr
"
Fixes #4569

This series fixes the infinite paging for indexed queries issue.
Before this fix, paged index queries tended to end up in an infinite
loop of returning pages with 0 results but with the has_more_pages flag
set to true, which confused the drivers.

Tests: unit(dev)
Branches: 3.0, 3.1
"

* 'fix_infinite_paging_for_indexed_queries' of https://github.com/psarna/scylla:
  tests: add test case for finishing index paging
  cql3: fix infinite paging for indexed queries

(cherry picked from commit 9229afe64f)
2019-06-23 20:54:27 +03:00
Nadav Har'El
48c34e7635 storage_proxy: fix race and crash in case of MV and other node shutdown
Recently, in merge commit 2718c90448,
we added the ability to cancel pending view-update requests when we detect
that the target node went down. This is important for view updates because
these have a very long timeout (5 minutes), and we wanted to make this
timeout even longer.

However, the implementation caused a race: Between *creating* the update's
request handler (create_write_response_handler()) and actually starting
the request with this handler (mutate_begin()), there is a preemption point
and we may end up deleting the request handler before starting the request.
So mutate_begin() must gracefully handle the case of a missing request
handler, and not crash with a segmentation fault as it did before this patch.
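A minimal sketch of the defensive lookup, with hypothetical names: mutate_begin treats a missing handler as "already cancelled" instead of dereferencing it.

```cpp
#include <cassert>
#include <map>

// Hypothetical sketch: between handler creation and mutate_begin there
// is a preemption point, so the handler may already be gone.
struct handler { int id; };

struct handler_registry {
    std::map<int, handler> handlers;

    bool mutate_begin(int id) {
        auto it = handlers.find(id);
        if (it == handlers.end()) {
            return false;  // handler was cancelled; nothing to start
        }
        // ... start the write with it->second ...
        return true;
    }
};
```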

Eventually the lifetime management of request handlers could be refactored
to avoid this delicate fix (which requires more comments to explain than
code), or even better, it would be more correct to cancel individual writes
when a node goes down, not drop the entire handler (see issue #4523).
However, for now, let's not make such invasive changes and just fix the
bug that we set out to fix.

Fixes #4386.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190620123949.22123-1-nyh@scylladb.com>
(cherry picked from commit 6e87bca65d)
2019-06-23 20:53:01 +03:00
Nadav Har'El
7f85b30941 Fix deciding whether a query uses indexing
The code that decides whether a query should use indexing was buggy: a partition key index might have influenced the decision even when the whole partition key was given in the query (in which case using the index is unnecessary).

Fixes #4539

Closes https://github.com/scylladb/scylla/pull/4544

Merged from branch 'fix_deciding_whether_a_query_uses_indexing' of git://github.com/psarna/scylla
  tests: add case for partition key index and filtering
  cql3: fix deciding if a query uses indexing

(cherry picked from commit 6aab1a61be)
2019-06-18 13:25:18 +03:00
Hagit Segev
7d14514b8a release: prepare for 3.1.0.rc2 2019-06-16 20:28:31 +03:00
Piotr Sarna
35f906f06f tests: add a test case for filtering clustering key
The test case makes sure that clustering key restriction
columns are fetched for filtering if they form a clustering key prefix,
but not a primary key prefix (partition key columns are missing).

Ref #4541
Message-Id: <3612dc1c6c22c59ac9184220a2e7f24e8d18407c.1560410018.git.sarna@scylladb.com>

(cherry picked from commit 2c2122e057)
2019-06-16 14:36:52 +03:00
Piotr Sarna
2c50a484f5 cql3: fix qualifying clustering key restrictions for filtering
Clustering key restrictions can sometimes avoid filtering if they form
a prefix, but that can happen only if the whole partition key is
restricted as well.

Ref #4541
Message-Id: <9656396ee831e29c2b8d3ad4ef90c4a16ab71f4b.1560410018.git.sarna@scylladb.com>

(cherry picked from commit c4b935780b)
2019-06-16 14:36:52 +03:00
Piotr Sarna
24ddb46707 cql3: fix fetching clustering key columns for filtering
When a column is not present in the select clause, but used for
filtering, it usually needs to be fetched from replicas.
Sometimes it can be avoided, e.g. if primary key columns form a valid
prefix - then, they will be optimized out before filtering itself.
However, clustering key prefix can only be qualified for this
optimization if the whole partition key is restricted - otherwise
the clustering columns still need to be present for filtering.

This commit also fixes tests in cql_query_test suite, because they now
expect more values - columns fetched for filtering will be present as
well (only internally, the clients receive only data they asked for).

Fixes #4541
Message-Id: <f08ebae5562d570ece2bb7ee6c84e647345dfe48.1560410018.git.sarna@scylladb.com>

(cherry picked from commit adeea0a022)
2019-06-16 14:36:52 +03:00
Dejan Mircevski
f2fc3f32af tests: Add cquery_nofail() utility
Most tests await the result of cql_test_env::execute_cql().  Most
would also benefit from reporting errors with top-level location
included.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
(cherry picked from commit a9849ecba7)
2019-06-16 14:36:52 +03:00
Asias He
c9f488ddc2 repair: Avoid writing row with same partition key and clustering key more than once
Consider

   master: row(pk=1, ck=1, col=10)
follower1: row(pk=1, ck=1, col=20)
follower2: row(pk=1, ck=1, col=30)

When repair runs, master fetches row(pk=1, ck=1, col=20) and row(pk=1,
ck=1, col=30) from follower1 and follower2.

Then repair master sends row(pk=1, ck=1, col=10) and row(pk=1, ck=1,
col=30) to follower1, follower1 will write the row with the same
pk=1, ck=1 twice, which violates uniqueness constraints.

To fix, we merge a row with the same pk and ck into the previous row.
We only need this on the repair follower, because the rows can come
from multiple nodes. On the repair master, we have an sstable writer
per follower, so the rows fed into each sstable writer come from only
a single node.

Tests: repair_additional_test.py:RepairAdditionalTest.repair_same_row_diff_value_3nodes_test
Fixes: #4510
Message-Id: <cb4fbba1e10fb0018116ffe5649c0870cda34575.1560405722.git.asias@scylladb.com>
(cherry picked from commit 9079790f85)
2019-06-16 10:23:58 +03:00
Asias He
46498e77b8 repair: Allow repair_row to initialize partially
On the repair follower node, only the decorated_key_with_hash and the
mutation_fragment inside repair_row are used in apply_rows() to apply
the rows to disk. Allow repair_row to initialize partially, and throw
if an uninitialized member is accessed, to be safe.
Message-Id: <b4e5cc050c11b1bafcf997076a3e32f20d059045.1560405722.git.asias@scylladb.com>

(cherry picked from commit 912ce53fc5)
2019-06-16 10:23:50 +03:00
Piotr Jastrzebski
440f33709e sstables: distinguish empty and missing cellpath
Before this patch the mc sstables writer ignored empty
cellpaths. This is wrong behaviour because it is possible
to have an empty key in a map. In such a case, our writer
creates a broken sstable that we can't read back. This is
because a complex cell expects a cellpath for each simple
cell it has. When the writer ignores an empty cellpath it
writes nothing, whereas it should write a length of zero to
the file so that we know there's an empty cellpath.
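A minimal sketch of the serialization rule, with hypothetical names (the real format uses varint lengths, simplified here to one byte):

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: an empty cell path is a valid value (e.g. an
// empty map key) and must be written as an explicit zero length rather
// than skipped, so the reader still finds one cell path per cell.
void write_cell_path(std::vector<uint8_t>& out, const std::string& path) {
    out.push_back(static_cast<uint8_t>(path.size()));  // length, may be 0
    out.insert(out.end(), path.begin(), path.end());
}
```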

Fixes #4533

Tests: unit(release)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <46242906c691a56a915ca5994b36baf87ee633b7.1560532790.git.piotr@scylladb.com>
(cherry picked from commit a41c9763a9)
2019-06-16 09:04:24 +03:00
Pekka Enberg
34696e1582 dist/docker: Switch to 3.1 release repository 2019-06-14 08:10:02 +03:00
Takuya ASADA
43bb290705 dist/docker/redhat: change user of scylla services to 'scylla'
On branch-3.1 / master, we are getting following error:

ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/data: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/commitlog: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)
ERROR 2019-06-11 10:58:49,156 [shard 0] database - /var/lib/scylla/view_hints: File not owned by current euid: 0. Owner is: 999
ERROR 2019-06-11 10:58:49,156 [shard 0] init - Failed owner and mode verification: std::runtime_error (File not owned by current euid: 0. Owner is: 999)

It seems the owner verification of the data directory fails because the
scylla-server process is running as root while the data directory is
owned by scylla, so we should run the services as the scylla user.

Fixes #4536
Message-Id: <20190611113142.23599-1-syuu@scylladb.com>

(cherry picked from commit b1226fb15a)
2019-06-14 08:02:45 +03:00
Calle Wilund
53980816de api.hh: Fix bool parsing in req_param
Fixes #4525

req_param uses boost::lexical_cast to convert text to the variable's
type. However, lexical_cast does not handle textual booleans, so
param=true causes not only wrong values but also exceptions.
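A minimal sketch of the fix, with a hypothetical helper name: handle the textual boolean forms explicitly instead of relying on a numeric-only lexical cast, which throws on "true".

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical sketch: accept both textual and numeric boolean forms.
bool parse_bool_param(const std::string& s) {
    if (s == "true" || s == "1") {
        return true;
    }
    if (s == "false" || s == "0") {
        return false;
    }
    throw std::invalid_argument("not a boolean: " + s);
}
```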

Message-Id: <20190610140511.15478-1-calle@scylladb.com>
(cherry picked from commit 26702612f3)
2019-06-13 11:56:27 +03:00
Vlad Zolotarov
c1f4617530 fix_system_distributed_tables.py: declare the 'port' argument as 'int'
If the port value is passed as a string, cluster.connect() fails with
Python 3.4.

Let's fix this by explicitly declaring a 'port' argument as 'int'.

Fixes #4527

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20190606133321.28225-1-vladz@scylladb.com>
(cherry picked from commit 20a610f6bc)
2019-06-13 11:45:54 +03:00
Raphael S. Carvalho
efde9416ed sstables: fix log of failure on large data entry deletion by fixing use-after-move
Fixes #4532.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190527200828.25339-1-raphaelsc@scylladb.com>
(cherry picked from commit 62aa0ea3fa)
2019-06-13 11:44:22 +03:00
Hagit Segev
224f9cee7e release: prepare for 3.1.0.rc1 2019-06-06 18:16:06 +03:00
Hagit Segev
cd1d13f805 release: prepare for 3.1.rc1 2019-06-06 15:32:54 +03:00
Pekka Enberg
899291bc9b relocate_python_scripts.py: Fix node-exporter install on Debian variants
The relocatable Python is built from Fedora packages. Unfortunately TLS
certificates are in a different location on Debian variants, which
causes "node_exporter_install" to fail as follows:

  Traceback (most recent call last):
    File "/usr/lib/scylla/libexec/node_exporter_install", line 58, in <module>
      data = curl('https://github.com/prometheus/node_exporter/releases/download/v{version}/node_exporter-{version}.linux-amd64.tar.gz'.format(version=VERSION), byte=True)
    File "/usr/lib/scylla/scylla_util.py", line 40, in curl
      with urllib.request.urlopen(req) as res:
    File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 222, in urlopen
      return opener.open(url, data, timeout)
    File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 525, in open
      response = self._open(req, data)
    File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 543, in _open
      '_open', req)
    File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 503, in _call_chain
      result = func(*args)
    File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 1360, in https_open
      context=self._context, check_hostname=self._check_hostname)
    File "/opt/scylladb/python3/lib64/python3.7/urllib/request.py", line 1319, in do_open
      raise URLError(err)
  urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>
  Unable to retrieve version information
  node exporter setup failed.

Fix the problem by overriding the SSL_CERT_FILE environment variable to
point to the correct location of the TLS bundle.

Message-Id: <20190604175434.24534-1-penberg@scylladb.com>
(cherry picked from commit eb00095bca)
2019-06-05 22:20:06 +03:00
Paweł Dziepak
4130973f51 tests/perf_fast_forward: report average number of aio operations
perf_fast_forward is used to detect performance regressions. The two
main metrics used for this are fragments per second and the number of
IO operations. The former is a median of several runs, but the latter
is just the actual number of asynchronous IO operations performed in
the run that happened to be picked as the frag/s-wise median. There is
not always a direct correlation between frag/s and aio, and the aio
count can vary, which makes it hard to compare.

In order to make this easier a new metric was introduced: "average aio"
which reports the average number of asynchronous IO operations performed
in a run. This should produce much more stable results and therefore
make the comparison more meaningful.
Message-Id: <20190430134401.19238-1-pdziepak@scylladb.com>

(cherry picked from commit 51e98e0e11)
2019-06-05 16:36:09 +03:00
Takuya ASADA
24e2c72888 dist/debian: support relocatable python3 on Debian variants
Unlike CentOS, Debian variants have a python3 package in the official
repository, so we don't have to use relocatable python3 on these
distributions. However, the official python3 version differs between
distributions, which may cause issues.
Also, our scripts and packaging implementation increasingly presuppose
the existence of relocatable python3, which causes issues on Debian
variants.

Switching to relocatable python3 on Debian variants avoids these
issues and makes it easier to manage Scylla python3 environments
across multiple distributions.

Fixes #4495

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190531112707.20082-1-syuu@scylladb.com>
(cherry picked from commit 25112408a7)
2019-06-03 17:42:26 +03:00
Raphael S. Carvalho
69cc7d89c8 compaction: do not unconditionally delete a new sstable in interrupted compaction
After incremental compaction, new sstables may have already replaced old
sstables at any point. That means a new sstable may be in use by the
table and an old sstable may already be deleted while the compaction
itself is UNFINISHED.
Therefore, we should *NEVER* delete a new sstable unconditionally for an
interrupted compaction, or data loss could happen.
To fix it, we'll only delete new sstables that didn't replace anything
in the table, meaning they are unused.
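A minimal sketch of the rule, with hypothetical names: only new sstables that never replaced old ones (i.e. are not in use by the table) are candidates for deletion.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical sketch: after an interrupted incremental compaction,
// delete only the new sstables that are not in use by the table.
std::vector<int> sstables_safe_to_delete(const std::vector<int>& new_ssts,
                                         const std::set<int>& in_use) {
    std::vector<int> unused;
    for (int s : new_ssts) {
        if (!in_use.count(s)) {
            unused.push_back(s);  // never replaced anything: safe to drop
        }
    }
    return unused;
}
```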

Found the problem while auditing the code.

Fixes #4479.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20190506134723.16639-1-raphaelsc@scylladb.com>
(cherry picked from commit ef5681486f)
2019-06-03 16:00:20 +03:00
Avi Kivity
5f6c5d566a Revert "dist/debian: support relocatable python3 on Debian variants"
This reverts commit 1fbab82553. Breaks build_deb.sh:

18:39:56 +	seastar/scripts/perftune.py seastar/scripts/seastar-addr2line seastar/scripts/perftune.py
18:39:56 Traceback (most recent call last):
18:39:56   File "./relocate_python_scripts.py", line 116, in <module>
18:39:56     fixup_scripts(archive, args.scripts)
18:39:56   File "./relocate_python_scripts.py", line 104, in fixup_scripts
18:39:56     fixup_script(output, script)
18:39:56   File "./relocate_python_scripts.py", line 79, in fixup_script
18:39:56     orig_stat = os.stat(script)
18:39:56 FileNotFoundError: [Errno 2] No such file or directory: '/data/jenkins/workspace/scylla-master/unified-deb/scylla/build/debian/scylla-package/+'
18:39:56 make[1]: *** [debian/rules:19: override_dh_auto_install] Error 1
2019-05-29 14:00:29 +03:00
Takuya ASADA
f32aea3834 reloc/python3: add license files on relocatable python3 package
It's better to have license files on our python3 distribution.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190516094329.13273-1-syuu@scylladb.com>
(cherry picked from commit 4b08a3f906)
2019-05-29 13:59:38 +03:00
Takuya ASADA
933260cb53 dist/ami: output scylla version information to AMI tags and description
Users may want to know which package versions are used for the AMI;
it's good to have them in the AMI tags and description.

To do this, we need to download the .rpm from the specified .repo and
extract the version information from it.

Fixes #4499

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190520123924.14060-2-syuu@scylladb.com>
(cherry picked from commit a55330a10b)
2019-05-29 13:59:38 +03:00
Takuya ASADA
f8ff0e1993 dist/ami: build scylla-python3 when specified --localrpm
Since we switched to relocatable python3, we need to build it for AMI too.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190520123924.14060-1-syuu@scylladb.com>
(cherry picked from commit abe44c28c5)
2019-05-29 13:59:38 +03:00
Takuya ASADA
1fbab82553 dist/debian: support relocatable python3 on Debian variants
Unlike CentOS, Debian variants have a python3 package in the official
repository, so we don't have to use relocatable python3 on these
distributions. However, the official python3 version differs between
distributions, which may cause issues.
Also, our scripts and packaging implementation increasingly presuppose
the existence of relocatable python3, which causes issues on Debian
variants.

Switching to relocatable python3 on Debian variants avoids these
issues and makes it easier to manage Scylla python3 environments
across multiple distributions.

Fixes #4495

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190526105138.677-1-syuu@scylladb.com>
(cherry picked from commit 4d119cbd6d)
2019-05-26 15:40:56 +03:00
Paweł Dziepak
c664615960 Merge "Fix empty counters handling in MC" from Piotr
"
Before this patchset empty counters were incorrectly persisted for
MC format. No value was written to disk for them. The correct way
is to still write a header that indicates the counter is empty.

We also need to make sure that reading wrongly persisted empty
counters works because customers may have sstables with wrongly
persisted empty counters.

Fixes #4363
"

* 'haaawk/4363/v3' of github.com:scylladb/seastar-dev:
  sstables: add test for empty counters
  docs: add CorrectEmptyCounters to sstable-scylla-format
  sstables: Add a feature for empty counters in Scylla.db.
  sstables: Write header for empty counters
  sstables: Remove unused variables in make_counter_cell
  sstables: Handle empty counter value in read path

(cherry picked from commit 899ebe483a)
2019-05-23 22:15:00 +03:00
Benny Halevy
6a682dc5a2 cql3: select_statement: provide default initializer for parameters::_bypass_cache
Fixes #4503

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190521143300.22753-1-bhalevy@scylladb.com>
(cherry picked from commit fae4ca756c)
2019-05-23 08:28:01 +03:00
Gleb Natapov
c1271d08d3 cache_hitrate_calculator: make cache hitrate calculation preemptable
The calculation is done in a non-preemptable loop over all tables, so
if the number of tables is very large it may take a while, since we
also build a string for the gossiper state. Make the loop preemptable,
and also make the string construction more efficient by preallocating
memory for it.
Message-Id: <20190516132748.6469-3-gleb@scylladb.com>

(cherry picked from commit 31bf4cfb5e)
2019-05-17 12:38:34 +02:00
Gleb Natapov
0d5c2501b3 cache_hitrate_calculator: do not copy stats map for each cpu
invoke_on_all() copies the provided function for each shard it is
executed on, so by moving the stats map into the capture we copy it for
each shard too. Avoid this by putting it into the top-level object,
which is already captured by reference.
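A minimal sketch of the pitfall, with a toy stand-in for invoke_on_all (the real one is Seastar's sharded invocation): a callable copied once per shard duplicates any by-value capture per shard, while a by-reference capture does not.

```cpp
#include <cassert>

// Hypothetical sketch: count how often a "big map" gets copied.
struct counted_map {
    static int copies;
    counted_map() = default;
    counted_map(const counted_map&) { ++copies; }
};
int counted_map::copies = 0;

// Toy stand-in: copies its callable once per shard, like invoke_on_all().
template <typename Func>
void invoke_on_all(int shards, Func f) {
    for (int s = 0; s < shards; ++s) {
        Func per_shard = f;  // one copy of the callable per shard
        per_shard(s);
    }
}
```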
Message-Id: <20190516132748.6469-2-gleb@scylladb.com>

(cherry picked from commit 4517c56a57)
2019-05-17 12:38:30 +02:00
Asias He
0dd84898ee repair: Fix use after free in remove_repair_meta for repair_metas
We should capture repair_metas so that it will not be freed until the
parallel_for_each is finished.

Fixes: #4333
Tests: repair_additional_test.py:RepairAdditionalTest.repair_kill_1_test
Message-Id: <237b20a359122a639330f9f78c67568410aef014.1557922403.git.asias@scylladb.com>
(cherry picked from commit 51c4f8cc47)
2019-05-16 11:12:09 +03:00
Avi Kivity
d568270d7f Merge "gc_clock: Fix hashing to be backwards-compatible" from Tomasz
"
Commit d0f9e00 changed the representation of the gc_clock::duration
from int32_t to int64_t.

Mutation hashing uses appending_hash<gc_clock::time_point>, which by
default feeds duration::count() into the hasher. duration::rep changed
from int32_t to int64_t, which changes the value of the hash.

This affects schema digest and query digests, resulting in mismatches
between nodes during a rolling upgrade.

Fixes #4460.
Refs #4485.
"

* tag 'fix-gc_clock-digest-v2.1' of github.com:tgrabiec/scylla:
  tests: Add test which verifies that schema digest stays the same
  tests: Add sstables for the schema digest test
  schema_tables, storage_service: Make schema digest insensitive to expired tombstones in empty partition
  db/schema_tables: Move feed_hash_for_schema_digest() to .cc file
  hashing: Introduce type-erased interface for the hasher
  hashing: Introduce C++ concept for the hasher
  hashers: Rename hasher to cryptopp_hasher
  gc_clock: Fix hashing to be backwards-compatible

(cherry picked from commit 82b91c1511)
2019-05-15 09:48:05 +03:00
Takuya ASADA
78c57f18c4 dist/ami: fix wrong path of SCYLLA-PRODUCT-FILE
The other build_*.sh scripts run inside an extracted relocatable
package, which has SCYLLA-PRODUCT-FILE at the top of the directory.
build_ami.sh does not run under that condition, so we need to run
SCYLLA-VERSION-GEN first and then refer to build/SCYLLA-PRODUCT-FILE.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20190509110621.27468-1-syuu@scylladb.com>
(cherry picked from commit 19a973cd05)
2019-05-13 16:45:25 +03:00
Glauber Costa
ce27949797 Support AWS i3en instances
AWS just released their new instances, the i3en instances. The instance
family is already verified to work well with scylla; the only adjustments
needed are to advertise that we support it and to pre-fill the disk
information according to the performance numbers obtained by running the
instances.

Fixes #4486
Branches: 3.1

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190508170831.6003-1-glauber@scylladb.com>
(cherry picked from commit a23531ebd5)
2019-05-13 16:45:25 +03:00
Hagit Segev
6b47e23d29 release: prepare for 3.1.0.rc0 2019-05-13 15:03:34 +03:00
Piotr Sarna
1cb6cc0ac4 Revert "view: cache is_index for view pointer"
This reverts commit dbe8491655.
Caching the value was not done correctly, which resulted
in longevity test failures.

Fixes #4478

Branches: 3.1

Message-Id: <762ca9db618ca2ed7702372fbafe8ecd193dcf4d.1557129652.git.sarna@scylladb.com>
(cherry picked from commit cf8d2a5141)
2019-05-08 11:14:11 +03:00
Benny Halevy
67435eff15 time_window_backlog_tracker: fix use after free
Fixes #4465

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190430094209.13958-1-bhalevy@scylladb.com>
(cherry picked from commit 3a2fa82d6e)
2019-05-06 09:38:08 +03:00
Gleb Natapov
086ce13fb9 batchlog_manager: fix array out of bound access
The endpoint_filter() function assumes that each bucket of
std::unordered_multimap contains only elements with the same key, so
the bucket size can be used to count the elements with a particular
key. But this is not the case: elements with different keys may
share a bucket. Fix it by counting the keys another way.

Fixes #3229

Message-Id: <20190501133127.GE21208@scylladb.com>
(cherry picked from commit 95c6d19f6c)
2019-05-03 11:59:09 +03:00
Glauber Costa
eb9a8f4442 scylla_setup: respect user's decision not to call housekeeping
The setup script asks the user whether or not housekeeping should
be called, and the first time the script is executed this decision
is respected.

However, if the script is invoked again, that decision is ignored.

This is because the check has the form:

 if (housekeeping_cfg_file_exists) {
    version_check = ask_user();
 }
 if (version_check) { do_version_check() } else { dont_do_it() }

When it should have the form:

 if (housekeeping_cfg_file_exists) {
    version_check = ask_user();
    if (version_check) { do_version_check() } else { dont_do_it() }
 }

(Thanks python)

This is problematic on systems that are not connected to the internet, since
housekeeping will fail to run and crash the setup script.

Fixes #4462

Branches: master, branch-3.1
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190502034211.18435-1-glauber@scylladb.com>
(cherry picked from commit 47d04e49e8)
2019-05-03 09:57:31 +03:00
Glauber Costa
178fb5fe5f make scylla_util OS detection robust against empty lines
Newer versions of RHEL ship the os-release file with empty lines at the
end, which our script was not prepared to handle. As such, scylla_setup
would fail.

This patch makes our OS detection robust against that.

Fixes #4473

Branches: master, branch-3.1
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <20190502152224.31307-1-glauber@scylladb.com>
(cherry picked from commit 99c00547ad)
2019-05-03 09:57:21 +03:00
276 changed files with 2337 additions and 324 deletions

2
.gitmodules vendored

@@ -1,6 +1,6 @@
[submodule "seastar"]
path = seastar
url = ../seastar
url = ../scylla-seastar
ignore = dirty
[submodule "swagger-ui"]
path = swagger-ui


@@ -1,7 +1,7 @@
#!/bin/sh
PRODUCT=scylla
VERSION=666.development
VERSION=3.1.0.rc3
if test -f version
then


@@ -22,6 +22,7 @@
#pragma once
#include <seastar/json/json_elements.hh>
#include <type_traits>
#include <boost/lexical_cast.hpp>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>
@@ -231,7 +232,22 @@ public:
return;
}
try {
value = T{boost::lexical_cast<Base>(param)};
// boost::lexical_cast does not use boolalpha. Converting a
// true/false throws exceptions. We don't want that.
if constexpr (std::is_same_v<Base, bool>) {
// Cannot use boolalpha because we (probably) want to
// accept 1 and 0 as well as true and false. And True. And fAlse.
std::transform(param.begin(), param.end(), param.begin(), ::tolower);
if (param == "true" || param == "1") {
value = T(true);
} else if (param == "false" || param == "0") {
value = T(false);
} else {
throw boost::bad_lexical_cast{};
}
} else {
value = T{boost::lexical_cast<Base>(param)};
}
} catch (boost::bad_lexical_cast&) {
throw bad_param_exception(format("{} ({}): type error - should be {}", name, param, boost::units::detail::demangle(typeid(Base).name())));
}


@@ -170,7 +170,9 @@ future<> service::start() {
return once_among_shards([this] {
return create_keyspace_if_missing();
}).then([this] {
return when_all_succeed(_role_manager->start(), _authorizer->start(), _authenticator->start());
return _role_manager->start().then([this] {
return when_all_succeed(_authorizer->start(), _authenticator->start());
});
}).then([this] {
_permissions_cache = std::make_unique<permissions_cache>(_permissions_cache_config, *this, log);
}).then([this] {


@@ -222,11 +222,9 @@ statement_restrictions::statement_restrictions(database& db,
auto& cf = db.find_column_family(schema);
auto& sim = cf.get_index_manager();
const allow_local_index allow_local(!_partition_key_restrictions->has_unrestricted_components(*_schema) && _partition_key_restrictions->is_all_eq());
bool has_queriable_clustering_column_index = _clustering_columns_restrictions->has_supporting_index(sim, allow_local);
bool has_queriable_pk_index = _partition_key_restrictions->has_supporting_index(sim, allow_local);
bool has_queriable_index = has_queriable_clustering_column_index
|| has_queriable_pk_index
|| _nonprimary_key_restrictions->has_supporting_index(sim, allow_local);
const bool has_queriable_clustering_column_index = _clustering_columns_restrictions->has_supporting_index(sim, allow_local);
const bool has_queriable_pk_index = _partition_key_restrictions->has_supporting_index(sim, allow_local);
const bool has_queriable_regular_index = _nonprimary_key_restrictions->has_supporting_index(sim, allow_local);
// At this point, the select statement is fully constructed, but we still have a few things to validate
process_partition_key_restrictions(has_queriable_pk_index, for_view, allow_filtering);
@@ -286,7 +284,7 @@ statement_restrictions::statement_restrictions(database& db,
}
if (!_nonprimary_key_restrictions->empty()) {
if (has_queriable_index) {
if (has_queriable_regular_index) {
_uses_secondary_indexing = true;
} else if (!allow_filtering) {
throw exceptions::invalid_request_exception("Cannot execute this query as it might involve data filtering and "
@@ -392,8 +390,9 @@ std::vector<const column_definition*> statement_restrictions::get_column_defs_fo
}
}
}
if (_clustering_columns_restrictions->needs_filtering(*_schema)) {
column_id first_filtering_id = _schema->clustering_key_columns().begin()->id +
const bool pk_has_unrestricted_components = _partition_key_restrictions->has_unrestricted_components(*_schema);
if (pk_has_unrestricted_components || _clustering_columns_restrictions->needs_filtering(*_schema)) {
column_id first_filtering_id = pk_has_unrestricted_components ? 0 : _schema->clustering_key_columns().begin()->id +
_clustering_columns_restrictions->num_prefix_columns_that_need_not_be_filtered();
for (auto&& cdef : _clustering_columns_restrictions->get_column_defs()) {
if (cdef->id >= first_filtering_id && !column_uses_indexing(cdef)) {
@@ -507,10 +506,9 @@ bool statement_restrictions::need_filtering() const {
int number_of_filtering_restrictions = _nonprimary_key_restrictions->size();
// If the whole partition key is restricted, it does not imply filtering
if (_partition_key_restrictions->has_unrestricted_components(*_schema) || !_partition_key_restrictions->is_all_eq()) {
number_of_filtering_restrictions += _partition_key_restrictions->size();
if (_clustering_columns_restrictions->has_unrestricted_components(*_schema)) {
number_of_filtering_restrictions += _clustering_columns_restrictions->size() - _clustering_columns_restrictions->prefix_size();
}
number_of_filtering_restrictions += _partition_key_restrictions->size() + _clustering_columns_restrictions->size();
} else if (_clustering_columns_restrictions->has_unrestricted_components(*_schema)) {
number_of_filtering_restrictions += _clustering_columns_restrictions->size() - _clustering_columns_restrictions->prefix_size();
}
return number_of_restricted_columns_for_indexing > 1
|| (number_of_restricted_columns_for_indexing == 0 && _partition_key_restrictions->empty() && !_clustering_columns_restrictions->empty())


@@ -407,7 +407,7 @@ public:
}
bool ck_restrictions_need_filtering() const {
return _clustering_columns_restrictions->needs_filtering(*_schema);
return _partition_key_restrictions->has_unrestricted_components(*_schema) || _clustering_columns_restrictions->needs_filtering(*_schema);
}
/**


@@ -83,6 +83,9 @@ void metadata::maybe_set_paging_state(::shared_ptr<const service::pager::paging_
assert(paging_state);
if (paging_state->get_remaining() > 0) {
set_paging_state(std::move(paging_state));
} else {
_flags.remove<flag::HAS_MORE_PAGES>();
_paging_state = nullptr;
}
}


@@ -76,7 +76,7 @@ public:
const bool _is_distinct;
const bool _allow_filtering;
const bool _is_json;
bool _bypass_cache;
bool _bypass_cache = false;
public:
parameters();
parameters(orderings_type orderings,


@@ -1929,7 +1929,7 @@ flat_mutation_reader make_multishard_streaming_reader(distributed<database>& db,
virtual flat_mutation_reader create_reader(
schema_ptr schema,
const dht::partition_range& range,
const query::partition_slice&,
const query::partition_slice& slice,
const io_priority_class& pc,
tracing::trace_state_ptr,
mutation_reader::forwarding fwd_mr) override {
@@ -1940,7 +1940,7 @@ flat_mutation_reader make_multishard_streaming_reader(distributed<database>& db,
_contexts[shard].read_operation = make_foreign(std::make_unique<utils::phased_barrier::operation>(cf.read_in_progress()));
_contexts[shard].semaphore = &cf.streaming_read_concurrency_semaphore();
return cf.make_streaming_reader(std::move(schema), *_contexts[shard].range, fwd_mr);
return cf.make_streaming_reader(std::move(schema), *_contexts[shard].range, slice, fwd_mr);
}
virtual void destroy_reader(shard_id shard, future<stopped_reader> reader_fut) noexcept override {
reader_fut.then([this, zis = shared_from_this(), shard] (stopped_reader&& reader) mutable {


@@ -458,6 +458,7 @@ private:
// This semaphore ensures that an operation like snapshot won't have its selected
// sstables deleted by compaction in parallel, a race condition which could
// easily result in failure.
// Locking order: must be acquired either independently or after _sstables_lock
seastar::semaphore _sstable_deletion_sem = {1};
// There are situations in which we need to stop writing sstables. Flushers will take
// the read lock, and the ones that wish to stop that process will take the write lock.
@@ -679,8 +680,13 @@ public:
// Single range overload.
flat_mutation_reader make_streaming_reader(schema_ptr schema, const dht::partition_range& range,
const query::partition_slice& slice,
mutation_reader::forwarding fwd_mr = mutation_reader::forwarding::no) const;
flat_mutation_reader make_streaming_reader(schema_ptr schema, const dht::partition_range& range) {
return make_streaming_reader(schema, range, schema->full_slice());
}
sstables::shared_sstable make_streaming_sstable_for_write(std::optional<sstring> subdir = {});
sstables::shared_sstable make_streaming_staging_sstable() {
return make_streaming_sstable_for_write("staging");
@@ -759,13 +765,7 @@ public:
// SSTable writes are now allowed again, and generation is updated to new_generation if != -1
// returns the amount of microseconds elapsed since we disabled writes.
std::chrono::steady_clock::duration enable_sstable_write(int64_t new_generation) {
if (new_generation != -1) {
update_sstables_known_generation(new_generation);
}
_sstables_lock.write_unlock();
return std::chrono::steady_clock::now() - _sstable_writes_disabled_at;
}
std::chrono::steady_clock::duration enable_sstable_write(int64_t new_generation);
// Make sure the generation numbers are sequential, starting from "start".
// Generations before "start" are left untouched.


@@ -396,10 +396,8 @@ std::unordered_set<gms::inet_address> db::batchlog_manager::endpoint_filter(cons
// grab a random member of up to two racks
for (auto& rack : racks) {
auto rack_members = validated.bucket(rack);
auto n = validated.bucket_size(rack_members);
auto cpy = boost::copy_range<std::vector<gms::inet_address>>(validated.equal_range(rack) | boost::adaptors::map_values);
std::uniform_int_distribution<size_t> rdist(0, n - 1);
std::uniform_int_distribution<size_t> rdist(0, cpy.size() - 1);
result.emplace(cpy[rdist(_e1)]);
}


@@ -26,11 +26,17 @@
namespace db {
enum class schema_feature {
VIEW_VIRTUAL_COLUMNS
VIEW_VIRTUAL_COLUMNS,
// When set, the schema digest is calculated in a way such that it doesn't change after all
// tombstones in an empty partition expire.
// See https://github.com/scylladb/scylla/issues/4485
DIGEST_INSENSITIVE_TO_EXPIRY,
};
using schema_features = enum_set<super_enum<schema_feature,
schema_feature::VIEW_VIRTUAL_COLUMNS
schema_feature::VIEW_VIRTUAL_COLUMNS,
schema_feature::DIGEST_INSENSITIVE_TO_EXPIRY
>>;
}


@@ -587,9 +587,9 @@ future<utils::UUID> calculate_schema_digest(distributed<service::storage_proxy>&
return mutations;
});
};
auto reduce = [] (auto& hash, auto&& mutations) {
auto reduce = [features] (auto& hash, auto&& mutations) {
for (const mutation& m : mutations) {
feed_hash_for_schema_digest(hash, m);
feed_hash_for_schema_digest(hash, m, features);
}
};
return do_with(md5_hasher(), all_table_names(features), [features, map, reduce] (auto& hash, auto& tables) {
@@ -778,6 +778,13 @@ mutation compact_for_schema_digest(const mutation& m) {
return m_compacted;
}
void feed_hash_for_schema_digest(hasher& h, const mutation& m, schema_features features) {
auto compacted = compact_for_schema_digest(m);
if (!features.contains<schema_feature::DIGEST_INSENSITIVE_TO_EXPIRY>() || !compacted.partition().empty()) {
feed_hash(h, compact_for_schema_digest(m));
}
}
// Applies deletion of the "version" column to a system_schema.scylla_tables mutation.
static void delete_schema_version(mutation& m) {
if (m.column_family_id() != scylla_tables()->id()) {
@@ -2727,8 +2734,9 @@ namespace legacy {
table_schema_version schema_mutations::digest() const {
md5_hasher h;
db::schema_tables::feed_hash_for_schema_digest(h, _columnfamilies);
db::schema_tables::feed_hash_for_schema_digest(h, _columns);
const db::schema_features no_features;
db::schema_tables::feed_hash_for_schema_digest(h, _columnfamilies, no_features);
db::schema_tables::feed_hash_for_schema_digest(h, _columns, no_features);
return utils::UUID_gen::get_name_UUID(h.finalize());
}


@@ -215,10 +215,7 @@ index_metadata_kind deserialize_index_kind(sstring kind);
mutation compact_for_schema_digest(const mutation& m);
template<typename Hasher>
void feed_hash_for_schema_digest(Hasher& h, const mutation& m) {
feed_hash(h, compact_for_schema_digest(m));
}
void feed_hash_for_schema_digest(hasher&, const mutation&, schema_features);
} // namespace schema_tables
} // namespace db


@@ -143,10 +143,9 @@ void view_info::initialize_base_dependent_fields(const schema& base) {
}
bool view_info::is_index() const {
if (!_is_index) {
_is_index = service::get_local_storage_service().db().local().find_column_family(base_id()).get_index_manager().is_index(_schema);
}
return *_is_index;
//TODO(sarna): result of this call can be cached instead of calling index_manager::is_index every time
column_family& base_cf = service::get_local_storage_service().db().local().find_column_family(base_id());
return base_cf.get_index_manager().is_index(view_ptr(_schema.shared_from_this()));
}
namespace db {

45
dist/ami/build_ami.sh vendored

@@ -1,6 +1,7 @@
#!/bin/bash -e
PRODUCT=$(cat SCYLLA-PRODUCT-FILE)
./SCYLLA-VERSION-GEN
PRODUCT=$(cat build/SCYLLA-PRODUCT-FILE)
if [ ! -e dist/ami/build_ami.sh ]; then
echo "run build_ami.sh in top of scylla dir"
@@ -16,6 +17,7 @@ print_usage() {
exit 1
}
LOCALRPM=0
REPO_FOR_INSTALL=
while [ $# -gt 0 ]; do
case "$1" in
"--localrpm")
@@ -23,10 +25,12 @@ while [ $# -gt 0 ]; do
shift 1
;;
"--repo")
REPO_FOR_INSTALL=$2
INSTALL_ARGS="$INSTALL_ARGS --repo $2"
shift 2
;;
"--repo-for-install")
REPO_FOR_INSTALL=$2
INSTALL_ARGS="$INSTALL_ARGS --repo-for-install $2"
shift 2
;;
@@ -123,6 +127,43 @@ if [ $LOCALRPM -eq 1 ]; then
cd ../..
cp build/$PRODUCT-ami/build/RPMS/noarch/$PRODUCT-ami-`cat build/$PRODUCT-ami/build/SCYLLA-VERSION-FILE`-`cat build/$PRODUCT-ami/build/SCYLLA-RELEASE-FILE`.*.noarch.rpm dist/ami/files/$PRODUCT-ami.noarch.rpm
fi
if [ ! -f dist/ami/files/$PRODUCT-python3.x86_64.rpm ]; then
reloc/python3/build_reloc.sh
reloc/python3/build_rpm.sh
cp build/redhat/RPMS/x86_64/$PRODUCT-python3*.x86_64.rpm dist/ami/files/$PRODUCT-python3.x86_64.rpm
fi
SCYLLA_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} dist/ami/files/$PRODUCT.x86_64.rpm || true)
SCYLLA_AMI_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} dist/ami/files/$PRODUCT-ami.noarch.rpm || true)
SCYLLA_JMX_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} dist/ami/files/$PRODUCT-jmx.noarch.rpm || true)
SCYLLA_TOOLS_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} dist/ami/files/$PRODUCT-tools.noarch.rpm || true)
SCYLLA_PYTHON3_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} dist/ami/files/$PRODUCT-python3.x86_64.rpm || true)
else
if [ -z "$REPO_FOR_INSTALL" ]; then
print_usage
exit 1
fi
if [ ! -f /usr/bin/yumdownloader ]; then
if is_redhat_variant; then
sudo yum install /usr/bin/yumdownloader
else
sudo apt-get install yum-utils
fi
fi
if [ ! -f /usr/bin/curl ]; then
pkg_install curl
fi
TMPREPO=$(mktemp -u -p /etc/yum.repos.d/ --suffix .repo)
sudo curl -o $TMPREPO $REPO_FOR_INSTALL
rm -rf build/ami_packages
mkdir -p build/ami_packages
yumdownloader --downloaddir build/ami_packages/ $PRODUCT $PRODUCT-kernel-conf $PRODUCT-conf $PRODUCT-server $PRODUCT-debuginfo $PRODUCT-ami $PRODUCT-jmx $PRODUCT-tools-core $PRODUCT-tools $PRODUCT-python3
sudo rm -f $TMPREPO
SCYLLA_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} build/ami_packages/$PRODUCT-[0-9]*.rpm || true)
SCYLLA_AMI_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} build/ami_packages/$PRODUCT-ami-*.rpm || true)
SCYLLA_JMX_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} build/ami_packages/$PRODUCT-jmx-*.rpm || true)
SCYLLA_TOOLS_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} build/ami_packages/$PRODUCT-tools-[0-9]*.rpm || true)
SCYLLA_PYTHON3_VERSION=$(rpm -q --qf %{VERSION}-%{RELEASE} build/ami_packages/$PRODUCT-python3-*.rpm || true)
fi
cd dist/ami
@@ -147,4 +188,4 @@ if [ ! -d packer ]; then
cd -
fi
env PACKER_LOG=1 PACKER_LOG_PATH=../../build/ami.log packer/packer build -var-file=variables.json -var install_args="$INSTALL_ARGS" -var region="$REGION" -var source_ami="$AMI" -var ssh_username="$SSH_USERNAME" scylla.json
env PACKER_LOG=1 PACKER_LOG_PATH=../../build/ami.log packer/packer build -var-file=variables.json -var install_args="$INSTALL_ARGS" -var region="$REGION" -var source_ami="$AMI" -var ssh_username="$SSH_USERNAME" -var scylla_version="$SCYLLA_VERSION" -var scylla_ami_version="$SCYLLA_AMI_VERSION" -var scylla_jmx_version="$SCYLLA_JMX_VERSION" -var scylla_tools_version="$SCYLLA_TOOLS_VERSION" -var scylla_python3_version="$SCYLLA_PYTHON3_VERSION" scylla.json

10
dist/ami/scylla.json vendored

@@ -56,7 +56,15 @@
"ssh_username": "{{user `ssh_username`}}",
"subnet_id": "{{user `subnet_id`}}",
"type": "amazon-ebs",
"user_data_file": "user_data.txt"
"user_data_file": "user_data.txt",
"ami_description": "scylla-{{user `scylla_version`}} scylla-ami-{{user `scylla_ami_version`}} scylla-jmx-{{user `scylla_jmx_version`}} scylla-tools-{{user `scylla_tools_version`}} scylla-python3-{{user `scylla_python3_version`}}",
"tags": {
"ScyllaVersion": "{{user `scylla_version`}}",
"ScyllaAMIVersion": "{{user `scylla_ami_version`}}",
"ScyllaJMXVersion": "{{user `scylla_jmx_version`}}",
"ScyllaToolsVersion": "{{user `scylla_tools_version`}}",
"ScyllaPython3Version": "{{user `scylla_python3_version`}}"
}
}
],
"provisioners": [


@@ -60,6 +60,17 @@ if __name__ == "__main__":
disk_properties["read_bandwidth"] = 2015342735 * nr_disks
disk_properties["write_iops"] = 181500 * nr_disks
disk_properties["write_bandwidth"] = 808775652 * nr_disks
elif idata.instance_class() == "i3en":
if idata.instance() in ("i3en.large", "i3.xlarge", "i3en.2xlarge"):
disk_properties["read_iops"] = 46489
disk_properties["read_bandwidth"] = 353437280
disk_properties["write_iops"] = 36680
disk_properties["write_bandwidth"] = 164766656
else:
disk_properties["read_iops"] = 278478 * nr_disks
disk_properties["read_bandwidth"] = 3029172992 * nr_disks
disk_properties["write_iops"] = 221909 * nr_disks
disk_properties["write_bandwidth"] = 1020482432 * nr_disks
elif idata.instance_class() == "i2":
disk_properties["read_iops"] = 64000 * nr_disks
disk_properties["read_bandwidth"] = 507338935 * nr_disks


@@ -252,22 +252,22 @@ if __name__ == '__main__':
if not os.path.exists('/etc/scylla.d/housekeeping.cfg'):
version_check = interactive_ask_service('Do you want to enable Scylla to check if there is a newer version of Scylla available?', 'Yes - start the Scylla-housekeeping service to check for a newer version. This check runs periodically. No - skips this step.', version_check)
args.no_version_check = not version_check
if version_check:
with open('/etc/scylla.d/housekeeping.cfg', 'w') as f:
f.write('[housekeeping]\ncheck-version: True\n')
if is_systemd():
systemd_unit('scylla-housekeeping-daily.timer').unmask()
systemd_unit('scylla-housekeeping-restart.timer').unmask()
else:
with open('/etc/scylla.d/housekeeping.cfg', 'w') as f:
f.write('[housekeeping]\ncheck-version: False\n')
if is_systemd():
hk_daily = systemd_unit('scylla-housekeeping-daily.timer')
hk_daily.mask()
hk_daily.stop()
hk_restart = systemd_unit('scylla-housekeeping-restart.timer')
hk_restart.mask()
hk_restart.stop()
if version_check:
with open('/etc/scylla.d/housekeeping.cfg', 'w') as f:
f.write('[housekeeping]\ncheck-version: True\n')
if is_systemd():
systemd_unit('scylla-housekeeping-daily.timer').unmask()
systemd_unit('scylla-housekeeping-restart.timer').unmask()
else:
with open('/etc/scylla.d/housekeeping.cfg', 'w') as f:
f.write('[housekeeping]\ncheck-version: False\n')
if is_systemd():
hk_daily = systemd_unit('scylla-housekeeping-daily.timer')
hk_daily.mask()
hk_daily.stop()
hk_restart = systemd_unit('scylla-housekeeping-restart.timer')
hk_restart.mask()
hk_restart.stop()
cur_version=out('scylla --version', exception=False)
if len(cur_version) > 0:


@@ -119,7 +119,7 @@ class aws_instance:
return self._type.split(".")[0]
def is_supported_instance_class(self):
if self.instance_class() in ['i2', 'i3']:
if self.instance_class() in ['i2', 'i3', 'i3en']:
return True
return False
@@ -128,7 +128,7 @@ class aws_instance:
instance_size = self.instance_size()
if instance_class in ['c3', 'c4', 'd2', 'i2', 'r3']:
return 'ixgbevf'
if instance_class in ['c5', 'c5d', 'f1', 'g3', 'h1', 'i3', 'm5', 'm5d', 'p2', 'p3', 'r4', 'x1']:
if instance_class in ['c5', 'c5d', 'f1', 'g3', 'h1', 'i3', 'i3en', 'm5', 'm5d', 'p2', 'p3', 'r4', 'x1']:
return 'ena'
if instance_class == 'm4':
if instance_size == '16xlarge':
@@ -304,7 +304,7 @@ def parse_os_release_line(line):
val = shlex.split(data)[0]
return (id, val.split(' ') if id == 'ID' or id == 'ID_LIKE' else val)
os_release = dict([parse_os_release_line(x) for x in open('/etc/os-release').read().splitlines()])
os_release = dict([parse_os_release_line(x) for x in open('/etc/os-release').read().splitlines() if re.match(r'\w+=', x) ])
def is_debian_variant():
d = os_release['ID_LIKE'] if 'ID_LIKE' in os_release else os_release['ID']


@@ -16,7 +16,7 @@ Conflicts: {{product}}-server (<< 1.1)
Package: {{product}}-server
Architecture: amd64
Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, hwloc-nox, {{product}}-conf, python-yaml, python-urwid, python-requests, curl, util-linux, python3-yaml, python3, uuid-runtime, pciutils, python3-pyudev, gzip, realpath | coreutils, num-utils, file
Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, hwloc-nox, {{product}}-conf, {{product}}-python3, curl, util-linux, uuid-runtime, pciutils, gzip, realpath | coreutils, num-utils, file
Description: Scylla database server binaries
Scylla is a highly scalable, eventually consistent, distributed,
partitioned row DB.

140
dist/debian/python3/build_deb.sh vendored Executable file

@@ -0,0 +1,140 @@
#!/bin/bash -e
PRODUCT=$(cat SCYLLA-PRODUCT-FILE)
. /etc/os-release
print_usage() {
echo "build_deb.sh --reloc-pkg build/release/scylla-python3-package.tar.gz"
echo " --reloc-pkg specify relocatable package path"
exit 1
}
TARGET=stable
RELOC_PKG=
while [ $# -gt 0 ]; do
case "$1" in
"--reloc-pkg")
RELOC_PKG=$2
shift 2
;;
*)
print_usage
;;
esac
done
is_redhat_variant() {
[ -f /etc/redhat-release ]
}
is_debian_variant() {
[ -f /etc/debian_version ]
}
pkg_install() {
if is_redhat_variant; then
sudo yum install -y $1
elif is_debian_variant; then
sudo apt-get install -y $1
else
echo "Requires to install following command: $1"
exit 1
fi
}
if [ ! -e SCYLLA-RELOCATABLE-FILE ]; then
echo "do not directly execute build_deb.sh, use reloc/build_deb.sh instead."
exit 1
fi
if [ "$(arch)" != "x86_64" ]; then
echo "Unsupported architecture: $(arch)"
exit 1
fi
if [ -z "$RELOC_PKG" ]; then
print_usage
exit 1
fi
if [ ! -f "$RELOC_PKG" ]; then
echo "$RELOC_PKG is not found."
exit 1
fi
if [ -e debian ]; then
rm -rf debian
fi
if is_debian_variant; then
sudo apt-get -y update
fi
# this hack is needed since some environments install the 'git-core' package;
# it's a subset of the git command and doesn't work for our git-archive-all script.
if is_redhat_variant && [ ! -f /usr/libexec/git-core/git-submodule ]; then
sudo yum install -y git
fi
if [ ! -f /usr/bin/git ]; then
pkg_install git
fi
if [ ! -f /usr/bin/python ]; then
pkg_install python
fi
if [ ! -f /usr/bin/debuild ]; then
pkg_install devscripts
fi
if [ ! -f /usr/bin/dh_testdir ]; then
pkg_install debhelper
fi
if [ ! -f /usr/bin/fakeroot ]; then
pkg_install fakeroot
fi
if [ ! -f /usr/bin/pystache ]; then
if is_redhat_variant; then
sudo yum install -y /usr/bin/pystache
elif is_debian_variant; then
sudo apt-get install -y python-pystache
fi
fi
if [ ! -f /usr/bin/file ]; then
pkg_install file
fi
if is_debian_variant && [ ! -f /usr/share/doc/python-pkg-resources/copyright ]; then
sudo apt-get install -y python-pkg-resources
fi
if [ "$ID" = "ubuntu" ] && [ ! -f /usr/share/keyrings/debian-archive-keyring.gpg ]; then
sudo apt-get install -y debian-archive-keyring
fi
if [ "$ID" = "debian" ] && [ ! -f /usr/share/keyrings/ubuntu-archive-keyring.gpg ]; then
sudo apt-get install -y ubuntu-archive-keyring
fi
if [ -z "$TARGET" ]; then
if is_debian_variant; then
if [ ! -f /usr/bin/lsb_release ]; then
pkg_install lsb-release
fi
TARGET=`lsb_release -c|awk '{print $2}'`
else
echo "Please specify target"
exit 1
fi
fi
RELOC_PKG_FULLPATH=$(readlink -f $RELOC_PKG)
RELOC_PKG_BASENAME=$(basename $RELOC_PKG)
SCYLLA_VERSION=$(cat SCYLLA-VERSION-FILE)
SCYLLA_RELEASE=$(cat SCYLLA-RELEASE-FILE)
ln -fv $RELOC_PKG_FULLPATH ../$PRODUCT-python3_$SCYLLA_VERSION-$SCYLLA_RELEASE.orig.tar.gz
cp -al dist/debian/python3/debian debian
if [ "$PRODUCT" != "scylla" ]; then
for i in debian/scylla-*;do
mv $i ${i/scylla-/$PRODUCT-}
done
fi
REVISION="1"
MUSTACHE_DIST="\"debian\": true, \"product\": \"$PRODUCT\", \"$PRODUCT\": true"
pystache dist/debian/python3/changelog.mustache "{ $MUSTACHE_DIST, \"version\": \"$SCYLLA_VERSION\", \"release\": \"$SCYLLA_RELEASE\", \"revision\": \"$REVISION\", \"codename\": \"$TARGET\" }" > debian/changelog
pystache dist/debian/python3/rules.mustache "{ $MUSTACHE_DIST }" > debian/rules
pystache dist/debian/python3/control.mustache "{ $MUSTACHE_DIST }" > debian/control
chmod a+rx debian/rules
debuild -rfakeroot -us -uc


@@ -0,0 +1,5 @@
{{product}}-python3 ({{version}}-{{release}}-{{revision}}) {{codename}}; urgency=medium
* Initial release.
-- Takuya ASADA <syuu@scylladb.com> Mon, 24 Aug 2015 09:22:55 +0000

16
dist/debian/python3/control.mustache vendored Normal file

@@ -0,0 +1,16 @@
Source: {{product}}-python3
Maintainer: Takuya ASADA <syuu@scylladb.com>
Homepage: http://scylladb.com
Section: python
Priority: optional
X-Python3-Version: >= 3.4
Standards-Version: 3.9.5
Package: {{product}}-python3
Architecture: amd64
Description: A standalone python3 interpreter that can be moved around different Linux machines
This is a self-contained python interpreter that can be moved around
different Linux machines as long as they run a new enough kernel (where
new enough is defined by whichever Python module uses any kernel
functionality). All shared libraries needed for the interpreter to
operate are shipped with it.

1
dist/debian/python3/debian/compat vendored Normal file

@@ -0,0 +1 @@
9

995
dist/debian/python3/debian/copyright vendored Normal file

@@ -0,0 +1,995 @@
This package was put together by Klee Dienes <klee@debian.org> from
sources from ftp.python.org:/pub/python, based on the Debianization by
the previous maintainers Bernd S. Brentrup <bsb@uni-muenster.de> and
Bruce Perens. Current maintainer is Matthias Klose <doko@debian.org>.
It was downloaded from http://python.org/
Copyright:
Upstream Author: Guido van Rossum <guido@cwi.nl> and others.
License:
The following text includes the Python license and licenses and
acknowledgements for incorporated software. The licenses can be read
in the HTML and texinfo versions of the documentation as well, after
installing the pythonx.y-doc package. Licenses for files not licensed
under the Python Licenses are found at the end of this file.
Python License
==============
A. HISTORY OF THE SOFTWARE
==========================
Python was created in the early 1990s by Guido van Rossum at Stichting
Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands
as a successor of a language called ABC. Guido remains Python's
principal author, although it includes many contributions from others.
In 1995, Guido continued his work on Python at the Corporation for
National Research Initiatives (CNRI, see http://www.cnri.reston.va.us)
in Reston, Virginia where he released several versions of the
software.
In May 2000, Guido and the Python core development team moved to
BeOpen.com to form the BeOpen PythonLabs team. In October of the same
year, the PythonLabs team moved to Digital Creations (now Zope
Corporation, see http://www.zope.com). In 2001, the Python Software
Foundation (PSF, see http://www.python.org/psf/) was formed, a
non-profit organization created specifically to own Python-related
Intellectual Property. Zope Corporation is a sponsoring member of
the PSF.
All Python releases are Open Source (see http://www.opensource.org for
the Open Source Definition). Historically, most, but not all, Python
releases have also been GPL-compatible; the table below summarizes
the various releases.
    Release         Derived     Year        Owner       GPL-
                    from                                compatible? (1)

    0.9.0 thru 1.2              1991-1995   CWI         yes
    1.3 thru 1.5.2  1.2         1995-1999   CNRI        yes
    1.6             1.5.2       2000        CNRI        no
    2.0             1.6         2000        BeOpen.com  no
    1.6.1           1.6         2001        CNRI        yes (2)
    2.1             2.0+1.6.1   2001        PSF         no
    2.0.1           2.0+1.6.1   2001        PSF         yes
    2.1.1           2.1+2.0.1   2001        PSF         yes
    2.2             2.1.1       2001        PSF         yes
    2.1.2           2.1.1       2002        PSF         yes
    2.1.3           2.1.2       2002        PSF         yes
    2.2 and above   2.1.1       2001-now    PSF         yes
Footnotes:
(1) GPL-compatible doesn't mean that we're distributing Python under
the GPL. All Python licenses, unlike the GPL, let you distribute
a modified version without making your changes open source. The
GPL-compatible licenses make it possible to combine Python with
other software that is released under the GPL; the others don't.
(2) According to Richard Stallman, 1.6.1 is not GPL-compatible,
because its license has a choice of law clause. According to
CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1
is "not incompatible" with the GPL.
Thanks to the many outside volunteers who have worked under Guido's
direction to make these releases possible.
B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
===============================================================
PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
--------------------------------------------
1. This LICENSE AGREEMENT is between the Python Software Foundation
("PSF"), and the Individual or Organization ("Licensee") accessing and
otherwise using this software ("Python") in source or binary form and
its associated documentation.
2. Subject to the terms and conditions of this License Agreement, PSF
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python alone
or in any derivative version, provided, however, that PSF's License
Agreement and PSF's notice of copyright, i.e., "Copyright (c) 2001,
2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012,
2013, 2014 Python Software Foundation; All Rights Reserved" are
retained in Python alone or in any derivative version prepared by
Licensee.
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python.
4. PSF is making Python available to Licensee on an "AS IS"
basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. Nothing in this License Agreement shall be deemed to create any
relationship of agency, partnership, or joint venture between PSF and
Licensee. This License Agreement does not grant permission to use PSF
trademarks or trade name in a trademark sense to endorse or promote
products or services of Licensee, or any third party.
8. By copying, installing or otherwise using Python, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.
BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0
-------------------------------------------
BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
Individual or Organization ("Licensee") accessing and otherwise using
this software in source or binary form and its associated
documentation ("the Software").
2. Subject to the terms and conditions of this BeOpen Python License
Agreement, BeOpen hereby grants Licensee a non-exclusive,
royalty-free, world-wide license to reproduce, analyze, test, perform
and/or display publicly, prepare derivative works, distribute, and
otherwise use the Software alone or in any derivative version,
provided, however, that the BeOpen Python License is retained in the
Software, alone or in any derivative version prepared by Licensee.
3. BeOpen is making the Software available to Licensee on an "AS IS"
basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
5. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
6. This License Agreement shall be governed by and interpreted in all
respects by the law of the State of California, excluding conflict of
law provisions. Nothing in this License Agreement shall be deemed to
create any relationship of agency, partnership, or joint venture
between BeOpen and Licensee. This License Agreement does not grant
permission to use BeOpen trademarks or trade names in a trademark
sense to endorse or promote products or services of Licensee, or any
third party. As an exception, the "BeOpen Python" logos available at
http://www.pythonlabs.com/logos.html may be used according to the
permissions granted on that web page.
7. By copying, installing or otherwise using the software, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.
CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1
---------------------------------------
1. This LICENSE AGREEMENT is between the Corporation for National
Research Initiatives, having an office at 1895 Preston White Drive,
Reston, VA 20191 ("CNRI"), and the Individual or Organization
("Licensee") accessing and otherwise using Python 1.6.1 software in
source or binary form and its associated documentation.
2. Subject to the terms and conditions of this License Agreement, CNRI
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python 1.6.1
alone or in any derivative version, provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2001 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6.1 alone or in any derivative
version prepared by Licensee. Alternately, in lieu of CNRI's License
Agreement, Licensee may substitute the following text (omitting the
quotes): "Python 1.6.1 is made available subject to the terms and
conditions in CNRI's License Agreement. This Agreement together with
Python 1.6.1 may be located on the Internet using the following
unique, persistent identifier (known as a handle): 1895.22/1013. This
Agreement may also be obtained from a proxy server on the Internet
using the following URL: http://hdl.handle.net/1895.22/1013".
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python 1.6.1 or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python 1.6.1.
4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS"
basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. This License Agreement shall be governed by the federal
intellectual property law of the United States, including without
limitation the federal copyright law, and, to the extent such
U.S. federal law does not apply, by the law of the Commonwealth of
Virginia, excluding Virginia's conflict of law provisions.
Notwithstanding the foregoing, with regard to derivative works based
on Python 1.6.1 that incorporate non-separable material that was
previously distributed under the GNU General Public License (GPL), the
law of the Commonwealth of Virginia shall govern this License
Agreement only as to issues arising under or with respect to
Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this
License Agreement shall be deemed to create any relationship of
agency, partnership, or joint venture between CNRI and Licensee. This
License Agreement does not grant permission to use CNRI trademarks or
trade name in a trademark sense to endorse or promote products or
services of Licensee, or any third party.
8. By clicking on the "ACCEPT" button where indicated, or by copying,
installing or otherwise using Python 1.6.1, Licensee agrees to be
bound by the terms and conditions of this License Agreement.
ACCEPT
CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2
--------------------------------------------------
Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
The Netherlands. All rights reserved.
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Stichting Mathematisch
Centrum or CWI not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior
permission.
STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Licenses and Acknowledgements for Incorporated Software
=======================================================
Mersenne Twister
----------------
The `_random' module includes code based on a download from
`http://www.math.keio.ac.jp/~matumoto/MT2002/emt19937ar.html'. The
following are the verbatim comments from the original code:
A C-program for MT19937, with initialization improved 2002/1/26.
Coded by Takuji Nishimura and Makoto Matsumoto.
Before using, initialize the state by using init_genrand(seed)
or init_by_array(init_key, key_length).
Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. The names of its contributors may not be used to endorse or promote
products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Any feedback is very welcome.
http://www.math.keio.ac.jp/matumoto/emt.html
email: matumoto@math.keio.ac.jp
Sockets
-------
The `socket' module uses the functions, `getaddrinfo', and
`getnameinfo', which are coded in separate source files from the WIDE
Project, `http://www.wide.ad.jp/about/index.html'.
Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of the project nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
GAI_ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
FOR GAI_ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON GAI_ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN GAI_ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
OF THE POSSIBILITY OF SUCH DAMAGE.
Floating point exception control
--------------------------------
The source for the `fpectl' module includes the following notice:
---------------------------------------------------------------------
/ Copyright (c) 1996. \
| The Regents of the University of California. |
| All rights reserved. |
| |
| Permission to use, copy, modify, and distribute this software for |
| any purpose without fee is hereby granted, provided that this en- |
| tire notice is included in all copies of any software which is or |
| includes a copy or modification of this software and in all |
| copies of the supporting documentation for such software. |
| |
| This work was produced at the University of California, Lawrence |
| Livermore National Laboratory under contract no. W-7405-ENG-48 |
| between the U.S. Department of Energy and The Regents of the |
| University of California for the operation of UC LLNL. |
| |
| DISCLAIMER |
| |
| This software was prepared as an account of work sponsored by an |
| agency of the United States Government. Neither the United States |
| Government nor the University of California nor any of their em- |
| ployees, makes any warranty, express or implied, or assumes any |
| liability or responsibility for the accuracy, completeness, or |
| usefulness of any information, apparatus, product, or process |
| disclosed, or represents that its use would not infringe |
| privately-owned rights. Reference herein to any specific commer- |
| cial products, process, or service by trade name, trademark, |
| manufacturer, or otherwise, does not necessarily constitute or |
| imply its endorsement, recommendation, or favoring by the United |
| States Government or the University of California. The views and |
| opinions of authors expressed herein do not necessarily state or |
| reflect those of the United States Government or the University |
| of California, and shall not be used for advertising or product |
\ endorsement purposes. /
---------------------------------------------------------------------
Cookie management
-----------------
The `Cookie' module contains the following notice:
Copyright 2000 by Timothy O'Malley <timo@alum.mit.edu>
All Rights Reserved
Permission to use, copy, modify, and distribute this software
and its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that copyright notice and this permission
notice appear in supporting documentation, and that the name of
Timothy O'Malley not be used in advertising or publicity
pertaining to distribution of the software without specific, written
prior permission.
Timothy O'Malley DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS, IN NO EVENT SHALL Timothy O'Malley BE LIABLE FOR
ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
Execution tracing
-----------------
The `trace' module contains the following notice:
portions copyright 2001, Autonomous Zones Industries, Inc., all rights...
err... reserved and offered to the public under the terms of the
Python 2.2 license.
Author: Zooko O'Whielacronx
http://zooko.com/
mailto:zooko@zooko.com
Copyright 2000, Mojam Media, Inc., all rights reserved.
Author: Skip Montanaro
Copyright 1999, Bioreason, Inc., all rights reserved.
Author: Andrew Dalke
Copyright 1995-1997, Automatrix, Inc., all rights reserved.
Author: Skip Montanaro
Copyright 1991-1995, Stichting Mathematisch Centrum, all rights reserved.
Permission to use, copy, modify, and distribute this Python software and
its associated documentation for any purpose without fee is hereby
granted, provided that the above copyright notice appears in all copies,
and that both that copyright notice and this permission notice appear in
supporting documentation, and that the name of neither Automatrix,
Bioreason or Mojam Media be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.
UUencode and UUdecode functions
-------------------------------
The `uu' module contains the following notice:
Copyright 1994 by Lance Ellinghouse
Cathedral City, California Republic, United States of America.
All Rights Reserved
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Lance Ellinghouse
not be used in advertising or publicity pertaining to distribution
of the software without specific, written prior permission.
LANCE ELLINGHOUSE DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL LANCE ELLINGHOUSE CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Modified by Jack Jansen, CWI, July 1995:
- Use binascii module to do the actual line-by-line conversion
between ascii and binary. This results in a 1000-fold speedup. The C
version is still 5 times faster, though.
- Arguments more compliant with python standard
XML Remote Procedure Calls
--------------------------
The `xmlrpclib' module contains the following notice:
The XML-RPC client interface is
Copyright (c) 1999-2002 by Secret Labs AB
Copyright (c) 1999-2002 by Fredrik Lundh
By obtaining, using, and/or copying this software and/or its
associated documentation, you agree that you have read, understood,
and will comply with the following terms and conditions:
Permission to use, copy, modify, and distribute this software and
its associated documentation for any purpose and without fee is
hereby granted, provided that the above copyright notice appears in
all copies, and that both that copyright notice and this permission
notice appear in supporting documentation, and that the name of
Secret Labs AB or the author not be used in advertising or publicity
pertaining to distribution of the software without specific, written
prior permission.
SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD
TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANT-
ABILITY AND FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR
BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
OF THIS SOFTWARE.
Licenses for Software linked to
===============================
Note that the choice of GPL compatibility outlined above doesn't extend
to modules linked to particular libraries, since they change the
effective License of the module binary.
GNU Readline
------------
The 'readline' module makes use of GNU Readline.
The GNU Readline Library is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2, or (at
your option) any later version.
On Debian systems, you can find the complete statement in
`/usr/share/doc/readline-common/copyright'. A copy of the GNU General
Public License is available in `/usr/share/common-licenses/GPL-2'.
OpenSSL
-------
The '_ssl' module makes use of OpenSSL.
The OpenSSL toolkit stays under a dual license, i.e. both the
conditions of the OpenSSL License and the original SSLeay license
apply to the toolkit. Actually both licenses are BSD-style Open
Source licenses. Note that both licenses are incompatible with
the GPL.
On Debian systems, you can find the complete license text in
`/usr/share/doc/openssl/copyright'.
Files with other licenses than the Python License
-------------------------------------------------
Files: Include/dynamic_annotations.h
Files: Python/dynamic_annotations.c
Copyright: (c) 2008-2009, Google Inc.
License: Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Neither the name of Google Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Files: Include/unicodeobject.h
Copyright: (c) Corporation for National Research Initiatives.
Copyright: (c) 1999 by Secret Labs AB.
Copyright: (c) 1999 by Fredrik Lundh.
License: By obtaining, using, and/or copying this software and/or its
associated documentation, you agree that you have read, understood,
and will comply with the following terms and conditions:
Permission to use, copy, modify, and distribute this software and its
associated documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appears in all
copies, and that both that copyright notice and this permission notice
appear in supporting documentation, and that the name of Secret Labs
AB or the author not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior
permission.
SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: Lib/logging/*
Copyright: 2001-2010 by Vinay Sajip. All Rights Reserved.
License: Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Vinay Sajip
not be used in advertising or publicity pertaining to distribution
of the software without specific, written prior permission.
VINAY SAJIP DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL
VINAY SAJIP BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: Lib/multiprocessing/*
Files: Modules/_multiprocessing/*
Copyright: (c) 2006-2008, R Oudkerk. All rights reserved.
License: Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of author nor the names of any contributors may be
used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
Files: Lib/sqlite3/*
Files: Modules/_sqlite/*
Copyright: (C) 2004-2005 Gerhard Häring <gh@ghaering.de>
License: This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
Files: Lib/async*
Copyright: Copyright 1996 by Sam Rushing
License: Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that copyright notice and this permission
notice appear in supporting documentation, and that the name of Sam
Rushing not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior
permission.
SAM RUSHING DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN
NO EVENT SHALL SAM RUSHING BE LIABLE FOR ANY SPECIAL, INDIRECT OR
CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: Lib/tarfile.py
Copyright: (C) 2002 Lars Gustaebel <lars@gustaebel.de>
License: Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
Files: Lib/turtle.py
Copyright: (C) 2006 - 2010 Gregor Lingl
License: This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
is copyright Gregor Lingl and licensed under a BSD-like license
Files: Modules/_ctypes/libffi/*
Copyright: Copyright (C) 1996-2011 Red Hat, Inc and others.
Copyright (C) 1996-2011 Anthony Green
Copyright (C) 1996-2010 Free Software Foundation, Inc
Copyright (c) 2003, 2004, 2006, 2007, 2008 Kaz Kojima
Copyright (c) 2010, 2011, Plausible Labs Cooperative, Inc.
Copyright (c) 2010 CodeSourcery
Copyright (c) 1998 Andreas Schwab
Copyright (c) 2000 Hewlett Packard Company
Copyright (c) 2009 Bradley Smith
Copyright (c) 2008 David Daney
Copyright (c) 2004 Simon Posnjak
Copyright (c) 2005 Axis Communications AB
Copyright (c) 1998 Cygnus Solutions
Copyright (c) 2004 Renesas Technology
Copyright (c) 2002, 2007 Bo Thorsen <bo@suse.de>
Copyright (c) 2002 Ranjit Mathew
Copyright (c) 2002 Roger Sayle
Copyright (c) 2000, 2007 Software AG
Copyright (c) 2003 Jakub Jelinek
Copyright (c) 2000, 2001 John Hornkvist
Copyright (c) 1998 Geoffrey Keating
Copyright (c) 2008 Björn König
License: Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
``Software''), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
Documentation:
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version. A copy of the license is included in the
section entitled ``GNU General Public License''.
Files: Modules/_gestalt.c
Copyright: 1991-1997 by Stichting Mathematisch Centrum, Amsterdam.
License: Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the names of Stichting Mathematisch
Centrum or CWI not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior permission.
STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: Modules/syslogmodule.c
Copyright: 1994 by Lance Ellinghouse
License: Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Lance Ellinghouse
not be used in advertising or publicity pertaining to distribution
of the software without specific, written prior permission.
LANCE ELLINGHOUSE DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL LANCE ELLINGHOUSE BE LIABLE FOR ANY SPECIAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING
FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: Modules/zlib/*
Copyright: (C) 1995-2010 Jean-loup Gailly and Mark Adler
License: This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
Jean-loup Gailly <jloup@gzip.org>
Mark Adler <madler@alumni.caltech.edu>
If you use the zlib library in a product, we would appreciate *not* receiving
lengthy legal documents to sign. The sources are provided for free but without
warranty of any kind. The library has been entirely written by Jean-loup
Gailly and Mark Adler; it does not include third-party code.
Files: Modules/expat/*
Copyright: Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd
and Clark Cooper
Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006 Expat maintainers
License: Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Files: Modules/_decimal/libmpdec/*
Copyright: Copyright (c) 2008-2012 Stefan Krah. All rights reserved.
License: Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
.
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
Files: Misc/python-mode.el
Copyright: Copyright (C) 1992,1993,1994 Tim Peters
License: This software is provided as-is, without express or implied
warranty. Permission to use, copy, modify, distribute or sell this
software, without fee, for any purpose and by any individual or
organization, is hereby granted, provided that the above copyright
notice and this paragraph appear in all copies.
Files: Python/dtoa.c
Copyright: (c) 1991, 2000, 2001 by Lucent Technologies.
License: Permission to use, copy, modify, and distribute this software for any
purpose without fee is hereby granted, provided that this entire notice
is included in all copies of any software which is or includes a copy
or modification of this software and in all copies of the supporting
documentation for such software.
THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED
WARRANTY. IN PARTICULAR, NEITHER THE AUTHOR NOR LUCENT MAKES ANY
REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE MERCHANTABILITY
OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.
Files: Python/getopt.c
Copyright: 1992-1994, David Gottner
License: Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice, this permission notice and
the following disclaimer notice appear unmodified in all copies.
I DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL I
BE LIABLE FOR ANY SPECIAL, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA, OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: PC/_subprocess.c
Copyright: Copyright (c) 2004 by Fredrik Lundh <fredrik@pythonware.com>
Copyright (c) 2004 by Secret Labs AB, http://www.pythonware.com
Copyright (c) 2004 by Peter Astrand <astrand@lysator.liu.se>
License:
* Permission to use, copy, modify, and distribute this software and
* its associated documentation for any purpose and without fee is
* hereby granted, provided that the above copyright notice appears in
* all copies, and that both that copyright notice and this permission
* notice appear in supporting documentation, and that the name of the
* authors not be used in advertising or publicity pertaining to
* distribution of the software without specific, written prior
* permission.
*
* THE AUTHORS DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
* INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
* IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY SPECIAL, INDIRECT OR
* CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
* OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
* NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
* WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
Files: PC/winsound.c
Copyright: Copyright (c) 1999 Toby Dickenson
License: * Permission to use this software in any way is granted without
* fee, provided that the copyright notice above appears in all
* copies. This software is provided "as is" without any warranty.
*/
/* Modified by Guido van Rossum */
/* Beep added by Mark Hammond */
/* Win9X Beep and platform identification added by Uncle Timmy */
Files: Tools/pybench/*
Copyright: (c), 1997-2006, Marc-Andre Lemburg (mal@lemburg.com)
(c), 2000-2006, eGenix.com Software GmbH (info@egenix.com)
License: Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee or royalty is hereby
granted, provided that the above copyright notice appear in all copies
and that both that copyright notice and this permission notice appear
in supporting documentation or portions thereof, including
modifications, that you make.
THE AUTHOR MARC-ANDRE LEMBURG DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL,
INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING
FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
WITH THE USE OR PERFORMANCE OF THIS SOFTWARE !


@@ -0,0 +1,3 @@
opt/scylladb/python3/bin
opt/scylladb/python3/lib64
opt/scylladb/python3/libexec


@@ -0,0 +1,3 @@
bin/* opt/scylladb/python3/bin
lib64/* opt/scylladb/python3/lib64
libexec/* opt/scylladb/python3/libexec

dist/debian/python3/rules.mustache vendored Executable file

@@ -0,0 +1,22 @@
#!/usr/bin/make -f
export PYBUILD_DISABLE=1
override_dh_auto_configure:
override_dh_auto_build:
override_dh_strip:
override_dh_makeshlibs:
override_dh_shlibdeps:
override_dh_fixperms:
dh_fixperms
chmod 755 $(CURDIR)/debian/{{product}}-python3/opt/scylladb/python3/libexec/ld.so
override_dh_strip_nondeterminism:
%:
dh $@


@@ -15,6 +15,14 @@ override_dh_auto_install:
ln -sf /opt/scylladb/bin/scylla $(CURDIR)/debian/scylla-server/usr/bin/scylla
ln -sf /opt/scylladb/bin/iotune $(CURDIR)/debian/scylla-server/usr/bin/iotune
ln -sf /usr/lib/scylla/scyllatop/scyllatop.py $(CURDIR)/debian/scylla-server/usr/bin/scyllatop
find ./dist/common/scripts -type f -exec ./relocate_python_scripts.py \
--installroot $(CURDIR)/debian/scylla-server/usr/lib/scylla/ --with-python3 "$(CURDIR)/debian/scylla-server/opt/scylladb/python3/bin/python3" {} +
./relocate_python_scripts.py \
--installroot $(CURDIR)/debian/scylla-server/usr/lib/scylla/ --with-python3 "$(CURDIR)/debian/scylla-server/opt/scylladb/python3/bin/python3" \
seastar/scripts/perftune.py seastar/scripts/seastar-addr2line seastar/scripts/perftune.py
./relocate_python_scripts.py \
--installroot $(CURDIR)/debian/scylla-server/usr/lib/scylla/scyllatop/ --with-python3 "$(CURDIR)/debian/scylla-server/opt/scylladb/python3/bin/python3" \
tools/scyllatop/scyllatop.py
override_dh_installinit:
{{#scylla}}


@@ -1,11 +1,7 @@
dist/common/limits.d/scylla.conf etc/security/limits.d
dist/common/scylla.d/*.conf etc/scylla.d
seastar/dpdk/usertools/dpdk-devbind.py usr/lib/scylla
seastar/scripts/perftune.py usr/lib/scylla
seastar/scripts/seastar-addr2line usr/lib/scylla
seastar/scripts/seastar-cpu-map.sh usr/lib/scylla
dist/common/scripts/* usr/lib/scylla
tools/scyllatop usr/lib/scylla
swagger-ui/dist usr/lib/scylla/swagger-ui
api/api-doc usr/lib/scylla/api
bin/* opt/scylladb/bin


@@ -28,7 +28,7 @@ ADD commandlineparser.py /commandlineparser.py
ADD docker-entrypoint.py /docker-entrypoint.py
ADD node_exporter_install /node_exporter_install
# Install Scylla:
RUN curl http://downloads.scylladb.com/rpm/unstable/centos/master/latest/scylla.repo -o /etc/yum.repos.d/scylla.repo && \
RUN curl http://downloads.scylladb.com/rpm/centos/scylla-3.1.repo -o /etc/yum.repos.d/scylla.repo && \
yum -y install epel-release && \
yum -y clean expire-cache && \
yum -y update && \


@@ -4,3 +4,4 @@ stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
user=scylla


@@ -4,3 +4,4 @@ stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
user=scylla


@@ -4,3 +4,4 @@ stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
user=scylla


@@ -151,7 +151,7 @@ if __name__ == '__main__':
argp.add_argument('--user', '-u')
argp.add_argument('--password', '-p', default='none')
argp.add_argument('--node', default='127.0.0.1', help='Node to connect to.')
argp.add_argument('--port', default='9042', help='Port to connect to.')
argp.add_argument('--port', default=9042, help='Port to connect to.', type=int)
args = argp.parse_args()
res = validate_and_fix(args)
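The hunk above changes the `--port` option's default from the string `'9042'` to the integer `9042` with `type=int`. A minimal standalone sketch (a hypothetical parser, not the repo's script) illustrating why this matters, assuming stock `argparse` behavior:

```python
import argparse

argp = argparse.ArgumentParser()
# With type=int, user-supplied values are coerced to int, and the default is
# already an int; the old default='9042' left args.port a str when the flag
# was omitted, so comparisons against numeric ports would silently fail.
argp.add_argument('--port', default=9042, help='Port to connect to.', type=int)

args = argp.parse_args(['--port', '9043'])
assert isinstance(args.port, int) and args.port == 9043

args = argp.parse_args([])
assert isinstance(args.port, int) and args.port == 9042
```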


@@ -22,6 +22,7 @@
#pragma once
#include "clocks-impl.hh"
#include "hashing.hh"
#include <seastar/core/lowres_clock.hh>
@@ -71,3 +72,17 @@ using ttl_opt = std::optional<gc_clock::duration>;
static constexpr gc_clock::duration max_ttl = gc_clock::duration{20 * 365 * 24 * 60 * 60};
std::ostream& operator<<(std::ostream& os, gc_clock::time_point tp);
template<>
struct appending_hash<gc_clock::time_point> {
template<typename Hasher>
void operator()(Hasher& h, gc_clock::time_point t) const {
// Remain backwards-compatible with the 32-bit duration::rep (refs #4460).
uint64_t d64 = t.time_since_epoch().count();
feed_hash(h, uint32_t(d64 & 0xffff'ffff));
uint32_t msb = d64 >> 32;
if (msb) {
feed_hash(h, msb);
}
}
};
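The hunk above keeps the 64-bit tick count hashing identically to the old 32-bit `duration::rep` whenever the high word is zero: the low 32 bits are always fed, and the high 32 bits only when non-zero. A minimal Python sketch of the same scheme (the `feed_hash_u32` helper is a hypothetical stand-in for Scylla's `feed_hash`, not its actual API):

```python
import struct

def feed_hash_u32(state: list, value: int) -> None:
    # Stand-in for feed_hash over a 32-bit value: record the little-endian
    # bytes that would be fed into the hasher.
    state.append(struct.pack('<I', value & 0xffffffff))

def hash_time_point(state: list, ticks64: int) -> None:
    # Always feed the low 32 bits; feed the high 32 bits only when non-zero,
    # so values that fit in 32 bits hash exactly as under the old 32-bit
    # representation (refs #4460).
    feed_hash_u32(state, ticks64 & 0xffffffff)
    msb = ticks64 >> 32
    if msb:
        feed_hash_u32(state, msb)
```

For a small tick count the fed bytes match the legacy 32-bit feed exactly, while large values contribute an extra word.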


@@ -29,7 +29,7 @@ template <typename T> struct hasher_traits;
template <> struct hasher_traits<md5_hasher> { using impl_type = CryptoPP::Weak::MD5; };
template <> struct hasher_traits<sha256_hasher> { using impl_type = CryptoPP::SHA256; };
template <typename T, size_t size> struct hasher<T, size>::impl {
template <typename T, size_t size> struct cryptopp_hasher<T, size>::impl {
using impl_type = typename hasher_traits<T>::impl_type;
impl_type hash{};
@@ -53,35 +53,35 @@ template <typename T, size_t size> struct hasher<T, size>::impl {
}
};
template <typename T, size_t size> hasher<T, size>::hasher() : _impl(std::make_unique<impl>()) {}
template <typename T, size_t size> cryptopp_hasher<T, size>::cryptopp_hasher() : _impl(std::make_unique<impl>()) {}
template <typename T, size_t size> hasher<T, size>::~hasher() = default;
template <typename T, size_t size> cryptopp_hasher<T, size>::~cryptopp_hasher() = default;
template <typename T, size_t size> hasher<T, size>::hasher(hasher&& o) noexcept = default;
template <typename T, size_t size> cryptopp_hasher<T, size>::cryptopp_hasher(cryptopp_hasher&& o) noexcept = default;
template <typename T, size_t size> hasher<T, size>::hasher(const hasher& o) : _impl(std::make_unique<hasher<T, size>::impl>(*o._impl)) {}
template <typename T, size_t size> cryptopp_hasher<T, size>::cryptopp_hasher(const cryptopp_hasher& o) : _impl(std::make_unique<cryptopp_hasher<T, size>::impl>(*o._impl)) {}
template <typename T, size_t size> hasher<T, size>& hasher<T, size>::operator=(hasher&& o) noexcept = default;
template <typename T, size_t size> cryptopp_hasher<T, size>& cryptopp_hasher<T, size>::operator=(cryptopp_hasher&& o) noexcept = default;
template <typename T, size_t size> hasher<T, size>& hasher<T, size>::operator=(const hasher& o) {
_impl = std::make_unique<hasher<T, size>::impl>(*o._impl);
template <typename T, size_t size> cryptopp_hasher<T, size>& cryptopp_hasher<T, size>::operator=(const cryptopp_hasher& o) {
_impl = std::make_unique<cryptopp_hasher<T, size>::impl>(*o._impl);
return *this;
}
template <typename T, size_t size> bytes hasher<T, size>::finalize() { return _impl->finalize(); }
template <typename T, size_t size> bytes cryptopp_hasher<T, size>::finalize() { return _impl->finalize(); }
template <typename T, size_t size> std::array<uint8_t, size> hasher<T, size>::finalize_array() {
template <typename T, size_t size> std::array<uint8_t, size> cryptopp_hasher<T, size>::finalize_array() {
return _impl->finalize_array();
}
template <typename T, size_t size> void hasher<T, size>::update(const char* ptr, size_t length) { _impl->update(ptr, length); }
template <typename T, size_t size> void cryptopp_hasher<T, size>::update(const char* ptr, size_t length) { _impl->update(ptr, length); }
template <typename T, size_t size> bytes hasher<T, size>::calculate(const std::string_view& s) {
typename hasher<T, size>::impl::impl_type hash;
template <typename T, size_t size> bytes cryptopp_hasher<T, size>::calculate(const std::string_view& s) {
typename cryptopp_hasher<T, size>::impl::impl_type hash;
unsigned char digest[size];
hash.CalculateDigest(digest, reinterpret_cast<const unsigned char*>(s.data()), s.size());
return std::move(bytes{reinterpret_cast<const int8_t*>(digest), size});
}
template class hasher<md5_hasher, 16>;
template class hasher<sha256_hasher, 32>;
template class cryptopp_hasher<md5_hasher, 16>;
template class cryptopp_hasher<sha256_hasher, 32>;


@@ -22,29 +22,30 @@
#pragma once
#include "bytes.hh"
#include "hashing.hh"
class md5_hasher;
template <typename T, size_t size> class hasher {
template <typename T, size_t size> class cryptopp_hasher : public hasher {
struct impl;
std::unique_ptr<impl> _impl;
public:
hasher();
~hasher();
hasher(hasher&&) noexcept;
hasher(const hasher&);
hasher& operator=(hasher&&) noexcept;
hasher& operator=(const hasher&);
cryptopp_hasher();
~cryptopp_hasher();
cryptopp_hasher(cryptopp_hasher&&) noexcept;
cryptopp_hasher(const cryptopp_hasher&);
cryptopp_hasher& operator=(cryptopp_hasher&&) noexcept;
cryptopp_hasher& operator=(const cryptopp_hasher&);
bytes finalize();
std::array<uint8_t, size> finalize_array();
void update(const char* ptr, size_t length);
void update(const char* ptr, size_t length) override;
// Use update and finalize to compute the hash over the full view.
static bytes calculate(const std::string_view& s);
};
class md5_hasher : public hasher<md5_hasher, 16> {};
class md5_hasher final : public cryptopp_hasher<md5_hasher, 16> {};
class sha256_hasher : public hasher<sha256_hasher, 32> {};
class sha256_hasher final : public cryptopp_hasher<sha256_hasher, 32> {};


@@ -27,6 +27,7 @@
#include <seastar/core/byteorder.hh>
#include <seastar/core/sstring.hh>
#include "seastarx.hh"
#include <seastar/util/gcc6-concepts.hh>
//
// This hashing differs from std::hash<> in that it decouples knowledge about
@@ -41,24 +42,38 @@
// appending_hash<T> is machine-independent.
//
// The Hasher concept
struct Hasher {
void update(const char* ptr, size_t size);
GCC6_CONCEPT(
template<typename H>
concept bool Hasher() {
return requires(H& h, const char* ptr, size_t size) {
{ h.update(ptr, size) } -> void
};
}
)
class hasher {
public:
virtual ~hasher() = default;
virtual void update(const char* ptr, size_t size) = 0;
};
GCC6_CONCEPT(static_assert(Hasher<hasher>());)
template<typename T, typename Enable = void>
struct appending_hash;
template<typename Hasher, typename T, typename... Args>
template<typename H, typename T, typename... Args>
GCC6_CONCEPT(requires Hasher<H>())
inline
void feed_hash(Hasher& h, const T& value, Args&&... args) {
void feed_hash(H& h, const T& value, Args&&... args) {
appending_hash<T>()(h, value, std::forward<Args>(args)...);
};
template<typename T>
struct appending_hash<T, std::enable_if_t<std::is_arithmetic<T>::value>> {
template<typename Hasher>
void operator()(Hasher& h, T value) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, T value) const {
auto value_le = cpu_to_le(value);
h.update(reinterpret_cast<const char*>(&value_le), sizeof(T));
}
@@ -66,24 +81,27 @@ struct appending_hash<T, std::enable_if_t<std::is_arithmetic<T>::value>> {
template<>
struct appending_hash<bool> {
template<typename Hasher>
void operator()(Hasher& h, bool value) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, bool value) const {
feed_hash(h, static_cast<uint8_t>(value));
}
};
template<typename T>
struct appending_hash<T, std::enable_if_t<std::is_enum<T>::value>> {
template<typename Hasher>
void operator()(Hasher& h, const T& value) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const T& value) const {
feed_hash(h, static_cast<std::underlying_type_t<T>>(value));
}
};
template<typename T>
struct appending_hash<std::optional<T>> {
template<typename Hasher>
void operator()(Hasher& h, const std::optional<T>& value) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const std::optional<T>& value) const {
if (value) {
feed_hash(h, true);
feed_hash(h, *value);
@@ -95,8 +113,9 @@ struct appending_hash<std::optional<T>> {
template<size_t N>
struct appending_hash<char[N]> {
template<typename Hasher>
void operator()(Hasher& h, const char (&value) [N]) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const char (&value) [N]) const {
feed_hash(h, N);
h.update(value, N);
}
@@ -104,8 +123,9 @@ struct appending_hash<char[N]> {
template<typename T>
struct appending_hash<std::vector<T>> {
template<typename Hasher>
void operator()(Hasher& h, const std::vector<T>& value) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const std::vector<T>& value) const {
feed_hash(h, value.size());
for (auto&& v : value) {
appending_hash<T>()(h, v);
@@ -115,8 +135,9 @@ struct appending_hash<std::vector<T>> {
template<typename K, typename V>
struct appending_hash<std::map<K, V>> {
template<typename Hasher>
void operator()(Hasher& h, const std::map<K, V>& value) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const std::map<K, V>& value) const {
feed_hash(h, value.size());
for (auto&& e : value) {
appending_hash<K>()(h, e.first);
@@ -127,8 +148,9 @@ struct appending_hash<std::map<K, V>> {
template<>
struct appending_hash<sstring> {
template<typename Hasher>
void operator()(Hasher& h, const sstring& v) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const sstring& v) const {
feed_hash(h, v.size());
h.update(reinterpret_cast<const char*>(v.cbegin()), v.size() * sizeof(sstring::value_type));
}
@@ -136,8 +158,9 @@ struct appending_hash<sstring> {
template<>
struct appending_hash<std::string> {
template<typename Hasher>
void operator()(Hasher& h, const std::string& v) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, const std::string& v) const {
feed_hash(h, v.size());
h.update(reinterpret_cast<const char*>(v.data()), v.size() * sizeof(std::string::value_type));
}
@@ -145,16 +168,18 @@ struct appending_hash<std::string> {
template<typename T, typename R>
struct appending_hash<std::chrono::duration<T, R>> {
template<typename Hasher>
void operator()(Hasher& h, std::chrono::duration<T, R> v) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, std::chrono::duration<T, R> v) const {
feed_hash(h, v.count());
}
};
template<typename Clock, typename Duration>
struct appending_hash<std::chrono::time_point<Clock, Duration>> {
template<typename Hasher>
void operator()(Hasher& h, std::chrono::time_point<Clock, Duration> v) const {
template<typename H>
GCC6_CONCEPT(requires Hasher<H>())
void operator()(H& h, std::chrono::time_point<Clock, Duration> v) const {
feed_hash(h, v.time_since_epoch().count());
}
};


@@ -916,8 +916,10 @@ int main(int ac, char** av) {
service::get_local_storage_service().drain_on_shutdown().get();
});
auto stop_view_builder = defer([] {
view_builder.stop().get();
auto stop_view_builder = defer([cfg] {
if (cfg->view_building()) {
view_builder.stop().get();
}
});
auto stop_compaction_manager = defer([&db] {


@@ -1077,14 +1077,14 @@ future<> messaging_service::send_repair_put_row_diff(msg_addr id, uint32_t repai
}
// Wrapper for REPAIR_ROW_LEVEL_START
void messaging_service::register_repair_row_level_start(std::function<future<> (const rpc::client_info& cinfo, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name)>&& func) {
void messaging_service::register_repair_row_level_start(std::function<future<> (const rpc::client_info& cinfo, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name, table_schema_version schema_version)>&& func) {
register_handler(this, messaging_verb::REPAIR_ROW_LEVEL_START, std::move(func));
}
void messaging_service::unregister_repair_row_level_start() {
_rpc->unregister_handler(messaging_verb::REPAIR_ROW_LEVEL_START);
}
future<> messaging_service::send_repair_row_level_start(msg_addr id, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name) {
return send_message<void>(this, messaging_verb::REPAIR_ROW_LEVEL_START, std::move(id), repair_meta_id, std::move(keyspace_name), std::move(cf_name), std::move(range), algo, max_row_buf_size, seed, remote_shard, remote_shard_count, remote_ignore_msb, std::move(remote_partitioner_name));
future<> messaging_service::send_repair_row_level_start(msg_addr id, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name, table_schema_version schema_version) {
return send_message<void>(this, messaging_verb::REPAIR_ROW_LEVEL_START, std::move(id), repair_meta_id, std::move(keyspace_name), std::move(cf_name), std::move(range), algo, max_row_buf_size, seed, remote_shard, remote_shard_count, remote_ignore_msb, std::move(remote_partitioner_name), std::move(schema_version));
}
// Wrapper for REPAIR_ROW_LEVEL_STOP


@@ -311,9 +311,9 @@ public:
future<> send_repair_put_row_diff(msg_addr id, uint32_t repair_meta_id, repair_rows_on_wire row_diff);
// Wrapper for REPAIR_ROW_LEVEL_START
void register_repair_row_level_start(std::function<future<> (const rpc::client_info& cinfo, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name)>&& func);
void register_repair_row_level_start(std::function<future<> (const rpc::client_info& cinfo, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name, table_schema_version schema_version)>&& func);
void unregister_repair_row_level_start();
future<> send_repair_row_level_start(msg_addr id, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name);
future<> send_repair_row_level_start(msg_addr id, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed, unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name, table_schema_version schema_version);
// Wrapper for REPAIR_ROW_LEVEL_STOP
void register_repair_row_level_stop(std::function<future<> (const rpc::client_info& cinfo, uint32_t repair_meta_id, sstring keyspace_name, sstring cf_name, dht::token_range range)>&& func);

reloc/python3/build_deb.sh Executable file

@@ -0,0 +1,37 @@
#!/bin/bash -e
. /etc/os-release
print_usage() {
echo "build_deb.sh --reloc-pkg build/release/scylla-python3-package.tar.gz"
echo " --reloc-pkg specify relocatable package path"
exit 1
}
RELOC_PKG=build/release/scylla-python3-package.tar.gz
OPTS=""
while [ $# -gt 0 ]; do
case "$1" in
"--reloc-pkg")
OPTS="$OPTS $1 $(readlink -f $2)"
RELOC_PKG=$2
shift 2
;;
*)
print_usage
;;
esac
done
if [ ! -e $RELOC_PKG ]; then
echo "$RELOC_PKG does not exist."
echo "Run ./reloc/python3/build_reloc.sh first."
exit 1
fi
RELOC_PKG=$(readlink -f $RELOC_PKG)
if [[ ! $OPTS =~ --reloc-pkg ]]; then
OPTS="$OPTS --reloc-pkg $RELOC_PKG"
fi
mkdir -p build/debian/scylla-python3-package
tar -C build/debian/scylla-python3-package -xpf $RELOC_PKG
cd build/debian/scylla-python3-package
exec ./dist/debian/python3/build_deb.sh $OPTS


@@ -940,8 +940,20 @@ static future<> repair_cf_range(repair_info& ri,
// Comparable to RepairSession in Origin
static future<> repair_range(repair_info& ri, const dht::token_range& range) {
auto id = utils::UUID_gen::get_time_UUID();
return do_with(get_neighbors(ri.db.local(), ri.keyspace, range, ri.data_centers, ri.hosts), [&ri, range, id] (const auto& neighbors) {
rlogger.debug("[repair #{}] new session: will sync {} on range {} for {}.{}", id, neighbors, range, ri.keyspace, ri.cfs);
return do_with(get_neighbors(ri.db.local(), ri.keyspace, range, ri.data_centers, ri.hosts), [&ri, range, id] (std::vector<gms::inet_address>& neighbors) {
auto live_neighbors = boost::copy_range<std::vector<gms::inet_address>>(neighbors |
boost::adaptors::filtered([] (const gms::inet_address& node) { return gms::get_local_gossiper().is_alive(node); }));
if (live_neighbors.size() != neighbors.size()) {
ri.nr_failed_ranges++;
auto status = live_neighbors.empty() ? "skipped" : "partial";
rlogger.warn("Repair {} out of {} ranges, id={}, shard={}, keyspace={}, table={}, range={}, peers={}, live_peers={}, status={}",
ri.ranges_index, ri.ranges.size(), ri.id, ri.shard, ri.keyspace, ri.cfs, range, neighbors, live_neighbors, status);
if (live_neighbors.empty()) {
return make_ready_future<>();
}
neighbors.swap(live_neighbors);
}
return ::service::get_local_migration_manager().sync_schema(ri.db.local(), neighbors).then([&neighbors, &ri, range, id] {
return do_for_each(ri.cfs.begin(), ri.cfs.end(), [&ri, &neighbors, range] (auto&& cf) {
ri._sub_ranges_nr++;
if (ri.row_level_repair()) {
@@ -950,6 +962,7 @@ static future<> repair_range(repair_info& ri, const dht::token_range& range) {
return repair_cf_range(ri, cf, range, neighbors);
}
});
});
});
}
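The hunk above filters the repair neighbors down to the live ones before syncing: a fully live set proceeds normally, a partially live set repairs against the live subset and counts the range as failed, and an empty live set skips the range. A minimal Python sketch of that decision (names are illustrative; the gossiper liveness check is stubbed as a set of live nodes):

```python
def plan_repair(neighbors, live_nodes):
    """Mirror the repair_range() filtering: keep only live peers and
    classify the range. Returns (peers_to_sync, status, failed)."""
    live = [n for n in neighbors if n in live_nodes]
    if len(live) == len(neighbors):
        return live, "full", False       # everyone is up, normal repair
    if not live:
        return [], "skipped", True       # nothing to sync against
    return live, "partial", True         # sync with the live subset only

peers, status, failed = plan_repair(["n1", "n2", "n3"], {"n1", "n3"})
```

As in the patch, a partial set still runs the repair but bumps the failed-ranges counter so the overall repair is reported as incomplete.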


@@ -295,6 +295,7 @@ public:
void push_mutation_fragment(frozen_mutation_fragment mf) { _mfs.push_back(std::move(mf)); }
};
using repair_row_on_wire = partition_key_and_mutation_fragments;
using repair_rows_on_wire = std::list<partition_key_and_mutation_fragments>;
enum class row_level_diff_detect_algorithm : uint8_t {


@@ -152,8 +152,8 @@ class fragment_hasher {
xx_hasher& _hasher;
private:
void consume_cell(const column_definition& col, const atomic_cell_or_collection& cell) {
feed_hash(_hasher, col.name());
feed_hash(_hasher, col.type->name());
feed_hash(_hasher, col.kind);
feed_hash(_hasher, col.id);
feed_hash(_hasher, cell, col);
}
public:
@@ -220,43 +220,62 @@ private:
};
class repair_row {
frozen_mutation_fragment _fm;
std::optional<frozen_mutation_fragment> _fm;
lw_shared_ptr<const decorated_key_with_hash> _dk_with_hash;
repair_sync_boundary _boundary;
repair_hash _hash;
std::optional<repair_sync_boundary> _boundary;
std::optional<repair_hash> _hash;
lw_shared_ptr<mutation_fragment> _mf;
public:
repair_row() = delete;
repair_row(frozen_mutation_fragment fm,
position_in_partition pos,
repair_row(std::optional<frozen_mutation_fragment> fm,
std::optional<position_in_partition> pos,
lw_shared_ptr<const decorated_key_with_hash> dk_with_hash,
repair_hash hash,
std::optional<repair_hash> hash,
lw_shared_ptr<mutation_fragment> mf = {})
: _fm(std::move(fm))
, _dk_with_hash(std::move(dk_with_hash))
, _boundary({_dk_with_hash->dk, std::move(pos)})
, _boundary(pos ? std::optional<repair_sync_boundary>(repair_sync_boundary{_dk_with_hash->dk, std::move(*pos)}) : std::nullopt)
, _hash(std::move(hash))
, _mf(std::move(mf)) {
}
mutation_fragment& get_mutation_fragment() {
if (!_mf) {
throw std::runtime_error("get empty mutation_fragment");
throw std::runtime_error("empty mutation_fragment");
}
return *_mf;
}
frozen_mutation_fragment& get_frozen_mutation() { return _fm; }
const frozen_mutation_fragment& get_frozen_mutation() const { return _fm; }
frozen_mutation_fragment& get_frozen_mutation() {
if (!_fm) {
throw std::runtime_error("empty frozen_mutation_fragment");
}
return *_fm;
}
const frozen_mutation_fragment& get_frozen_mutation() const {
if (!_fm) {
throw std::runtime_error("empty frozen_mutation_fragment");
}
return *_fm;
}
const lw_shared_ptr<const decorated_key_with_hash>& get_dk_with_hash() const {
return _dk_with_hash;
}
size_t size() const {
return _fm.representation().size();
if (!_fm) {
throw std::runtime_error("empty size due to empty frozen_mutation_fragment");
}
return _fm->representation().size();
}
const repair_sync_boundary& boundary() const {
return _boundary;
if (!_boundary) {
throw std::runtime_error("empty repair_sync_boundary");
}
return *_boundary;
}
const repair_hash& hash() const {
return _hash;
if (!_hash) {
throw std::runtime_error("empty hash");
}
return *_hash;
}
};
@@ -284,13 +303,14 @@ public:
repair_reader(
seastar::sharded<database>& db,
column_family& cf,
schema_ptr s,
dht::token_range range,
dht::i_partitioner& local_partitioner,
dht::i_partitioner& remote_partitioner,
unsigned remote_shard,
uint64_t seed,
is_local_reader local_reader)
: _schema(cf.schema())
: _schema(s)
, _range(dht::to_partition_range(range))
, _sharder(remote_partitioner, range, remote_shard)
, _seed(seed)
@@ -458,8 +478,8 @@ public:
private:
seastar::sharded<database>& _db;
column_family& _cf;
dht::token_range _range;
schema_ptr _schema;
dht::token_range _range;
repair_sync_boundary::tri_compare _cmp;
// The algorithm used to find the row difference
row_level_diff_detect_algorithm _algo;
@@ -519,6 +539,7 @@ public:
repair_meta(
seastar::sharded<database>& db,
column_family& cf,
schema_ptr s,
dht::token_range range,
row_level_diff_detect_algorithm algo,
size_t max_row_buf_size,
@@ -529,8 +550,8 @@ public:
size_t nr_peer_nodes = 1)
: _db(db)
, _cf(cf)
, _schema(s)
, _range(range)
, _schema(cf.schema())
, _cmp(repair_sync_boundary::tri_compare(*_schema))
, _algo(algo)
, _max_row_buf_size(max_row_buf_size)
@@ -545,6 +566,7 @@ public:
, _repair_reader(
_db,
_cf,
_schema,
_range,
dht::global_partitioner(),
*_remote_partitioner,
@@ -577,35 +599,45 @@ public:
}
}
static void
static future<>
insert_repair_meta(const gms::inet_address& from,
uint32_t src_cpu_id,
uint32_t repair_meta_id,
sstring ks_name,
sstring cf_name,
dht::token_range range,
row_level_diff_detect_algorithm algo,
uint64_t max_row_buf_size,
uint64_t seed,
shard_config master_node_shard_config) {
node_repair_meta_id id{from, repair_meta_id};
auto& db = service::get_local_storage_proxy().get_db();
auto& cf = db.local().find_column_family(ks_name, cf_name);
auto rm = make_lw_shared<repair_meta>(db,
cf,
shard_config master_node_shard_config,
table_schema_version schema_version) {
return service::get_schema_for_write(schema_version, {from, src_cpu_id}).then([from,
repair_meta_id,
range,
algo,
max_row_buf_size,
seed,
repair_meta::repair_master::no,
repair_meta_id,
std::move(master_node_shard_config));
bool insertion = repair_meta_map().emplace(id, rm).second;
if (!insertion) {
rlogger.warn("insert_repair_meta: repair_meta_id {} for node {} already exists, replace existing one", id.repair_meta_id, id.ip);
repair_meta_map()[id] = rm;
} else {
rlogger.debug("insert_repair_meta: Inserted repair_meta_id {} for node {}", id.repair_meta_id, id.ip);
}
master_node_shard_config,
schema_version] (schema_ptr s) {
auto& db = service::get_local_storage_proxy().get_db();
auto& cf = db.local().find_column_family(s->id());
node_repair_meta_id id{from, repair_meta_id};
auto rm = make_lw_shared<repair_meta>(db,
cf,
s,
range,
algo,
max_row_buf_size,
seed,
repair_meta::repair_master::no,
repair_meta_id,
std::move(master_node_shard_config));
bool insertion = repair_meta_map().emplace(id, rm).second;
if (!insertion) {
rlogger.warn("insert_repair_meta: repair_meta_id {} for node {} already exists, replace existing one", id.repair_meta_id, id.ip);
repair_meta_map()[id] = rm;
} else {
rlogger.debug("insert_repair_meta: Inserted repair_meta_id {} for node {}", id.repair_meta_id, id.ip);
}
});
}
static future<>
@@ -642,7 +674,11 @@ public:
}
}
return parallel_for_each(*repair_metas, [repair_metas] (auto& rm) {
return rm->stop();
return rm->stop().then([&rm] {
rm = {};
});
}).then([repair_metas, from] {
rlogger.debug("Removed all repair_meta for single node {}", from);
});
}
@@ -654,7 +690,11 @@ public:
| boost::adaptors::map_values));
repair_meta_map().clear();
return parallel_for_each(*repair_metas, [repair_metas] (auto& rm) {
return rm->stop();
return rm->stop().then([&rm] {
rm = {};
});
}).then([repair_metas] {
rlogger.debug("Removed all repair_meta for all nodes");
});
}
@@ -952,12 +992,12 @@ private:
}
return to_repair_rows_list(rows).then([this, from, node_idx, update_buf, update_hash_set] (std::list<repair_row> row_diff) {
return do_with(std::move(row_diff), [this, from, node_idx, update_buf, update_hash_set] (std::list<repair_row>& row_diff) {
auto sz = get_repair_rows_size(row_diff);
stats().rx_row_bytes += sz;
stats().rx_row_nr += row_diff.size();
stats().rx_row_nr_peer[from] += row_diff.size();
_metrics.rx_row_nr += row_diff.size();
_metrics.rx_row_bytes += sz;
if (_repair_master) {
auto sz = get_repair_rows_size(row_diff);
stats().rx_row_bytes += sz;
stats().rx_row_nr += row_diff.size();
stats().rx_row_nr_peer[from] += row_diff.size();
}
if (update_buf) {
std::list<repair_row> tmp;
tmp.swap(_working_row_buf);
@@ -993,11 +1033,16 @@ private:
return do_with(repair_rows_on_wire(), std::move(row_list), [this] (repair_rows_on_wire& rows, std::list<repair_row>& row_list) {
return do_for_each(row_list, [this, &rows] (repair_row& r) {
auto pk = r.get_dk_with_hash()->dk.key();
auto it = std::find_if(rows.begin(), rows.end(), [&pk, s=_schema] (partition_key_and_mutation_fragments& row) { return pk.legacy_equal(*s, row.get_key()); });
if (it == rows.end()) {
rows.push_back(partition_key_and_mutation_fragments(std::move(pk), {std::move(r.get_frozen_mutation())}));
// No need to search from the beginning of the rows. Looking at the end of repair_rows_on_wire is enough.
if (rows.empty()) {
rows.push_back(repair_row_on_wire(std::move(pk), {std::move(r.get_frozen_mutation())}));
} else {
it->push_mutation_fragment(std::move(r.get_frozen_mutation()));
auto& row = rows.back();
if (pk.legacy_equal(*_schema, row.get_key())) {
row.push_mutation_fragment(std::move(r.get_frozen_mutation()));
} else {
rows.push_back(repair_row_on_wire(std::move(pk), {std::move(r.get_frozen_mutation())}));
}
}
}).then([&rows] {
return std::move(rows);
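The change above relies on repair rows arriving ordered by partition key, so grouping fragments into repair_rows_on_wire only ever needs to compare against the last entry instead of scanning the whole list. The same append-or-extend-at-tail pattern, sketched in Python with plain tuples standing in for the C++ types:

```python
def group_rows_on_wire(rows):
    """rows: iterable of (partition_key, fragment) pairs, ordered by key.
    Returns a list of (partition_key, [fragments]) groups. Only the tail
    of the output is inspected per row, so grouping is O(n) instead of
    the O(n^2) full scan the old std::find_if performed."""
    out = []
    for pk, frag in rows:
        if out and out[-1][0] == pk:   # same partition as the tail entry
            out[-1][1].append(frag)
        else:                          # a new partition starts
            out.append((pk, [frag]))
    return out
```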
@@ -1006,23 +1051,47 @@ private:
};
future<std::list<repair_row>> to_repair_rows_list(repair_rows_on_wire rows) {
return do_with(std::move(rows), std::list<repair_row>(), lw_shared_ptr<const decorated_key_with_hash>(),
[this] (repair_rows_on_wire& rows, std::list<repair_row>& row_list, lw_shared_ptr<const decorated_key_with_hash>& dk_ptr) mutable {
return do_for_each(rows, [this, &dk_ptr, &row_list] (partition_key_and_mutation_fragments& x) mutable {
return do_with(std::move(rows), std::list<repair_row>(), lw_shared_ptr<const decorated_key_with_hash>(), lw_shared_ptr<mutation_fragment>(), position_in_partition::tri_compare(*_schema),
[this] (repair_rows_on_wire& rows, std::list<repair_row>& row_list, lw_shared_ptr<const decorated_key_with_hash>& dk_ptr, lw_shared_ptr<mutation_fragment>& last_mf, position_in_partition::tri_compare& cmp) mutable {
return do_for_each(rows, [this, &dk_ptr, &row_list, &last_mf, &cmp] (partition_key_and_mutation_fragments& x) mutable {
dht::decorated_key dk = dht::global_partitioner().decorate_key(*_schema, x.get_key());
if (!(dk_ptr && dk_ptr->dk.equal(*_schema, dk))) {
dk_ptr = make_lw_shared<const decorated_key_with_hash>(*_schema, dk, _seed);
}
return do_for_each(x.get_mutation_fragments(), [this, &dk_ptr, &row_list] (frozen_mutation_fragment& fmf) mutable {
// Keep the mutation_fragment in repair_row as an
// optimization to avoid unfreeze again when
// mutation_fragment is needed by _repair_writer.do_write()
// to apply the repair_row to disk
auto mf = make_lw_shared<mutation_fragment>(fmf.unfreeze(*_schema));
auto hash = do_hash_for_mf(*dk_ptr, *mf);
position_in_partition pos(mf->position());
row_list.push_back(repair_row(std::move(fmf), std::move(pos), dk_ptr, std::move(hash), std::move(mf)));
});
if (_repair_master) {
return do_for_each(x.get_mutation_fragments(), [this, &dk_ptr, &row_list] (frozen_mutation_fragment& fmf) mutable {
_metrics.rx_row_nr += 1;
_metrics.rx_row_bytes += fmf.representation().size();
// Keep the mutation_fragment in repair_row as an
// optimization to avoid unfreeze again when
// mutation_fragment is needed by _repair_writer.do_write()
// to apply the repair_row to disk
auto mf = make_lw_shared<mutation_fragment>(fmf.unfreeze(*_schema));
auto hash = do_hash_for_mf(*dk_ptr, *mf);
position_in_partition pos(mf->position());
row_list.push_back(repair_row(std::move(fmf), std::move(pos), dk_ptr, std::move(hash), std::move(mf)));
});
} else {
last_mf = {};
return do_for_each(x.get_mutation_fragments(), [this, &dk_ptr, &row_list, &last_mf, &cmp] (frozen_mutation_fragment& fmf) mutable {
_metrics.rx_row_nr += 1;
_metrics.rx_row_bytes += fmf.representation().size();
auto mf = make_lw_shared<mutation_fragment>(fmf.unfreeze(*_schema));
position_in_partition pos(mf->position());
// If the mutation_fragment has the same position as
// the last mutation_fragment, it means they are the
// same row with different contents. We can not feed
// such rows into the sstable writer. Instead we apply
// the mutation_fragment into the previous one.
if (last_mf && cmp(last_mf->position(), pos) == 0 && last_mf->mergeable_with(*mf)) {
last_mf->apply(*_schema, std::move(*mf));
} else {
last_mf = mf;
// On repair follower node, only decorated_key_with_hash and the mutation_fragment inside repair_row are used.
row_list.push_back(repair_row({}, {}, dk_ptr, {}, std::move(mf)));
}
});
}
}).then([&row_list] {
return std::move(row_list);
});
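On the follower path above, two wire fragments at the same position are the same row carrying different content, and the sstable writer cannot accept duplicates, so the new fragment is applied onto the previous one. A generic sketch of that coalescing pass (the `position`/`mergeable`/`merge` callbacks are hypothetical stand-ins for the mutation_fragment operations):

```python
def coalesce_fragments(frags, position, mergeable, merge):
    """Merge consecutive fragments that share a position, as
    to_repair_rows_list() does on the repair follower, so the
    downstream writer never sees two rows at the same position."""
    out = []
    for mf in frags:
        if out and position(out[-1]) == position(mf) and mergeable(out[-1], mf):
            out[-1] = merge(out[-1], mf)   # fold into the previous row
        else:
            out.append(mf)
    return out

frags = [{"pos": 1, "cells": {"a"}},
         {"pos": 1, "cells": {"b"}},   # same position: must be merged
         {"pos": 2, "cells": {"c"}}]
merged = coalesce_fragments(
    frags,
    position=lambda f: f["pos"],
    mergeable=lambda a, b: True,
    merge=lambda a, b: {"pos": a["pos"], "cells": a["cells"] | b["cells"]})
```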
@@ -1084,29 +1153,28 @@ public:
// RPC API
future<>
repair_row_level_start(gms::inet_address remote_node, sstring ks_name, sstring cf_name, dht::token_range range) {
repair_row_level_start(gms::inet_address remote_node, sstring ks_name, sstring cf_name, dht::token_range range, table_schema_version schema_version) {
if (remote_node == _myip) {
return make_ready_future<>();
}
stats().rpc_call_nr++;
return netw::get_local_messaging_service().send_repair_row_level_start(msg_addr(remote_node),
_repair_meta_id, std::move(ks_name), std::move(cf_name), std::move(range), _algo, _max_row_buf_size, _seed,
_master_node_shard_config.shard, _master_node_shard_config.shard_count, _master_node_shard_config.ignore_msb, _master_node_shard_config.partitioner_name);
_master_node_shard_config.shard, _master_node_shard_config.shard_count, _master_node_shard_config.ignore_msb, _master_node_shard_config.partitioner_name, std::move(schema_version));
}
// RPC handler
static future<>
repair_row_level_start_handler(gms::inet_address from, uint32_t repair_meta_id, sstring ks_name, sstring cf_name,
repair_row_level_start_handler(gms::inet_address from, uint32_t src_cpu_id, uint32_t repair_meta_id, sstring ks_name, sstring cf_name,
dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size,
uint64_t seed, shard_config master_node_shard_config) {
uint64_t seed, shard_config master_node_shard_config, table_schema_version schema_version) {
if (!_sys_dist_ks->local_is_initialized() || !_view_update_generator->local_is_initialized()) {
return make_exception_future<>(std::runtime_error(format("Node {} is not fully initialized for repair, try again later",
utils::fb_utilities::get_broadcast_address())));
}
rlogger.debug(">>> Started Row Level Repair (Follower): local={}, peers={}, repair_meta_id={}, keyspace={}, cf={}, range={}",
utils::fb_utilities::get_broadcast_address(), from, repair_meta_id, ks_name, cf_name, range);
insert_repair_meta(from, repair_meta_id, std::move(ks_name), std::move(cf_name), std::move(range), algo, max_row_buf_size, seed, std::move(master_node_shard_config));
return make_ready_future<>();
rlogger.debug(">>> Started Row Level Repair (Follower): local={}, peers={}, repair_meta_id={}, keyspace={}, cf={}, schema_version={}, range={}",
utils::fb_utilities::get_broadcast_address(), from, repair_meta_id, ks_name, cf_name, schema_version, range);
return insert_repair_meta(from, src_cpu_id, repair_meta_id, std::move(range), algo, max_row_buf_size, seed, std::move(master_node_shard_config), std::move(schema_version));
}
// RPC API
@@ -1313,14 +1381,15 @@ future<> repair_init_messaging_service_handler(repair_service& rs, distributed<d
});
ms.register_repair_row_level_start([] (const rpc::client_info& cinfo, uint32_t repair_meta_id, sstring ks_name,
sstring cf_name, dht::token_range range, row_level_diff_detect_algorithm algo, uint64_t max_row_buf_size, uint64_t seed,
unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name) {
unsigned remote_shard, unsigned remote_shard_count, unsigned remote_ignore_msb, sstring remote_partitioner_name, table_schema_version schema_version) {
auto src_cpu_id = cinfo.retrieve_auxiliary<uint32_t>("src_cpu_id");
auto from = cinfo.retrieve_auxiliary<gms::inet_address>("baddr");
return smp::submit_to(src_cpu_id % smp::count, [from, repair_meta_id, ks_name, cf_name,
range, algo, max_row_buf_size, seed, remote_shard, remote_shard_count, remote_ignore_msb, remote_partitioner_name] () mutable {
return repair_meta::repair_row_level_start_handler(from, repair_meta_id, std::move(ks_name),
return smp::submit_to(src_cpu_id % smp::count, [from, src_cpu_id, repair_meta_id, ks_name, cf_name,
range, algo, max_row_buf_size, seed, remote_shard, remote_shard_count, remote_ignore_msb, remote_partitioner_name, schema_version] () mutable {
return repair_meta::repair_row_level_start_handler(from, src_cpu_id, repair_meta_id, std::move(ks_name),
std::move(cf_name), std::move(range), algo, max_row_buf_size, seed,
shard_config{remote_shard, remote_shard_count, remote_ignore_msb, std::move(remote_partitioner_name)});
shard_config{remote_shard, remote_shard_count, remote_ignore_msb, std::move(remote_partitioner_name)},
schema_version);
});
});
ms.register_repair_row_level_stop([] (const rpc::client_info& cinfo, uint32_t repair_meta_id,
@@ -1608,8 +1677,12 @@ public:
dht::global_partitioner().sharding_ignore_msb(),
dht::global_partitioner().name()
};
auto s = _cf.schema();
auto schema_version = s->version();
repair_meta master(_ri.db,
_cf,
s,
_range,
algorithm,
_max_row_buf_size,
@@ -1622,12 +1695,13 @@ public:
// All nodes including the node itself.
_all_nodes.insert(_all_nodes.begin(), master.myip());
rlogger.debug(">>> Started Row Level Repair (Master): local={}, peers={}, repair_meta_id={}, keyspace={}, cf={}, range={}, seed={}",
master.myip(), _all_live_peer_nodes, master.repair_meta_id(), _ri.keyspace, _cf_name, _range, _seed);
rlogger.debug(">>> Started Row Level Repair (Master): local={}, peers={}, repair_meta_id={}, keyspace={}, cf={}, schema_version={}, range={}, seed={}",
master.myip(), _all_live_peer_nodes, master.repair_meta_id(), _ri.keyspace, _cf_name, schema_version, _range, _seed);
try {
parallel_for_each(_all_nodes, [&, this] (const gms::inet_address& node) {
return master.repair_row_level_start(node, _ri.keyspace, _cf_name, _range).then([&] () {
return master.repair_row_level_start(node, _ri.keyspace, _cf_name, _range, schema_version).then([&] () {
return master.repair_get_estimated_partitions(node).then([this, node] (uint64_t partitions) {
rlogger.trace("Get repair_get_estimated_partitions for node={}, estimated_partitions={}", node, partitions);
_estimated_partitions += partitions;
@@ -1677,19 +1751,7 @@ public:
future<> repair_cf_range_row_level(repair_info& ri,
sstring cf_name, dht::token_range range,
const std::vector<gms::inet_address>& all_peer_nodes) {
auto all_live_peer_nodes = boost::copy_range<std::vector<gms::inet_address>>(all_peer_nodes |
boost::adaptors::filtered([] (const gms::inet_address& node) { return gms::get_local_gossiper().is_alive(node); }));
if (all_live_peer_nodes.size() != all_peer_nodes.size()) {
rlogger.warn("Repair for range={} is partial, peer nodes={}, live peer nodes={}",
range, all_peer_nodes, all_live_peer_nodes);
ri.nr_failed_ranges++;
}
if (all_live_peer_nodes.empty()) {
rlogger.info(">>> Skipped Row Level Repair (Master): local={}, peers={}, keyspace={}, cf={}, range={}",
utils::fb_utilities::get_broadcast_address(), all_peer_nodes, ri.keyspace, cf_name, range);
return make_ready_future<>();
}
return do_with(row_level_repair(ri, std::move(cf_name), std::move(range), std::move(all_live_peer_nodes)), [] (row_level_repair& repair) {
return do_with(row_level_repair(ri, std::move(cf_name), std::move(range), all_peer_nodes), [] (row_level_repair& repair) {
return repair.run();
});
}


@@ -69,19 +69,30 @@ table_schema_version schema_mutations::digest() const {
}
md5_hasher h;
db::schema_tables::feed_hash_for_schema_digest(h, _columnfamilies);
db::schema_tables::feed_hash_for_schema_digest(h, _columns);
db::schema_features sf = db::schema_features::full();
// Disable this feature so that the digest remains compatible with Scylla
// versions prior to this feature.
// This digest affects the table schema version calculation and it's important
// that all nodes arrive at the same table schema version to avoid needless schema version
// pulls. Table schema versions are calculated on boot when we don't yet
// know all the cluster features, so we could get different table versions after reboot
// in an already upgraded cluster.
sf.remove<db::schema_feature::DIGEST_INSENSITIVE_TO_EXPIRY>();
db::schema_tables::feed_hash_for_schema_digest(h, _columnfamilies, sf);
db::schema_tables::feed_hash_for_schema_digest(h, _columns, sf);
if (_view_virtual_columns && !_view_virtual_columns->partition().empty()) {
db::schema_tables::feed_hash_for_schema_digest(h, *_view_virtual_columns);
db::schema_tables::feed_hash_for_schema_digest(h, *_view_virtual_columns, sf);
}
if (_indices && !_indices->partition().empty()) {
db::schema_tables::feed_hash_for_schema_digest(h, *_indices);
db::schema_tables::feed_hash_for_schema_digest(h, *_indices, sf);
}
if (_dropped_columns && !_dropped_columns->partition().empty()) {
db::schema_tables::feed_hash_for_schema_digest(h, *_dropped_columns);
db::schema_tables::feed_hash_for_schema_digest(h, *_dropped_columns, sf);
}
if (_scylla_tables) {
db::schema_tables::feed_hash_for_schema_digest(h, *_scylla_tables);
db::schema_tables::feed_hash_for_schema_digest(h, *_scylla_tables, sf);
}
return utils::UUID_gen::get_name_UUID(h.finalize());
}
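The patch above pins the schema digest by masking out the DIGEST_INSENSITIVE_TO_EXPIRY feature before hashing, so upgraded and not-yet-upgraded nodes derive the same table schema version and avoid needless schema pulls on reboot. A toy model of the idea (md5 and the field strings are illustrative, not the real serialization):

```python
import hashlib

FEATURES = frozenset({"DIGEST_INSENSITIVE_TO_EXPIRY", "VIEW_VIRTUAL_COLUMNS"})

def schema_digest(tables, columns, features):
    # Drop the digest-affecting feature before hashing so nodes with
    # and without it enabled compute the same version.
    features = features - {"DIGEST_INSENSITIVE_TO_EXPIRY"}
    h = hashlib.md5()
    for part in (tables, columns, ",".join(sorted(features))):
        h.update(part.encode())
    return h.hexdigest()

old = schema_digest("cf-def", "col-def", FEATURES - {"DIGEST_INSENSITIVE_TO_EXPIRY"})
new = schema_digest("cf-def", "col-def", FEATURES)
```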


@@ -231,9 +231,15 @@ ar = tarfile.open(args.output, mode='w|gz')
pathlib.Path('build/SCYLLA-RELOCATABLE-FILE').touch()
ar.add('build/SCYLLA-RELOCATABLE-FILE', arcname='SCYLLA-RELOCATABLE-FILE')
ar.add('dist/redhat/python3')
ar.add('dist/debian/python3')
ar.add('build/python3/SCYLLA-RELEASE-FILE', arcname='SCYLLA-RELEASE-FILE')
ar.add('build/python3/SCYLLA-VERSION-FILE', arcname='SCYLLA-VERSION-FILE')
ar.add('build/SCYLLA-PRODUCT-FILE', arcname='SCYLLA-PRODUCT-FILE')
for p in ['python3-libs'] + packages:
pdir = pathlib.Path('/usr/share/licenses/{}/'.format(p))
if pdir.exists():
for f in pdir.glob('*'):
ar.add(f, arcname='licenses/{}/{}'.format(p, f.name))
for f in file_list:
copy_file_to_python_env(ar, f)


@@ -76,6 +76,9 @@ libs = {}
for exe in executables:
libs.update(ldd(exe))
# manually add libthread_db for debugging thread
libs.update({'libthread_db-1.0.so': '/lib64/libthread_db-1.0.so'})
ld_so = libs['ld.so']
have_gnutls = any([lib.startswith('libgnutls.so')


@@ -34,7 +34,15 @@ class FilesystemFixup:
x="$(readlink -f "$0")"
b="$(basename "$x")"
d="$(dirname "$x")"
PYTHONPATH="${{d}}:${{d}}/libexec:$PYTHONPATH" PATH="${{d}}/{pythonpath}:${{PATH}}" exec -a "$0" "${{d}}/libexec/${{b}}" "$@"
CENTOS_SSL_CERT_FILE="/etc/pki/tls/cert.pem"
if [ -f "${{CENTOS_SSL_CERT_FILE}}" ]; then
c=${{CENTOS_SSL_CERT_FILE}}
fi
DEBIAN_SSL_CERT_FILE="/etc/ssl/certs/ca-certificates.crt"
if [ -f "${{DEBIAN_SSL_CERT_FILE}}" ]; then
c=${{DEBIAN_SSL_CERT_FILE}}
fi
PYTHONPATH="${{d}}:${{d}}/libexec:$PYTHONPATH" PATH="${{d}}/{pythonpath}:${{PATH}}" SSL_CERT_FILE="${{c}}" exec -a "$0" "${{d}}/libexec/${{b}}" "$@"
'''
self.python_path = python_path
self.installroot = installroot

Submodule seastar updated: 4cdccae53b...82637dcab4


@@ -30,11 +30,24 @@ using namespace seastar;
namespace service {
class cache_hitrate_calculator : public seastar::async_sharded_service<cache_hitrate_calculator> {
struct stat {
float h = 0;
float m = 0;
stat& operator+=(stat& o) {
h += o.h;
m += o.m;
return *this;
}
};
seastar::sharded<database>& _db;
seastar::sharded<cache_hitrate_calculator>& _me;
timer<lowres_clock> _timer;
bool _stopped = false;
float _diff = 0;
std::unordered_map<utils::UUID, stat> _rates;
size_t _slen = 0;
std::string _gstate;
future<> _done = make_ready_future();
future<lowres_clock::duration> recalculate_hitrates();


@@ -82,12 +82,16 @@ future<> migration_manager::stop()
void migration_manager::init_messaging_service()
{
auto& ss = service::get_local_storage_service();
_feature_listeners.push_back(ss.cluster_supports_view_virtual_columns().when_enabled([this, &ss] {
auto update_schema = [this, &ss] {
with_gate(_background_tasks, [this, &ss] {
mlogger.debug("view_virtual_columns feature enabled, recalculating schema version");
mlogger.debug("features changed, recalculating schema version");
return update_schema_version(get_storage_proxy(), ss.cluster_schema_features());
});
}));
};
_feature_listeners.push_back(ss.cluster_supports_view_virtual_columns().when_enabled(update_schema));
_feature_listeners.push_back(ss.cluster_supports_digest_insensitive_to_expiry().when_enabled(update_schema));
auto& ms = netw::get_local_messaging_service();
ms.register_definitions_update([this] (const rpc::client_info& cinfo, std::vector<frozen_mutation> m) {
@@ -992,4 +996,22 @@ future<schema_ptr> get_schema_for_write(table_schema_version v, netw::messaging_
});
}
future<> migration_manager::sync_schema(const database& db, const std::vector<gms::inet_address>& nodes) {
using schema_and_hosts = std::unordered_map<utils::UUID, std::vector<gms::inet_address>>;
return do_with(schema_and_hosts(), db.get_version(), [this, &nodes] (schema_and_hosts& schema_map, utils::UUID& my_version) {
return parallel_for_each(nodes, [this, &schema_map, &my_version] (const gms::inet_address& node) {
return netw::get_messaging_service().local().send_schema_check(netw::msg_addr(node)).then([node, &schema_map, &my_version] (utils::UUID remote_version) {
if (my_version != remote_version) {
schema_map[remote_version].emplace_back(node);
}
});
}).then([this, &schema_map] {
return parallel_for_each(schema_map, [this] (auto& x) {
mlogger.debug("Pulling schema {} from {}", x.first, x.second.front());
return submit_migration_task(x.second.front());
});
});
});
}
}
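The new sync_schema() above first asks every node for its schema version, groups the nodes whose version differs from the local one, and then pulls each distinct foreign version once, from the first node that reported it. That grouping step can be sketched as (synchronous Python standing in for the parallel_for_each/RPC plumbing):

```python
def plan_schema_pulls(my_version, remote_versions):
    """remote_versions: {node: schema_version} as returned by the
    schema-check RPC. Returns {foreign_version: node_to_pull_from},
    one pull per distinct version that differs from ours."""
    by_version = {}
    for node, ver in remote_versions.items():
        if ver != my_version:
            by_version.setdefault(ver, []).append(node)
    # Pull each foreign version from the first node that has it.
    return {ver: nodes[0] for ver, nodes in by_version.items()}
```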


@@ -75,6 +75,9 @@ public:
future<> submit_migration_task(const gms::inet_address& endpoint);
// Makes sure that this node knows about all schema changes known by "nodes" that were made prior to this call.
future<> sync_schema(const database& db, const std::vector<gms::inet_address>& nodes);
// Fetches schema from remote node and applies it locally.
// Differs from submit_migration_task() in that all errors are propagated.
// Coalesces requests.


@@ -113,16 +113,6 @@ void cache_hitrate_calculator::run_on(size_t master, lowres_clock::duration d) {
}
future<lowres_clock::duration> cache_hitrate_calculator::recalculate_hitrates() {
struct stat {
float h = 0;
float m = 0;
stat& operator+=(stat& o) {
h += o.h;
m += o.m;
return *this;
}
};
auto non_system_filter = [&] (const std::pair<utils::UUID, lw_shared_ptr<column_family>>& cf) {
return _db.local().find_keyspace(cf.second->schema()->ks_name()).get_replication_strategy().get_type() != locator::replication_strategy_type::local;
};
@@ -144,15 +134,18 @@ future<lowres_clock::duration> cache_hitrate_calculator::recalculate_hitrates()
return _db.map_reduce0(cf_to_cache_hit_stats, std::unordered_map<utils::UUID, stat>(), sum_stats_per_cf).then([this, non_system_filter] (std::unordered_map<utils::UUID, stat> rates) mutable {
_diff = 0;
_gstate.reserve(_slen); // assume length did not change from previous iteration
_slen = 0;
_rates = std::move(rates);
// set calculated rates on all shards
return _db.invoke_on_all([this, rates = std::move(rates), cpuid = engine().cpu_id(), non_system_filter] (database& db) {
sstring gstate;
for (auto& cf : db.get_column_families() | boost::adaptors::filtered(non_system_filter)) {
auto it = rates.find(cf.first);
if (it == rates.end()) { // a table may be added before map/reduce compltes and this code runs
continue;
return _db.invoke_on_all([this, cpuid = engine().cpu_id(), non_system_filter] (database& db) {
return do_for_each(_rates, [this, cpuid, &db] (auto&& r) mutable {
auto it = db.get_column_families().find(r.first);
if (it == db.get_column_families().end()) { // a table may be added before map/reduce completes and this code runs
return;
}
stat s = it->second;
auto& cf = *it;
stat& s = r.second;
float rate = 0;
if (s.h) {
rate = s.h / (s.h + s.m);
@@ -160,24 +153,25 @@ future<lowres_clock::duration> cache_hitrate_calculator::recalculate_hitrates()
if (engine().cpu_id() == cpuid) {
// calculate max difference between old rate and new one for all cfs
_diff = std::max(_diff, std::abs(float(cf.second->get_global_cache_hit_rate()) - rate));
gstate += format("{}.{}:{:f};", cf.second->schema()->ks_name(), cf.second->schema()->cf_name(), rate);
_gstate += format("{}.{}:{:0.6f};", cf.second->schema()->ks_name(), cf.second->schema()->cf_name(), rate);
}
cf.second->set_global_cache_hit_rate(cache_temperature(rate));
}
if (gstate.size()) {
auto& g = gms::get_local_gossiper();
auto& ss = get_local_storage_service();
return g.add_local_application_state(gms::application_state::CACHE_HITRATES, ss.value_factory.cache_hitrates(std::move(gstate)));
}
return make_ready_future<>();
});
});
}).then([this] {
auto& g = gms::get_local_gossiper();
auto& ss = get_local_storage_service();
_slen = _gstate.size();
g.add_local_application_state(gms::application_state::CACHE_HITRATES, ss.value_factory.cache_hitrates(_gstate));
// if max difference during this round is big schedule next recalculate earlier
if (_diff < 0.01) {
return std::chrono::milliseconds(2000);
} else {
return std::chrono::milliseconds(500);
}
}).finally([this] {
_gstate = std::string(); // free memory, do not trust clear() to do that for string
_rates.clear();
});
}
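Two small formulas drive the loop above: each table's cache hit rate is hits/(hits+misses), defaulting to zero when there were no hits, and the next recalculation is scheduled sooner when any table's rate moved a lot since the last round. A direct Python restatement of those two rules (the 0.01 threshold and 2000/500 ms intervals are taken from the hunk):

```python
def hit_rate(hits, misses):
    # Matches recalculate_hitrates(): zero when there were no hits at all.
    return hits / (hits + misses) if hits else 0.0

def next_interval_ms(max_diff):
    # A large swing in any table's rate means the workload is shifting,
    # so recalculate sooner; otherwise back off.
    return 2000 if max_diff < 0.01 else 500
```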


@@ -1097,6 +1097,22 @@ future<> storage_proxy::mutate_begin(std::vector<unique_response_handler> ids, d
std::optional<clock_type::time_point> timeout_opt) {
return parallel_for_each(ids, [this, cl, timeout_opt] (unique_response_handler& protected_response) {
auto response_id = protected_response.id;
// This function, mutate_begin(), is called after a preemption point
// so it's possible that other code besides our caller just ran. In
// particular, Scylla may have noticed that a remote node went down,
// called storage_proxy::on_down(), and removed some of the ongoing
// handlers, including this id. If this happens, we need to ignore
// this id - not try to look it up or start a send.
if (_response_handlers.find(response_id) == _response_handlers.end()) {
protected_response.release(); // Don't try to remove this id again
// Requests that time-out normally below after response_wait()
// result in an exception (see ~abstract_write_response_handler())
// However, here we no longer have the handler or its information
// to put in the exception. The exception is not needed for
// correctness (e.g., hints are written by timeout_cb(), not
// because of an exception here).
return make_exception_future<>(std::runtime_error("unstarted write cancelled"));
}
// it is better to send first and hint afterwards to reduce latency
// but request may complete before hint_to_dead_endpoints() is called and
// response_id handler will be removed, so we will have to do hint with separate


@@ -110,6 +110,7 @@ static const sstring TRUNCATION_TABLE = "TRUNCATION_TABLE";
static const sstring CORRECT_STATIC_COMPACT_IN_MC = "CORRECT_STATIC_COMPACT_IN_MC";
static const sstring UNBOUNDED_RANGE_TOMBSTONES_FEATURE = "UNBOUNDED_RANGE_TOMBSTONES";
static const sstring VIEW_VIRTUAL_COLUMNS = "VIEW_VIRTUAL_COLUMNS";
static const sstring DIGEST_INSENSITIVE_TO_EXPIRY = "DIGEST_INSENSITIVE_TO_EXPIRY";
static const sstring SSTABLE_FORMAT_PARAM_NAME = "sstable_format";
@@ -162,6 +163,7 @@ storage_service::storage_service(distributed<database>& db, gms::gossiper& gossi
, _correct_static_compact_in_mc(_feature_service, CORRECT_STATIC_COMPACT_IN_MC)
, _unbounded_range_tombstones_feature(_feature_service, UNBOUNDED_RANGE_TOMBSTONES_FEATURE)
, _view_virtual_columns(_feature_service, VIEW_VIRTUAL_COLUMNS)
, _digest_insensitive_to_expiry(_feature_service, DIGEST_INSENSITIVE_TO_EXPIRY)
, _la_feature_listener(*this, _feature_listeners_sem, sstables::sstable_version_types::la)
, _mc_feature_listener(*this, _feature_listeners_sem, sstables::sstable_version_types::mc)
, _replicate_action([this] { return do_replicate_to_all_cores(); })
@@ -208,6 +210,7 @@ void storage_service::enable_all_features() {
std::ref(_correct_static_compact_in_mc),
std::ref(_unbounded_range_tombstones_feature),
std::ref(_view_virtual_columns),
std::ref(_digest_insensitive_to_expiry),
})
{
if (features.count(f.name())) {
@@ -311,6 +314,7 @@ std::set<sstring> storage_service::get_config_supported_features_set() {
TRUNCATION_TABLE,
CORRECT_STATIC_COMPACT_IN_MC,
VIEW_VIRTUAL_COLUMNS,
DIGEST_INSENSITIVE_TO_EXPIRY,
};
// Do not respect config in the case database is not started
@@ -3479,6 +3483,7 @@ void storage_service::notify_cql_change(inet_address endpoint, bool ready)
db::schema_features storage_service::cluster_schema_features() const {
db::schema_features f;
f.set_if<db::schema_feature::VIEW_VIRTUAL_COLUMNS>(bool(_view_virtual_columns));
f.set_if<db::schema_feature::DIGEST_INSENSITIVE_TO_EXPIRY>(bool(_digest_insensitive_to_expiry));
return f;
}


@@ -323,6 +323,7 @@ private:
gms::feature _correct_static_compact_in_mc;
gms::feature _unbounded_range_tombstones_feature;
gms::feature _view_virtual_columns;
gms::feature _digest_insensitive_to_expiry;
sstables::sstable_version_types _sstables_format = sstables::sstable_version_types::ka;
seastar::semaphore _feature_listeners_sem = {1};
@@ -2338,6 +2339,9 @@ public:
const gms::feature& cluster_supports_view_virtual_columns() const {
return _view_virtual_columns;
}
const gms::feature& cluster_supports_digest_insensitive_to_expiry() const {
return _digest_insensitive_to_expiry;
}
// Returns schema features which all nodes in the cluster advertise as supported.
db::schema_features cluster_schema_features() const;
private:


@@ -104,16 +104,6 @@ static bool belongs_to_current_node(const dht::token& t, const dht::token_range_
return false;
}
static void delete_sstables_for_interrupted_compaction(std::vector<shared_sstable>& new_sstables, sstring& ks, sstring& cf) {
// Delete either partially or fully written sstables of a compaction that
// was either stopped abruptly (e.g. out of disk space) or deliberately
// (e.g. nodetool stop COMPACTION).
for (auto& sst : new_sstables) {
clogger.debug("Deleting sstable {} of interrupted compaction for {}.{}", sst->get_filename(), ks, cf);
sst->mark_for_deletion();
}
}
static std::vector<shared_sstable> get_uncompacting_sstables(column_family& cf, std::vector<shared_sstable> sstables) {
auto all_sstables = boost::copy_range<std::vector<shared_sstable>>(*cf.get_sstables_including_compacted_undeleted());
boost::sort(all_sstables, [] (const shared_sstable& x, const shared_sstable& y) {
@@ -317,6 +307,9 @@ protected:
column_family& _cf;
schema_ptr _schema;
std::vector<shared_sstable> _sstables;
// Unused sstables are tracked because if compaction is interrupted we can only delete them.
// Deleting used sstables could potentially result in data loss.
std::vector<shared_sstable> _new_unused_sstables;
lw_shared_ptr<sstable_set> _compacting;
uint64_t _max_sstable_size;
uint32_t _sstable_level;
@@ -347,6 +340,7 @@ protected:
void setup_new_sstable(shared_sstable& sst) {
_info->new_sstables.push_back(sst);
_new_unused_sstables.push_back(sst);
sst->get_metadata_collector().set_replay_position(_rp);
sst->get_metadata_collector().sstable_level(_sstable_level);
for (auto ancestor : _ancestors) {
@@ -488,6 +482,16 @@ private:
const schema_ptr& schema() const {
return _schema;
}
void delete_sstables_for_interrupted_compaction() {
// Delete either partially or fully written sstables of a compaction that
// was either stopped abruptly (e.g. out of disk space) or deliberately
// (e.g. nodetool stop COMPACTION).
for (auto& sst : _new_unused_sstables) {
clogger.debug("Deleting sstable {} of interrupted compaction for {}.{}", sst->get_filename(), _info->ks_name, _info->cf_name);
sst->mark_for_deletion();
}
}
public:
static future<compaction_info> run(std::unique_ptr<compaction> c);
@@ -521,7 +525,6 @@ void compacting_sstable_writer::consume_end_of_stream() {
class regular_compaction : public compaction {
std::function<shared_sstable()> _creator;
replacer_fn _replacer;
std::vector<shared_sstable> _unreplaced_new_tables;
std::unordered_set<shared_sstable> _compacting_for_max_purgeable_func;
// store a clone of sstable set for column family, which needs to be alive for incremental selector.
sstable_set _set;
@@ -625,8 +628,6 @@ private:
}
void maybe_replace_exhausted_sstables() {
_unreplaced_new_tables.push_back(_sst);
// Replace exhausted sstable(s), if any, by new one(s) in the column family.
auto not_exhausted = [s = _schema, &dk = _sst->get_last_decorated_key()] (shared_sstable& sst) {
return sst->get_last_decorated_key().tri_compare(*s, dk) > 0;
@@ -668,7 +669,7 @@ private:
_compacting->erase(sst);
_monitor_generator.remove_sstable(_info->tracking, sst);
});
_replacer(std::vector<shared_sstable>(exhausted, _sstables.end()), std::move(_unreplaced_new_tables));
_replacer(std::vector<shared_sstable>(exhausted, _sstables.end()), std::move(_new_unused_sstables));
_sstables.erase(exhausted, _sstables.end());
}
}
@@ -677,7 +678,7 @@ private:
if (!_sstables.empty()) {
std::vector<shared_sstable> sstables_compacted;
std::move(_sstables.begin(), _sstables.end(), std::back_inserter(sstables_compacted));
_replacer(std::move(sstables_compacted), std::move(_unreplaced_new_tables));
_replacer(std::move(sstables_compacted), std::move(_new_unused_sstables));
}
}
@@ -877,7 +878,7 @@ future<compaction_info> compaction::run(std::unique_ptr<compaction> c) {
auto r = std::move(reader);
r.consume_in_thread(std::move(cfc), c->filter_func(), db::no_timeout);
} catch (...) {
delete_sstables_for_interrupted_compaction(c->_info->new_sstables, c->_info->ks_name, c->_info->cf_name);
c->delete_sstables_for_interrupted_compaction();
c = nullptr; // make sure writers are stopped while running in thread context
throw;
}


@@ -479,11 +479,6 @@ public:
auto itw = writes_per_window.find(bound);
if (itw != writes_per_window.end()) {
ow_this_window = &itw->second;
// We will erase here so we can keep track of which
// writes belong to existing windows. Writes that don't belong to any window
// are writes in progress to new windows and will be accounted in the final
// loop before we return
writes_per_window.erase(itw);
}
auto* oc_this_window = &no_oc;
auto itc = compactions_per_window.find(bound);
@@ -491,6 +486,13 @@ public:
oc_this_window = &itc->second;
}
b += windows.second.backlog(*ow_this_window, *oc_this_window);
if (itw != writes_per_window.end()) {
// We will erase here so we can keep track of which
// writes belong to existing windows. Writes that don't belong to any window
// are writes in progress to new windows and will be accounted in the final
// loop before we return
writes_per_window.erase(itw);
}
}
// Partial writes that don't belong to any window are accounted here.


@@ -533,7 +533,7 @@ private:
shard_id _shard; // Specifies which shard the new SStable will belong to.
bool _compression_enabled = false;
std::unique_ptr<file_writer> _data_writer;
std::optional<file_writer> _index_writer;
std::unique_ptr<file_writer> _index_writer;
bool _tombstone_written = false;
bool _static_row_written = false;
// The length of partition header (partition key, partition deletion and static row, if present)
@@ -592,6 +592,10 @@ private:
bool _write_regular_as_static; // See #4139
void init_file_writers();
// Returns the closed writer
std::unique_ptr<file_writer> close_writer(std::unique_ptr<file_writer>& w);
void close_data_writer();
void ensure_tombstone_is_written() {
if (!_tombstone_written) {
@@ -676,7 +680,7 @@ private:
// Writes single atomic cell
void write_cell(bytes_ostream& writer, const clustering_key_prefix* clustering_key, atomic_cell_view cell, const column_definition& cdef,
const row_time_properties& properties, bytes_view cell_path = {});
const row_time_properties& properties, std::optional<bytes_view> cell_path = {});
// Writes information about row liveness (formerly 'row marker')
void write_liveness_info(bytes_ostream& writer, const row_marker& marker);
@@ -836,13 +840,17 @@ void writer::init_file_writers() {
&_sst._components->compression,
_schema.get_compressor_params()));
}
_index_writer.emplace(std::move(_sst._index_file), options);
_index_writer = std::make_unique<file_writer>(std::move(_sst._index_file), options);
}
std::unique_ptr<file_writer> writer::close_writer(std::unique_ptr<file_writer>& w) {
auto writer = std::move(w);
writer->close();
return writer;
}
void writer::close_data_writer() {
auto writer = std::move(_data_writer);
writer->close();
auto writer = close_writer(_data_writer);
if (!_compression_enabled) {
auto chksum_wr = static_cast<crc32_checksummed_file_writer*>(writer.get());
_sst.write_digest(chksum_wr->full_checksum());
@@ -970,7 +978,7 @@ void writer::consume(tombstone t) {
}
void writer::write_cell(bytes_ostream& writer, const clustering_key_prefix* clustering_key, atomic_cell_view cell,
const column_definition& cdef, const row_time_properties& properties, bytes_view cell_path) {
const column_definition& cdef, const row_time_properties& properties, std::optional<bytes_view> cell_path) {
uint64_t current_pos = writer.size();
bool is_deleted = !cell.is_live();
@@ -983,7 +991,7 @@ void writer::write_cell(bytes_ostream& writer, const clustering_key_prefix* clus
properties.local_deletion_time == cell.deletion_time();
cell_flags flags = cell_flags::none;
if (!has_value) {
if ((!has_value && !cdef.is_counter()) || is_deleted) {
flags |= cell_flags::has_empty_value_mask;
}
if (is_deleted) {
@@ -1012,20 +1020,22 @@ void writer::write_cell(bytes_ostream& writer, const clustering_key_prefix* clus
}
}
if (!cell_path.empty()) {
write_vint(writer, cell_path.size());
write(_sst.get_version(), writer, cell_path);
if (bool(cell_path)) {
write_vint(writer, cell_path->size());
write(_sst.get_version(), writer, *cell_path);
}
if (has_value) {
if (cdef.is_counter()) {
if (cdef.is_counter()) {
if (!is_deleted) {
assert(!cell.is_counter_update());
counter_cell_view::with_linearized(cell, [&] (counter_cell_view ccv) {
write_counter_value(ccv, writer, sstable_version_types::mc, [] (bytes_ostream& out, uint32_t value) {
return write_vint(out, value);
});
});
} else {
}
} else {
if (has_value) {
write_cell_value(writer, *cdef.type, cell.value());
}
}
@@ -1382,8 +1392,7 @@ void writer::consume_end_of_stream() {
_sst.get_metadata_collector().add_compression_ratio(_sst._components->compression.compressed_file_length(), _sst._components->compression.uncompressed_file_length());
}
_index_writer->close();
_index_writer.reset();
close_writer(_index_writer);
_sst.set_first_and_last_keys();
_sst._components->statistics.contents[metadata_type::Serialization] = std::make_unique<serialization_header>(std::move(_sst_schema.header));


@@ -44,6 +44,14 @@ namespace sstables {
atomic_cell make_counter_cell(api::timestamp_type timestamp, bytes_view value) {
static constexpr size_t shard_size = 32;
if (value.empty()) {
// This will never happen in a correct MC sstable but
// we had a bug #4363 that caused empty counters
// to be incorrectly stored inside sstables.
counter_cell_builder ccb;
return ccb.build(timestamp);
}
data_input in(value);
auto header_size = in.read<int16_t>();
@@ -59,8 +67,6 @@ atomic_cell make_counter_cell(api::timestamp_type timestamp, bytes_view value) {
throw marshal_exception("encountered remote shards in a counter cell");
}
std::vector<counter_shard> shards;
shards.reserve(shard_count);
counter_cell_builder ccb(shard_count);
for (auto i = 0u; i < shard_count; i++) {
auto id_hi = in.read<int64_t>();


@@ -3146,7 +3146,7 @@ sstable::unlink()
});
name = get_filename();
auto update_large_data_fut = get_large_data_handler().maybe_delete_large_data_entries(*get_schema(), std::move(name), data_size())
auto update_large_data_fut = get_large_data_handler().maybe_delete_large_data_entries(*get_schema(), name, data_size())
.then_wrapped([name = std::move(name)] (future<> f) {
if (f.failed()) {
// Just log and ignore failures to delete large data entries.


@@ -458,7 +458,8 @@ enum sstable_feature : uint8_t {
NonCompoundRangeTombstones = 1, // See #2986
ShadowableTombstones = 2, // See #3885
CorrectStaticCompact = 3, // See #4139
End = 4,
CorrectEmptyCounters = 4, // See #4363
End = 5,
};
// Scylla-specific features enabled for a particular sstable.


@@ -105,6 +105,21 @@ struct send_info {
, prs(to_partition_ranges(ranges))
, reader(cf.make_streaming_reader(cf.schema(), prs)) {
}
future<bool> has_relevant_range_on_this_shard() {
return do_with(false, [this] (bool& found_relevant_range) {
return do_for_each(ranges, [this, &found_relevant_range] (dht::token_range range) {
if (!found_relevant_range) {
auto sharder = dht::selective_token_range_sharder(range, engine().cpu_id());
auto range_shard = sharder.next();
if (range_shard) {
found_relevant_range = true;
}
}
}).then([&found_relevant_range] {
return found_relevant_range;
});
});
}
future<size_t> estimate_partitions() {
return do_with(cf.get_sstables(), size_t(0), [this] (auto& sstables, size_t& partition_count) {
return do_for_each(*sstables, [this, &partition_count] (auto& sst) {
@@ -222,11 +237,18 @@ future<> stream_transfer_task::execute() {
auto reason = session->get_reason();
return session->get_db().invoke_on_all([plan_id, cf_id, id, dst_cpu_id, ranges=this->_ranges, streaming_with_rpc_stream, reason] (database& db) {
auto si = make_lw_shared<send_info>(db, plan_id, cf_id, std::move(ranges), id, dst_cpu_id, reason);
if (streaming_with_rpc_stream) {
return send_mutation_fragments(std::move(si));
} else {
return send_mutations(std::move(si));
}
return si->has_relevant_range_on_this_shard().then([si, plan_id, cf_id, streaming_with_rpc_stream] (bool has_relevant_range_on_this_shard) {
if (!has_relevant_range_on_this_shard) {
sslog.debug("[Stream #{}] stream_transfer_task: cf_id={}: ignore ranges on shard={}",
plan_id, cf_id, engine().cpu_id());
return make_ready_future<>();
}
if (streaming_with_rpc_stream) {
return send_mutation_fragments(std::move(si));
} else {
return send_mutations(std::move(si));
}
});
}).then([this, plan_id, cf_id, id, streaming_with_rpc_stream] {
sslog.debug("[Stream #{}] SEND STREAM_MUTATION_DONE to {}, cf_id={}", plan_id, id, cf_id);
return session->ms().send_stream_mutation_done(id, plan_id, _ranges,


@@ -510,8 +510,8 @@ table::make_streaming_reader(schema_ptr s,
return make_flat_multi_range_reader(s, std::move(source), ranges, slice, pc, nullptr, mutation_reader::forwarding::no);
}
flat_mutation_reader table::make_streaming_reader(schema_ptr schema, const dht::partition_range& range, mutation_reader::forwarding fwd_mr) const {
const auto& slice = schema->full_slice();
flat_mutation_reader table::make_streaming_reader(schema_ptr schema, const dht::partition_range& range,
const query::partition_slice& slice, mutation_reader::forwarding fwd_mr) const {
const auto& pc = service::get_local_streaming_read_priority();
auto trace_state = tracing::trace_state_ptr();
const auto fwd = streamed_mutation::forwarding::no;
@@ -1029,6 +1029,7 @@ table::reshuffle_sstables(std::set<int64_t> all_generations, int64_t start) {
};
return do_with(work(start, std::move(all_generations)), [this] (work& work) {
tlogger.info("Reshuffling SSTables in {}...", _config.datadir);
return lister::scan_dir(_config.datadir, { directory_entry_type::regular }, [this, &work] (fs::path parent_dir, directory_entry de) {
auto comps = sstables::entry_descriptor::make_descriptor(parent_dir.native(), de.name);
if (comps.component != component_type::TOC) {
@@ -1956,6 +1957,8 @@ future<int64_t>
table::disable_sstable_write() {
_sstable_writes_disabled_at = std::chrono::steady_clock::now();
return _sstables_lock.write_lock().then([this] {
// _sstable_deletion_sem must be acquired after _sstables_lock.write_lock
return _sstable_deletion_sem.wait().then([this] {
if (_sstables->all()->empty()) {
return make_ready_future<int64_t>(0);
}
@@ -1964,9 +1967,18 @@ table::disable_sstable_write() {
max = std::max(max, s->generation());
}
return make_ready_future<int64_t>(max);
});
});
}
std::chrono::steady_clock::duration table::enable_sstable_write(int64_t new_generation) {
if (new_generation != -1) {
update_sstables_known_generation(new_generation);
}
_sstable_deletion_sem.signal();
_sstables_lock.write_unlock();
return std::chrono::steady_clock::now() - _sstable_writes_disabled_at;
}
void table::set_schema(schema_ptr s) {
tlogger.debug("Changing schema version of {}.{} ({}) from {} to {}",


@@ -178,3 +178,14 @@ rows_assertions rows_assertions::with_serialized_columns_count(size_t columns_co
}
return {*this};
}
shared_ptr<cql_transport::messages::result_message> cquery_nofail(
cql_test_env& env, const char* query, const std::experimental::source_location& loc) {
try {
return env.execute_cql(query).get0();
} catch (...) {
BOOST_FAIL(format("query '{}' failed: {}\n{}:{}: originally from here",
query, std::current_exception(), loc.file_name(), loc.line()));
}
return shared_ptr<cql_transport::messages::result_message>(nullptr);
}


@@ -22,8 +22,10 @@
#pragma once
#include "tests/cql_test_env.hh"
#include "transport/messages/result_message_base.hh"
#include "bytes.hh"
#include <experimental/source_location>
#include <seastar/core/shared_ptr.hh>
#include <seastar/core/future.hh>
@@ -76,3 +78,12 @@ void assert_that_failed(future<T...>&& f)
catch (...) {
}
}
/// Invokes env.execute_cql(query), awaits its result, and returns it. If an exception is thrown,
/// invokes BOOST_FAIL with useful diagnostics.
///
/// \note Should be called from a seastar::thread context, as it awaits the CQL result.
shared_ptr<cql_transport::messages::result_message> cquery_nofail(
cql_test_env& env,
const char* query,
const std::experimental::source_location& loc = std::experimental::source_location::current());


@@ -2481,8 +2481,8 @@ SEASTAR_TEST_CASE(test_in_restriction) {
return e.execute_cql("select r1 from tir2 where (c1,r1) in ((0, 1),(1,2),(0,1),(1,2),(3,3)) ALLOW FILTERING;");
}).then([&e] (shared_ptr<cql_transport::messages::result_message> msg) {
assert_that(msg).is_rows().with_rows({
{int32_type->decompose(1)},
{int32_type->decompose(2)},
{int32_type->decompose(1), int32_type->decompose(0)},
{int32_type->decompose(2), int32_type->decompose(1)},
});
}).then([&e] {
return e.prepare("select r1 from tir2 where (c1,r1) in ? ALLOW FILTERING;");
@@ -2507,8 +2507,8 @@ SEASTAR_TEST_CASE(test_in_restriction) {
return e.execute_prepared(prepared_id,raw_values);
}).then([&e] (shared_ptr<cql_transport::messages::result_message> msg) {
assert_that(msg).is_rows().with_rows({
{int32_type->decompose(1)},
{int32_type->decompose(2)},
{int32_type->decompose(1), int32_type->decompose(0)},
{int32_type->decompose(2), int32_type->decompose(1)},
});
});
});


@@ -791,6 +791,13 @@ SEASTAR_TEST_CASE(test_allow_filtering_with_secondary_index) {
}
});
});
eventually([&] {
auto msg = cquery_nofail(e, "SELECT SUM(e) FROM t WHERE c = 5 AND b = 911 ALLOW FILTERING;");
assert_that(msg).is_rows().with_rows({{ int32_type->decompose(0), {} }});
msg = cquery_nofail(e, "SELECT e FROM t WHERE c = 5 AND b = 3 ALLOW FILTERING;");
assert_that(msg).is_rows().with_rows({{ int32_type->decompose(9), int32_type->decompose(3) }});
});
});
}


@@ -184,6 +184,7 @@ using stats_values = std::tuple<
double, // frags_per_second mad
double, // frags_per_second max
double, // frags_per_second min
double, // average aio
uint64_t, // aio
uint64_t, // kb
uint64_t, // blocked
@@ -223,6 +224,7 @@ std::array<sstring, std::tuple_size<stats_values>::value> stats_formats =
"{:.0f}",
"{:.0f}",
"{:.0f}",
"{:.1f}", // average aio
"{}",
"{}",
"{}",
@@ -573,6 +575,7 @@ private:
double _fragment_rate_mad = 0;
double _fragment_rate_max = 0;
double _fragment_rate_min = 0;
double _average_aio_operations = 0;
sstring_vec _params;
std::optional<sstring> _error;
public:
@@ -602,6 +605,9 @@ public:
_fragment_rate_min = min;
}
double average_aio_operations() const { return _average_aio_operations; }
void set_average_aio_operations(double aio) { _average_aio_operations = aio; }
uint64_t aio_reads() const { return after.io.aio_reads - before.io.aio_reads; }
uint64_t aio_read_bytes() const { return after.io.aio_read_bytes - before.io.aio_read_bytes; }
uint64_t aio_writes() const { return after.io.aio_writes - before.io.aio_writes; }
@@ -632,6 +638,7 @@ public:
{"mad f/s", "{:>10}"},
{"max f/s", "{:>10}"},
{"min f/s", "{:>10}"},
{"avg aio", "{:>10}"},
{"aio", "{:>6}"},
{"(KiB)", "{:>10}"},
{"blocked", "{:>7}"},
@@ -661,6 +668,7 @@ public:
fragment_rate_mad(),
fragment_rate_max(),
fragment_rate_min(),
average_aio_operations(),
aio_reads() + aio_writes(),
(aio_read_bytes() + aio_written_bytes()) / 1024,
reads_blocked(),
@@ -1138,6 +1146,11 @@ public:
void done() {
for (auto&& result : results) {
print_all(result);
auto average_aio = boost::accumulate(result, 0., [&] (double a, const test_result& b) {
return a + b.aio_reads() + b.aio_writes();
}) / (result.empty() ? 1 : result.size());
boost::sort(result, [] (const test_result& a, const test_result& b) {
return a.fragment_rate() < b.fragment_rate();
});
@@ -1153,6 +1166,7 @@ public:
auto fragment_rate_mad = deviation[deviation.size() / 2];
median.set_fragment_rate_stats(fragment_rate_mad, fragment_rate_max, fragment_rate_min);
median.set_iteration_count(result.size());
median.set_average_aio_operations(average_aio);
print(median);
}
}


@@ -34,6 +34,8 @@
#include "schema_registry.hh"
#include "types/list.hh"
#include "types/user.hh"
#include "db/config.hh"
#include "tmpdir.hh"
SEASTAR_TEST_CASE(test_new_schema_with_no_structural_change_is_propagated) {
return do_with_cql_env([](cql_test_env& e) {
@@ -505,3 +507,83 @@ SEASTAR_TEST_CASE(test_prepared_statement_is_invalidated_by_schema_change) {
});
});
}
// We don't want schema digest to change between Scylla versions because that results in a schema disagreement
// during rolling upgrade.
SEASTAR_TEST_CASE(test_schema_digest_does_not_change) {
using namespace db;
using namespace db::schema_tables;
auto tmp = tmpdir();
const bool regenerate = false;
sstring data_dir = "./tests/sstables/schema_digest_test";
db::config db_cfg;
if (regenerate) {
db_cfg.data_file_directories({data_dir}, db::config::config_source::CommandLine);
} else {
fs::copy(std::string(data_dir), std::string(tmp.path().string()), fs::copy_options::recursive);
db_cfg.data_file_directories({tmp.path().string()}, db::config::config_source::CommandLine);
}
return do_with_cql_env_thread([regenerate](cql_test_env& e) {
if (regenerate) {
// Exercise many different kinds of schema changes.
e.execute_cql(
"create keyspace tests with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };").get();
e.execute_cql("create table tests.table1 (pk int primary key, c1 int, c2 int);").get();
e.execute_cql("create type tests.basic_info (c1 timestamp, v2 text);").get();
e.execute_cql("create index on tests.table1 (c1);").get();
e.execute_cql("create table ks.tbl (a int, b int, c float, PRIMARY KEY (a))").get();
e.execute_cql(
"create materialized view ks.tbl_view AS SELECT c FROM ks.tbl WHERE c IS NOT NULL PRIMARY KEY (c, a)").get();
e.execute_cql(
"create materialized view ks.tbl_view_2 AS SELECT a FROM ks.tbl WHERE a IS NOT NULL PRIMARY KEY (a)").get();
e.execute_cql(
"create keyspace tests2 with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };").get();
e.execute_cql("drop keyspace tests2;").get();
}
auto expect_digest = [&] (schema_features sf, utils::UUID expected) {
auto actual = calculate_schema_digest(service::get_storage_proxy(), sf).get0();
if (regenerate) {
std::cout << "Digest is " << actual << "\n";
} else {
BOOST_REQUIRE_EQUAL(actual, expected);
}
};
auto expect_version = [&] (sstring ks_name, sstring cf_name, utils::UUID expected) {
auto actual = e.local_db().find_column_family(ks_name, cf_name).schema()->version();
if (regenerate) {
std::cout << "Version of " << ks_name << "." << cf_name << " is " << actual << "\n";
} else {
BOOST_REQUIRE_EQUAL(actual, expected);
}
};
schema_features sf = schema_features::of<schema_feature::DIGEST_INSENSITIVE_TO_EXPIRY>();
expect_digest(sf, utils::UUID("492719e5-0169-30b1-a15e-3447674c0c0c"));
sf.set<schema_feature::VIEW_VIRTUAL_COLUMNS>();
expect_digest(sf, utils::UUID("be3c0af4-417f-31d5-8e0e-4ac257ec00ad"));
expect_digest(schema_features::full(), utils::UUID("be3c0af4-417f-31d5-8e0e-4ac257ec00ad"));
// Causes tombstones to become expired
// This is in order to test that schema disagreement doesn't form due to expired tombstones being collected
// Refs https://github.com/scylladb/scylla/issues/4485
forward_jump_clocks(std::chrono::seconds(60*60*24*31));
expect_digest(schema_features::full(), utils::UUID("be3c0af4-417f-31d5-8e0e-4ac257ec00ad"));
// FIXME: schema_mutations::digest() is still sensitive to expiry, so we can check versions only after forward_jump_clocks()
// otherwise the results would not be stable.
expect_version("tests", "table1", utils::UUID("4198e26c-f214-3888-9c49-c396eb01b8d7"));
expect_version("ks", "tbl", utils::UUID("5c9cadec-e5df-357e-81d0-0261530af64b"));
expect_version("ks", "tbl_view", utils::UUID("1d91ad22-ea7c-3e7f-9557-87f0f3bb94d7"));
expect_version("ks", "tbl_view_2", utils::UUID("2dcd4a37-cbb5-399b-b3c9-8eb1398b096b"));
}, db_cfg).then([tmp = std::move(tmp)] {});
}


@@ -484,11 +484,19 @@ SEASTAR_TEST_CASE(test_simple_index_paging) {
return ::make_shared<service::pager::paging_state>(*paging_state);
};
auto expect_more_pages = [] (::shared_ptr<cql_transport::messages::result_message> res, bool more_pages_expected) {
auto rows = dynamic_pointer_cast<cql_transport::messages::result_message::rows>(res);
if(more_pages_expected != rows->rs().get_metadata().flags().contains(cql3::metadata::flag::HAS_MORE_PAGES)) {
throw std::runtime_error(format("Expected {}more pages", more_pages_expected ? "" : "no "));
}
};
eventually([&] {
auto qo = std::make_unique<cql3::query_options>(db::consistency_level::LOCAL_ONE, infinite_timeout_config, std::vector<cql3::raw_value>{},
cql3::query_options::specific_options{1, nullptr, {}, api::new_timestamp()});
auto res = e.execute_cql("SELECT * FROM tab WHERE v = 1", std::move(qo)).get0();
auto paging_state = extract_paging_state(res);
expect_more_pages(res, true);
assert_that(res).is_rows().with_rows({{
{int32_type->decompose(3)}, {int32_type->decompose(2)}, {int32_type->decompose(1)},
@@ -497,6 +505,7 @@ SEASTAR_TEST_CASE(test_simple_index_paging) {
qo = std::make_unique<cql3::query_options>(db::consistency_level::LOCAL_ONE, infinite_timeout_config, std::vector<cql3::raw_value>{},
cql3::query_options::specific_options{1, paging_state, {}, api::new_timestamp()});
res = e.execute_cql("SELECT * FROM tab WHERE v = 1", std::move(qo)).get0();
expect_more_pages(res, true);
paging_state = extract_paging_state(res);
assert_that(res).is_rows().with_rows({{
@@ -506,10 +515,25 @@ SEASTAR_TEST_CASE(test_simple_index_paging) {
qo = std::make_unique<cql3::query_options>(db::consistency_level::LOCAL_ONE, infinite_timeout_config, std::vector<cql3::raw_value>{},
cql3::query_options::specific_options{1, paging_state, {}, api::new_timestamp()});
res = e.execute_cql("SELECT * FROM tab WHERE v = 1", std::move(qo)).get0();
paging_state = extract_paging_state(res);
assert_that(res).is_rows().with_rows({{
{int32_type->decompose(1)}, {int32_type->decompose(2)}, {int32_type->decompose(1)},
}});
// Due to implementation differences from origin (Scylla is allowed to return empty pages with has_more_pages == true
// and it's a legal operation), paging indexes may result in finding an additional empty page at the end of the results,
// but never more than one. This case used to be buggy (see #4569).
try {
expect_more_pages(res, false);
} catch (...) {
qo = std::make_unique<cql3::query_options>(db::consistency_level::LOCAL_ONE, infinite_timeout_config, std::vector<cql3::raw_value>{},
cql3::query_options::specific_options{1, paging_state, {}, api::new_timestamp()});
res = e.execute_cql("SELECT * FROM tab WHERE v = 1", std::move(qo)).get0();
assert_that(res).is_rows().with_size(0);
expect_more_pages(res, false);
}
});
eventually([&] {
@@ -1062,3 +1086,18 @@ SEASTAR_TEST_CASE(test_secondary_index_allow_some_column_drops) {
BOOST_REQUIRE_THROW(e.execute_cql("alter table cf2 drop d").get(), exceptions::invalid_request_exception);
});
}
// Reproduces issue #4539 - a partition key index should not influence a filtering decision for regular columns.
// Previously, the given sequence resulted in a "No index found" error.
SEASTAR_TEST_CASE(test_secondary_index_on_partition_key_with_filtering) {
return do_with_cql_env_thread([] (cql_test_env& e) {
e.execute_cql("CREATE TABLE test_a(a int, b int, c int, PRIMARY KEY ((a, b)));").get();
e.execute_cql("CREATE INDEX ON test_a(a);").get();
e.execute_cql("INSERT INTO test_a (a, b, c) VALUES (1, 2, 3);").get();
eventually([&] {
auto res = e.execute_cql("SELECT * FROM test_a WHERE a = 1 AND b = 2 AND c = 3 ALLOW FILTERING;").get0();
assert_that(res).is_rows().with_rows({
{{int32_type->decompose(1)}, {int32_type->decompose(2)}, {int32_type->decompose(3)}}});
});
});
}


@@ -3676,6 +3676,32 @@ SEASTAR_THREAD_TEST_CASE(test_write_missing_columns_large_set) {
validate_stats_metadata(s, written_sst, table_name);
}
SEASTAR_THREAD_TEST_CASE(test_write_empty_counter) {
auto abj = defer([] { await_background_jobs().get(); });
sstring table_name = "empty_counter";
// CREATE TABLE empty_counter (pk text, ck text, val counter, PRIMARY KEY (pk, ck)) WITH compression = {'sstable_compression': ''};
schema_builder builder("sst3", table_name);
builder.with_column("pk", utf8_type, column_kind::partition_key);
builder.with_column("ck", utf8_type, column_kind::clustering_key);
builder.with_column("val", counter_type);
builder.set_compressor_params(compression_parameters::no_compression());
schema_ptr s = builder.build(schema_builder::compact_storage::no);
lw_shared_ptr<memtable> mt = make_lw_shared<memtable>(s);
auto key = partition_key::from_exploded(*s, {to_bytes("key")});
mutation mut{s, key};
const column_definition& cdef = *s->get_column_definition("val");
auto ckey = clustering_key::from_exploded(*s, {to_bytes("ck")});
counter_cell_builder b;
mut.set_clustered_cell(ckey, cdef, b.build(write_timestamp));
mt->apply(mut);
tmpdir tmp = write_and_compare_sstables(s, mt, table_name);
validate_read(s, tmp.path(), {mut});
}
SEASTAR_THREAD_TEST_CASE(test_write_counter_table) {
auto abj = defer([] { await_background_jobs().get(); });
sstring table_name = "counter_table";


@@ -0,0 +1 @@
4022779186


@@ -0,0 +1,9 @@
Scylla.db
CRC.db
Filter.db
Statistics.db
TOC.txt
Digest.crc32
Index.db
Summary.db
Data.db


@@ -0,0 +1,9 @@
Scylla.db
CompressionInfo.db
Filter.db
Statistics.db
TOC.txt
Digest.crc32
Index.db
Summary.db
Data.db

Some files were not shown because too many files have changed in this diff.