23011 Commits

Author SHA1 Message Date
Calle Wilund
a978e043c3 alternator::streams: Do not allow enabling streams when CDC is off
Fixes #6866

If we try to create/alter an Alternator table to include streams,
we must check that the cluster does in fact support CDC
(experimental still). If not, throw a hopefully somewhat descriptive
error.
(Normal CQL table create goes through a similar check in cql_prop_defs)

Note: no other operations are prohibited. The cluster could have had CDC
enabled before, so streams could exist to list and even read.
Any tables loaded from schema tables should be responsible for their
own validation.
2020-08-03 21:01:31 +03:00
Calle Wilund
05851578d4 alternator::streams: Report streams as not ready until CDC stream id:s are available
Refs #6864

When booting a clean scylla, CDC stream ID:s will not be available until
a n*ring delay time period has passed. Before this, writing to a CDC
enabled table will fail hard.
For alternator (and its tests), we can report the stream(s) for tables as not yet
available (ENABLING) until such time as id:s are
computed.

v2:
* Keep storage service ref in executor
2020-08-03 20:34:15 +03:00
Avi Kivity
1572b9e41c Merge 'transport: Added listener with port-based load balancing' from Juliusz
"
This is inspired by #6781. The idea is to make Scylla listen for CQL connections on port 9042 (where both old shard-aware and shard-unaware clients can still connect the traditional way). On top of that I added a new port, where everything works the same way, except that the source port of the client's socket is used to determine the shard No. to connect to. The desired shard No. is the result of `clientside_port % num_shards`.

The new port is configurable from scylla.yaml and defaults to 19042 (unencrypted, unless user configures encryption options and omits `native_shard_aware_transport_port_ssl` in DB config).

Two "SUPPORTED" tags are added: "SCYLLA_SHARD_AWARE_PORT" and "SCYLLA_SHARD_AWARE_PORT_SSL". For compatibility, "SCYLLA_SHARDING_ALGORITHM" is still kept.

Fixes #5239
"

* jul-stas-shard-aware-listener:
  docs: Info about shard-aware listeners in protocol-extensions
  transport: Added listener with port-based load balancing
2020-08-03 19:23:28 +03:00
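The port-based routing described in the merge above can be sketched from the client side as follows. This is only an illustration of the `clientside_port % num_shards` rule, not driver code: the helper names `ports_for_shard` and `connect_to_shard`, the ephemeral port range, and the way `num_shards` is obtained are all assumptions.

```python
import socket


def ports_for_shard(shard, num_shards, lo=49152, hi=65535):
    """Return the local ports p in [lo, hi] with p % num_shards == shard.

    The [lo, hi] range is an assumed ephemeral port range.
    """
    if not 0 <= shard < num_shards:
        raise ValueError("shard out of range")
    # Smallest port >= lo that is congruent to `shard` modulo num_shards.
    first = lo + (shard - lo) % num_shards
    return range(first, hi + 1, num_shards)


def connect_to_shard(host, shard, num_shards, port=19042):
    """Hypothetical helper: bind a suitable source port, then connect.

    19042 is the default shard-aware port named in the commit message;
    num_shards is assumed to be known to the caller already.
    """
    for local_port in ports_for_shard(shard, num_shards):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind(("", local_port))
        except OSError:
            # Source port already in use, try the next candidate.
            sock.close()
            continue
        try:
            sock.connect((host, port))
        except OSError:
            sock.close()
            raise
        return sock
    raise OSError("no free local port maps to the requested shard")
```

A client that wants shard 3 of an 8-shard node would call `connect_to_shard(host, 3, 8)`; every candidate source port it tries satisfies `port % 8 == 3`.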
Juliusz Stasiewicz
201268ea19 docs: Info about shard-aware listeners in protocol-extensions 2020-08-03 16:45:42 +02:00
Takuya ASADA
c0b2933106 scylla_setup: skip RAID prompt when var-lib-scylla.mount already exists
Since scylla_raid_setup always causes an error when var-lib-scylla.mount
already exists, it's better to skip the RAID prompt.

See #6965
2020-08-03 17:44:02 +03:00
Takuya ASADA
cff3e60f98 scylla_raid_setup: check var-lib-scylla.mount existence before formatting RAID
We should run the var-lib-scylla.mount existence check before formatting RAID.

Fixes #6965
2020-08-03 17:44:02 +03:00
Avi Kivity
4edfdfa78d Merge 'Build id cleanups' from Benny
"
Refs #5525

- main: add --build-id option
- build_id: mv sources to utils/
- build_id: throw on errors rather than assert
- build_id: simplify callback pointer type casting
"

* bhalevy-build-id-cleanups:
  build_id: simplify callback pointer type casting
  build_id: mv sources to utils/
  main: add --build-id option
2020-08-03 17:18:09 +03:00
Calle Wilund
30a700c5b0 system_keyspace: Remove support for legacy truncation records
Fixes #6341

Since scylla no longer supports upgrading from a version without the
"new" (dedicated) truncation record table, we can remove support for these
and the migration thereof.

Make sure the above holds wherever this is committed.

Note that this does not remove the "truncated_at" field in
system.local.
2020-08-03 17:16:26 +03:00
Botond Dénes
a9013030cf multishard_mutation_reader: add a trace message for each shard reader created
So we can see in the trace output, the shards that actually participated
in the reads. There is a single message for each shard reader.

Fixes: #6888
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200803132338.95013-1-bdenes@scylladb.com>
2020-08-03 16:24:46 +03:00
Benny Halevy
9256d2f504 build_id: simplify callback pointer type casting
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-08-03 15:55:18 +03:00
Benny Halevy
bf6e8f66d9 build_id: mv sources to utils/
The root directory is already overcrowded.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-08-03 15:55:16 +03:00
Benny Halevy
46f7d01536 main: add --build-id option
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-08-03 15:52:08 +03:00
Nadav Har'El
2dcb6294da merge: cdc: New delta modes: off, keys, full
Merged pull request https://github.com/scylladb/scylla/pull/6914
by Juliusz Stasiewicz:

The goal is to have finer control over CDC "delta" rows, i.e.:

    disable them totally (mode off);
    record only base PK+CK columns (mode keys);
    make them behave as usual (mode full, default).

The editing of log rows is performed at the stage of finishing CDC mutation.

Fixes #6838

  tests: Added CQL test for `delta mode`
  cdc: Implementations of `delta_mode::off/keys`
  cdc: Infrastructure for controlling `delta_mode`
2020-08-03 14:10:15 +03:00
Piotr Sarna
ed829fade0 sstables: make abort handlers noexcept
Abort handlers are used in noexcept environment, so they should be
noexcept themselves.
Tested on a not-merged-yet Seastar patch with hardened noexcept
checks for abort_source.

Message-Id: <fbfd4950c0e8cc4f6005ad5b862d7bce01b90162.1596446857.git.sarna@scylladb.com>
2020-08-03 14:00:19 +03:00
Piotr Sarna
bd2d48e99c streaming: make stream_plan::abort noexcept
Aborting a stream plan is used in deinitialization code
run in a noexcept environment, so it should be noexcept itself.
Tested on a not-merged-yet Seastar patch with hardened noexcept
checks for abort_source.

Message-Id: <6eada033bb394d725b83a7e0f92381cb792ef6a1.1596446857.git.sarna@scylladb.com>
2020-08-03 14:00:19 +03:00
Piotr Sarna
5cc5b64d82 github: remove THE REST rule from CODEOWNERS file
The rule for THE REST results in each person listed in it
receiving notifications about every single pull request,
which can easily lead to inbox overload - the generic
rule is therefore dropped and authors of pull requests
are expected to manually add reviewers. GitHub offers
semi-random suggestions for reviewers anyway.

Message-Id: <3c0f7a2f13c098438a8abf998ec56b74db87c733.1596450426.git.sarna@scylladb.com>
2020-08-03 13:48:39 +03:00
Eliran Sinvani
779502ab11 Revert "schema: take into account features when converting a table creation to"
This reverts commit b97f466438.

It turns out that the schema mechanism has a lot of nuances.
After this change, for an unknown reason, it was empirically
shown that the amount of cross-shard operations on an upgraded
node increased significantly under steady stress traffic. It
was so significant that the node appeared unavailable to
the coordinators because all of the requests started to fail
on the smp_service_group semaphore.

This revert brings back a caveat in Scylla: creating a table
in a mixed cluster **might**, under certain conditions, cause
a schema mismatch on the newly created table. This makes the
table essentially unusable until the whole cluster has
a uniform version (rolling upgrade or rollback completion).

Fixes #6893.
2020-08-03 12:51:16 +03:00
Botond Dénes
c81658c96e configure.py: remove unused variable do_sanitize
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200803082724.120916-1-bdenes@scylladb.com>
2020-08-03 12:51:16 +03:00
Botond Dénes
f4c8163d11 db/config_file.hh: named_value: remove unused members _name and _desc
They seem to be just copypasta.

Tests: unit(dev)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200803080604.45595-1-bdenes@scylladb.com>
2020-08-03 12:51:16 +03:00
Benny Halevy
3fa0f289de table: snapshot: do not capture name
This captured sstring is unused.

Test: database_test(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200803072258.44681-1-bhalevy@scylladb.com>
2020-08-03 12:51:16 +03:00
Botond Dénes
e4d06a3bbf scylla-gdb.py: collection_element: add circular_buffer support
Also add a __getitem__() to circular_buffer and mask indexes so they are
mapped to [`_impl.begin`, `_impl.end`).

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200803053646.14689-1-bdenes@scylladb.com>
2020-08-03 12:51:16 +03:00
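The index masking mentioned in the commit above can be illustrated with a small standalone sketch. It assumes a power-of-two capacity and `begin`/`end` counters that grow monotonically and are masked on access; the class `CircularBufferView` is hypothetical and is not the code in scylla-gdb.py.

```python
class CircularBufferView:
    """Hypothetical read-only view over a ring buffer's backing storage."""

    def __init__(self, storage, begin, end):
        capacity = len(storage)
        # Masking with (capacity - 1) only works for power-of-two capacities.
        if capacity == 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self.storage = storage
        self.begin = begin
        self.end = end

    def __len__(self):
        return self.end - self.begin

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Map logical index i into [begin, end), wrapping around storage.
        return self.storage[(self.begin + i) & (len(self.storage) - 1)]
```

Because `__getitem__` raises `IndexError` past the end, such a view can be iterated directly, for example with `list(view)`.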
Benny Halevy
122136c617 tables: snapshot: do not create links from multiple shards
We need only one of the shards owning each sstable to call create_links.
This will allow us to simplify it and only handle crash/replay scenarios rather than rename/link/remove races.

Fixes #1622

Test: unit(dev), database_test(debug)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200803065505.42100-3-bhalevy@scylladb.com>
2020-08-03 10:07:07 +03:00
Benny Halevy
ec6e136819 table: snapshot: reduce copies of snapshot dir sstring
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200803065505.42100-2-bhalevy@scylladb.com>
2020-08-03 10:07:06 +03:00
Benny Halevy
72365445c6 table: snapshot: create destination dir only once
No need to recursive_touch_directory for each sstable.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200803065505.42100-1-bhalevy@scylladb.com>
2020-08-03 10:07:05 +03:00
Pekka Enberg
4f0f97773e configure.py: Use build directory variable
The "outdir" variable in configure.py and "$builddir" in build.ninja
file specifies the build directory. Let's use them to eliminate
hard-coded "build" paths from configure.py.
Message-Id: <20200731105113.388073-1-penberg@scylladb.com>
2020-08-03 09:51:51 +03:00
Nadav Har'El
ae25661d9c alternator test: set streams time window to zero
Alternator Streams have an "alternator_streams_time_window_s" parameter which
is used to allow for correct ordering in the stream in the face of clock
differences between Scylla nodes and possibly network delays. This parameter
currently defaults to 10 seconds, and there is a discussion on issue #6929
on whether it is perhaps too high. But in any case, for tests running on a
single node there is no reason not to set this parameter to zero.

Setting this parameter to zero greatly speeds up the Alternator Streams
tests which use ReadRecords to read from the stream. Previously each such
test took at least 10 seconds, because the data was only readable after a
10 second delay. With alternator_streams_time_window_s=0, these tests can
finish in less than a second. (Unfortunately they are still relatively slow
because our Streams implementation has 512 shards, and thus we need over a
thousand (!) API calls to read from the stream.)

Running "test/alternator/run test_streams.py" with 25 tests took before
this patch 114 seconds, after this patch, it is down to 18 seconds.

Refs #6929

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Reviewed-by: Calle Wilund <calle@scylladb.com>
Message-Id: <20200728184612.1253178-1-nyh@scylladb.com>
2020-08-03 09:19:57 +03:00
Avi Kivity
257c17a87a Merge "Don't depend on seastar::make_(lw_)?shared idiosyncrasies" from Rafael
"
While working on another patch I was getting odd compiler errors
saying that a call to ::make_shared was ambiguous. The reason was that
seastar has both:

template <typename T, typename... A>
shared_ptr<T> make_shared(A&&... a);

template <typename T>
shared_ptr<T> make_shared(T&& a);

The second variant doesn't exist in std::make_shared.

This series drops the dependency in scylla, so that a future change
can make seastar::make_shared a bit more like std::make_shared.
"

* 'espindola/make_shared' of https://github.com/espindola/scylla:
  Everywhere: Explicitly instantiate make_lw_shared
  Everywhere: Add a make_shared_schema helper
  Everywhere: Explicitly instantiate make_shared
  cql3: Add a create_multi_column_relation helper
  main: Return a shared_ptr from defer_verbose_shutdown
2020-08-02 19:51:24 +03:00
Avi Kivity
bb9ad9c90b Merge 'Mount RAID volume correctly beyond reboot' from Takuya
"
To mount RAID volume correctly (#6876), we need to wait for MDRAID initialization.
To do so we need to add After=mdmonitor.service on var-lib-scylla.mount.
Also, `lsblk -n -oPARTTYPE {dev}` does not work on CentOS7, since older lsblk does not support the PARTTYPE column (#6954).
We need to provide a relocatable lsblk and run it in the out() / run() functions instead of the distribution-provided version.
"

* syuu1228-scylla_raid_setup_mount_correctly_beyond_reboot:
  scylla_raid_setup: initialize MDRAID before mounting data volume
  create-relocatable-package.py: add lsblk for relocatable CLI tools
  scylla_util.py: always use relocatable CLI tools
2020-08-02 16:36:45 +03:00
Piotr Sarna
ccbffc3177 codeowners: add some @psarnas and @penbergs where applicable
I shamelessly added myself to some modules I usually take part
in reviewing. Also, I assume that the *THE REST* bucket should
show current maintainers, so the list is extended appropriately.

Message-Id: <0c172d0f20e367c3ce47fdf8d40755038ddee373.1596195689.git.sarna@scylladb.com>
2020-07-31 17:08:28 +03:00
Rafael Ávila de Espíndola
30722b8c8e logalloc: Add disable_failure_guard during a few tls variable initialization
The constructors of these global variables can allocate memory. Since
the variables are thread_local, they are initialized at first use.

There is nothing we can do if these allocations fail, so use
disable_failure_guard.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200729184901.205646-1-espindola@scylladb.com>
2020-07-31 15:49:21 +02:00
Pavel Emelyanov
14b279020b scylla-gdb.py: Support b+tree-based row_cache::_partitions
The row_cache::_partitions type is nowadays a double_decker which is B+tree of
intrusive_arrays of cache_entrys, so scylla cache command will raise an error
being unable to parse this new data type.

The respective iterator for double decker starts on the tree and walks the list
of leaf nodes, on each node it walks the plain array of data nodes, then on each
data node it walks the intrusive array of cache_entrys yielding them to the
caller.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200730145851.8819-1-xemul@scylladb.com>
2020-07-31 15:48:25 +02:00
Piotr Jastrzębski
b16b2c348f Add CDC code owners 2020-07-31 14:22:08 +03:00
Piotr Jastrzębski
7eff7a39a0 Add hinted handoff code owners 2020-07-31 14:21:59 +03:00
Piotr Jastrzębski
443affa525 Update counters code owners 2020-07-31 14:21:48 +03:00
Juliusz Stasiewicz
1c11d8f4c4 transport: Added listener with port-based load balancing
The new port is configurable from scylla.yaml and defaults to 19042
(unencrypted, unless client configures encryption options and omits
`native_shard_aware_transport_port_ssl`).

Two "SUPPORTED" tags are added: "SCYLLA_SHARD_AWARE_PORT" and
"SCYLLA_SHARD_AWARE_PORT_SSL". For compatibility,
"SCYLLA_SHARDING_ALGORITHM" is still kept.

Fixes #5239
2020-07-31 13:02:13 +02:00
Tomasz Grabiec
5263e0453a CMakeLists.txt: Add abseil to include directories
Fixes IDE integration.
Message-Id: <1596190352-15467-1-git-send-email-tgrabiec@scylladb.com>
2020-07-31 12:15:23 +02:00
Avi Kivity
66c2b4c8bf tools: toolchain: regenerate for gcc 10.2
Fixes #6813.

As a side effect, this also brings in xxhash 0.7.4.
2020-07-31 08:32:16 +03:00
Takuya ASADA
9e5d548f75 scylla_raid_setup: initialize MDRAID before mounting data volume
var-lib-scylla.mount should wait for MDRAID initialization, so we need to add
'After=mdmonitor.service'.
However, mdmonitor.service currently fails to start because no mail address
is specified, so we need to add the entry to mdadm.conf.

Fixes #6876
2020-07-31 06:33:52 +09:00
Takuya ASADA
6ba2a6c42e create-relocatable-package.py: add lsblk for relocatable CLI tools
We need the latest version of lsblk, which supports the partition type UUID.

Fixes #6954
2020-07-31 04:23:03 +09:00
Takuya ASADA
a19a62e6f6 scylla_util.py: always use relocatable CLI tools
For some CLI tools, command options may differ between the latest version
and older versions.
To maximize compatibility of setup scripts, we should always use
relocatable CLI tools instead of distribution version of the tool.

Related #6954
2020-07-31 04:17:01 +09:00
Piotr Sarna
b3ad5042c4 .gitignore: add .vscode to the list
Since it looks like vscode is used as main IDE
by some developers, including me, let's ignore its helper files.

Message-Id: <63931cadc733c3d0345616be633a6479dc85ca19.1596115302.git.sarna@scylladb.com>
2020-07-30 16:35:06 +03:00
Piotr Sarna
8728c70628 .gitignore: allow symlinks when ignoring testlog
The .gitignore entry for testlog/ directory is generalized
from "testlog/*" to "testlog", in order to please everyone
who potentially wants test logs to use ramfs by symlinking
testlog to /tmp. Without the change, the symlink remains
visible in `git status`.

Message-Id: <e600f5954868aea7031beb02b1d8e12a2ff869e2.1596115302.git.sarna@scylladb.com>
2020-07-30 16:35:02 +03:00
Piotr Sarna
0788a77109 Merge 'Replace MAINTAINERS with CODEOWNERS' from Pekka
Replace the MAINTAINERS file with a CODEOWNERS file, which Github is
able to parse, and suggest reviewers for pull requests.

* penberg-penberg/codeowners:
  Replace MAINTAINERS with CODEOWNERS
  Update MAINTAINERS
2020-07-30 15:12:59 +02:00
Nadav Har'El
8b9da9c92a alternator test: tests for combination of query filter and projection
The tests in this patch, which pass on DynamoDB but fail on Alternator,
reproduce a bug described in issue #6951. This bug makes it impossible for
a Query (or Scan) to filter on an attribute if that attribute is not
requested to be included in the output.

This patch includes two xfailing tests of this type: One testing a
combination of FilterExpression and ProjectionExpression, and the second
testing a combination of QueryFilter and AttributesToGet; These two
pairs are, respectively, DynamoDB's newer and older syntaxes to achieve
the same thing.

Additionally, we add two xfailing tests that demonstrate that combining
old and new style syntax (e.g., FilterExpression with AttributesToGet)
should not have been allowed (DynamoDB doesn't allow such combinations),
but Alternator currently accepts these combinations.

Refs #6951

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200729210346.1308461-1-nyh@scylladb.com>
2020-07-30 09:34:23 +02:00
Rafael Ávila de Espíndola
a548e5f5d1 test: Mark tmpdir::remove noexcept
Also disable the allocation failure injection in it.

Refs #6831.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200729200019.250908-2-espindola@scylladb.com>
2020-07-30 09:55:52 +03:00
Rafael Ávila de Espíndola
d8ba9678b4 test: Move tmpdir code to a .cc file
This is not hot, so we can move it out of the header.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200729200019.250908-1-espindola@scylladb.com>
2020-07-30 09:55:52 +03:00
Tomasz Grabiec
3486eba1ce commitlog: Fix use-after-free on mutation object during replay
The mutation object may be freed prematurely during commitlog replay
in the schema upgrading path. We will hit the problem if the memtable
is full and apply_in_memory() needs to defer.

This will typically manifest as a segfault.

Fixes #6953

Introduced in 79935df

Tests:
  - manual using scylla binary. Reproduced the problem then verified the fix makes it go away

Message-Id: <1596044010-27296-1-git-send-email-tgrabiec@scylladb.com>
2020-07-29 20:58:15 +03:00
Juliusz Stasiewicz
7e42a42381 tests: Added CQL test for delta mode
Tested scenario is just a single insert in every `delta_mode`.
It is also checked that CDC cannot be enabled with all its
subfeatures disabled.
2020-07-29 16:42:26 +02:00
Nadav Har'El
665b78253a alternator test: reduce amount of Scylla logs saved
The test/alternator/run script follows the pytest log with a full log of
Scylla. This saved log can be useful in diagnosing problems, but most of
it is filled with non-useful "INFO"-level messages. The two biggest
offenders are compaction - which logs every single compaction happening,
and the migration manager, which is just a second (and very long) message
about schema change operations (e.g., table creations). Neither of these
are interesting for Alternator's tests, which shouldn't care exactly when
compaction of which sstable is happening. These two components alone
are responsible for 80% of the log lines, and 90% of the log bytes!

In this patch we increase the log level of just these two components -
compaction and migration_manager - to WARN, which reduces the log
by the same percentages (80% by lines, 90% by bytes).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200728191420.1254961-1-nyh@scylladb.com>
2020-07-29 14:17:12 +03:00
Takuya ASADA
3a25e7285b scylla_post_install.sh: generate memory.conf for CentOS7
On CentOS7, systemd does not support percentage-based parameters.
To apply the memory parameter on CentOS7, we need to override the parameter
in bytes instead of as a percentage.

Fixes #6783
2020-07-29 14:10:16 +03:00