Commit Graph

20559 Commits

Author SHA1 Message Date
Gleb Natapov
d28dd4957b lwt: Process lwt request on an owning shard
LWT is much more efficient if a request is processed on a shard that owns
a token for the request. This is because otherwise the processing will
bounce to an owning shard multiple times. The patch proposes a way to
move a request to the correct shard before running lwt. It works by returning
an error from the lwt code if the shard is incorrect, specifying the shard
the request should be moved to. The error is processed by the transport code,
which jumps to the correct shard and re-processes the incoming message there.
2020-01-13 10:26:02 +02:00
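The bounce-on-error flow described in d28dd4957b can be sketched roughly as follows. This is an illustrative Python model, not the actual C++ code: WrongShard, owning_shard(), NUM_SHARDS and the modulo sharding rule are all hypothetical names and assumptions made for the example.

```python
NUM_SHARDS = 4  # assumed shard count, for illustration only


class WrongShard(Exception):
    """Hypothetical error that names the shard owning the request's token."""

    def __init__(self, shard):
        super().__init__(f"request belongs to shard {shard}")
        self.shard = shard


def owning_shard(token):
    # Placeholder sharding rule; the commit does not describe the real mapping.
    return token % NUM_SHARDS


def run_lwt(current_shard, token):
    # The LWT layer refuses to run on a non-owning shard and reports the owner.
    owner = owning_shard(token)
    if owner != current_shard:
        raise WrongShard(owner)
    return f"lwt for token {token} executed on shard {current_shard}"


def process_message(current_shard, token):
    # The transport layer catches the error, jumps to the named shard
    # and re-processes the same message there.
    try:
        return run_lwt(current_shard, token)
    except WrongShard as err:
        return process_message(err.shard, token)
```

With this shape the request moves at most once before the LWT starts, instead of bouncing to the owning shard at every step.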
Gleb Natapov
2832f1d9eb storage_service: move start_native_transport into a thread
The code runs only once and it is simpler if it runs in a seastar thread.
2020-01-08 14:57:57 +02:00
Gleb Natapov
7fb2e8eb9f transport: change make_result to take a reference to cql result instead of shared_ptr
2020-01-08 14:57:57 +02:00
Avi Kivity
6e0a073b2e mutation_partition: use type-aware printing of the clustering row
Now that position_in_partition_view has type-aware printing, use it
to provide a human readable version of clustering keys.
Message-Id: <20191231151315.602559-2-avi@scylladb.com>
2020-01-07 12:17:11 +01:00
Avi Kivity
488c42408a position_in_partition_view: add type-aware printer
If the position_in_partition_view represents a clustering key,
we can now see it with the clustering key decoded according to
the schema.
Message-Id: <20191231151315.602559-1-avi@scylladb.com>
2020-01-07 12:15:09 +01:00
Avi Kivity
3a3c20d337 schema_tables: de-templatize diff_table_or_view()
This reduces code bloat and makes the code friendlier for IDEs, as the
IDE now understands the type of create_schema.
Message-Id: <20191231134803.591190-1-avi@scylladb.com>
2020-01-07 11:56:54 +01:00
Avi Kivity
e5e42672f5 sstables: reduce bloat from sstables::write_simple()
sstables::write_simple() has quite a lot of boilerplate
which gets replicated into each template instance. Move
all of that into a non-template do_write_simple(), leaving
only things that truly depend on the component being written
in the template, and encapsulating them with a
noncopyable_function.

An explicit template instantiation was added, since this
is used in a header file. Before, it likely worked by
accident and stopped working when the template became
small enough to inline.

Tests: unit (dev)
Message-Id: <20200106135453.1634311-1-avi@scylladb.com>
2020-01-07 11:56:11 +01:00
Avi Kivity
8f7f56d6a0 schema_tables: make gratuitous generic lambda in create_tables_from_partitions() concrete
The generic lambda made IDE searches for create_table_from_table_row() fail.
Message-Id: <20191231135210.591972-1-avi@scylladb.com>
2020-01-07 11:49:10 +01:00
Avi Kivity
92fd83d3af schema_tables: make gratuitous generic lambda in create_table_from_name() concrete
The lambda made IDE searches for read_table_mutations fail.
Message-Id: <20191231135103.591741-1-avi@scylladb.com>
2020-01-07 11:48:56 +01:00
Avi Kivity
dd6dd97df9 schema_tables: make gratuitous generic lambda in merge_tables_and_views() concrete
The generic lambda made IDE searches for create_table_from_mutations fail.
Message-Id: <20191231135059.591681-1-avi@scylladb.com>
2020-01-07 11:48:39 +01:00
Rafael Ávila de Espíndola
b80852c447 main: Explicitly allow scylla core dumps
I have not looked into the security reason for disabling it when
a program has file capabilities.

Fixes #5560

[avi: remove extraneous semicolon]
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200106231836.99052-1-espindola@scylladb.com>
2020-01-07 11:15:59 +02:00
Rafael Ávila de Espíndola
07f1cb53ea tests: run with ASAN_OPTIONS='disable_coredump=0:abort_on_error=1'
These are the same options we use in seastar.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200107001513.122238-1-espindola@scylladb.com>
2020-01-07 11:11:49 +02:00
Takuya ASADA
238a25a0f4 docker: fix typo of scylla-jmx script path (#5551)
The path should be /opt/scylladb/jmx, not /opt/scylladb/scripts/jmx.

Fixes #5542
2020-01-07 10:54:16 +02:00
Asias He
401854dbaf repair: Avoid duplicated partition_end write
Consider this:

1) Write partition_start of p1
2) Write clustering_row of p1
3) Write partition_end of p1
4) Repair is stopped due to error before writing partition_start of p2
5) Repair calls repair_row_level_stop() to tear down which calls
   wait_for_writer_done(). A duplicate partition_end is written.

To fix, track the partition_start and partition_end writes and avoid
unpaired writes.

Backports: 3.1 and 3.2
Fixes: #5527
2020-01-06 14:06:02 +02:00
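The pairing rule from 401854dbaf can be sketched as below. This is a simplified Python model under stated assumptions: the PartitionWriter class and its _partition_open flag are hypothetical, and only wait_for_writer_done() and the fragment names come from the commit message.

```python
class PartitionWriter:
    """Toy writer that records fragments and tracks whether a partition is open."""

    def __init__(self):
        self.fragments = []
        self._partition_open = False  # True between partition_start and partition_end

    def write_partition_start(self, key):
        self.fragments.append(("partition_start", key))
        self._partition_open = True

    def write_clustering_row(self, row):
        self.fragments.append(("clustering_row", row))

    def write_partition_end(self):
        # Only close a partition that was actually started and not yet ended.
        if not self._partition_open:
            return
        self.fragments.append(("partition_end",))
        self._partition_open = False

    def wait_for_writer_done(self):
        # Teardown path: safe to call even if the last partition is already closed.
        self.write_partition_end()
```

Replaying steps 1 to 5 against this model leaves exactly one partition_end in the output, because the teardown call finds no open partition.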
Eliran Sinvani
e64445d7e5 debian-reloc: Propagate PRODUCT variable to renaming command in debian pkg
Commit 21dec3881c introduced
a bug that causes the scylla debian build to fail. This is
because the commit relied on the environment PRODUCT variable
being exported (and, as a result, propagating to the rename
command that is executed by find in a subshell).
This commit fixes it by explicitly passing the PRODUCT variable
into the rename command.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <20200106102229.24769-1-eliransin@scylladb.com>
2020-01-06 12:31:58 +02:00
Asias He
38d4015619 gossiper: Remove HIBERNATE status from dead state
In scylla, the replacing node is set as HIBERNATE status. It is the only
place we use HIBERNATE status. The replacing node is supposed to be
alive and updating its heartbeat, so it is not supposed to be in dead
state.

This patch fixes the following problem in replacing.

   1) start n1, n2
   2) n2 is down
   3) start n3 to replace n2, but kill n3 in the middle of the replace
   4) start n4 to replace n2

After step 3 and step 4, the old n3 will stay in gossip forever until a
full cluster shutdown. Note that n3 will stay only in gossip, not in the
system.peers table. Users will see annoying, endless log messages like the
following on all the nodes:

   rpc - client $ip_of_n3:7000: fail to connect: Connection refused

Fixes: #5449
Tests: replace_address_test.py + manual test
2020-01-06 11:47:31 +02:00
Amos Kong
c5ec1e3ddc scylla_ntp_setup: check redhat variant version by parse_version (#5434)
VERSION_ID of centos7 is "7", but VERSION_ID of oel7.7 is "7.7", so
scylla_ntp_setup fails on OEL7.7 with a ValueError:

- ValueError: invalid literal for int() with base 10: '7.7'

This patch changes redhat_version() to return the version string and compares
it using parse_version().

Fixes #5433

Signed-off-by: Amos Kong <amos@scylladb.com>
2020-01-06 11:43:14 +02:00
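The failure fixed by c5ec1e3ddc, and the idea of comparing parsed versions instead of integers, can be illustrated with a small stdlib-only sketch. parse_version_tuple() and is_at_least() are hypothetical stand-ins written for this example; they are not the parse_version() function the patch uses, and they handle only plain dotted numbers.

```python
def parse_version_tuple(version_id):
    """Turn a dotted version string such as '7.7' into a tuple of ints: (7, 7)."""
    return tuple(int(part) for part in version_id.split("."))


def is_at_least(version_id, minimum):
    # Tuples compare element by element, so (7, 7) >= (7,) holds.
    return parse_version_tuple(version_id) >= parse_version_tuple(minimum)


def old_style_check(version_id):
    # The pre-fix approach: this raises ValueError for '7.7'.
    return int(version_id) >= 7
```

Here is_at_least("7.7", "7") succeeds where old_style_check("7.7") raises the ValueError quoted in the commit message. A full version parser also normalizes cases such as "7" versus "7.0", which this sketch does not.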
Asias He
145fd0313a streaming: Fix map access in stream_manager::get_progress
When the progress is queried, e.g., by nodetool netstats,
the progress info might not be updated yet.

Fix it by checking before accessing the map to avoid errors like:

std::out_of_range (_Map_base::at)

Fixes: #5437
Tests: nodetool_additional_test.py:TestNodetool.netstats_test
2020-01-06 10:31:15 +02:00
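The check-before-access pattern from 145fd0313a looks roughly like this in Python terms, where a missing dict key plays the role of std::out_of_range from map::at(). The function signature, the plan_id key and the shape of the default value are assumptions made for the illustration.

```python
def get_progress(progress_by_plan, plan_id):
    """Return progress for plan_id, or an empty record if none is stored yet."""
    info = progress_by_plan.get(plan_id)  # .get() returns None instead of raising
    if info is None:
        # Progress has not been updated yet; report zeros rather than failing.
        return {"received": 0, "sent": 0}
    return info
```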
Rafael Ávila de Espíndola
98cd8eddeb tests: Run with halt_on_error=1:abort_on_error=1
This depends on the just emailed fixes to undefined behavior in
tests. With this change we should quickly notice if a change
introduces undefined behavior.

Fixes #4054

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>

Message-Id: <20191230222646.89628-1-espindola@scylladb.com>
2020-01-05 17:20:31 +02:00
Rafael Ávila de Espíndola
dc5ecc9630 enum_option_test: Add explicit underlying types to enums
We expect to be able to create variables with out-of-range values, so
these enums need explicit underlying types.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200102173422.68704-1-espindola@scylladb.com>
2020-01-05 17:20:31 +02:00
Nadav Har'El
f0d8dd4094 merge: CDC rolling upgrade
Merged pull request https://github.com/scylladb/scylla/pull/5538 from
Avi Kivity and Piotr Jastrzębski.

This series prepares CDC for rolling upgrade. This consists of
reducing the footprint of cdc, when disabled, on the schema, adding
a cluster feature, and redacting the cdc column when transferring
it to other nodes. The latter is needed because we'll want to backport
this to 3.2, which doesn't have canonical_mutations yet.
2020-01-05 17:13:12 +02:00
Gleb Natapov
720c0aa285 commitlog: update last sync timestamp when cycling a buffer
If the in-memory buffer does not have enough space for an incoming mutation, it is
written into a file, but the code missed updating the timestamp of the last
sync, so we may sync too often.
Message-Id: <20200102155049.21291-9-gleb@scylladb.com>
2020-01-05 16:13:59 +02:00
Gleb Natapov
14746e4218 commitlog: drop segment gate
The code that enters the gate never defers before leaving, so the gate
behaves like a flag. Let's use the existing flag to prohibit adding data to a
closed segment.
Message-Id: <20200102155049.21291-8-gleb@scylladb.com>
2020-01-05 16:13:59 +02:00
Gleb Natapov
f8c8a5bd1f test: fix error reporting in commitlog_test
Message-Id: <20200102155049.21291-7-gleb@scylladb.com>
2020-01-05 16:13:58 +02:00
Gleb Natapov
680330ae70 commitlog: introduce segment::close() function.
Currently the segment closing code is spread over several functions and
activated based on the _closed flag. Make segment closing explicit
by moving all the code into a close() function and calling it where the _closed
flag is set.
Message-Id: <20200102155049.21291-6-gleb@scylladb.com>
2020-01-05 16:13:55 +02:00
Gleb Natapov
a1ae08bb63 commitlog: remove unused segment::flush() parameter
Message-Id: <20200102155049.21291-5-gleb@scylladb.com>
2020-01-05 16:13:55 +02:00
Gleb Natapov
1e15e1ef44 commitlog: cleanup segment sync()
Call cycle() only once.
Message-Id: <20200102155049.21291-4-gleb@scylladb.com>
2020-01-05 16:13:54 +02:00
Gleb Natapov
3d3d2c572e commitlog: move segment shutdown code from sync()
Currently sync() does two completely different things based on the
shutdown parameter. Separate the code into two different functions.
Message-Id: <20200102155049.21291-3-gleb@scylladb.com>
2020-01-05 16:13:54 +02:00
Gleb Natapov
89afb92b28 commitlog: drop superfluous this
Message-Id: <20200102155049.21291-2-gleb@scylladb.com>
2020-01-05 16:13:53 +02:00
Piotr Jastrzebski
95feeece0b scylla_tables: treat empty cdc props as disabled
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Piotr Jastrzebski
396e35bf20 cdc: add schema_change test for cdc_options
The original "test_schema_digest_does_not_change" test case ensures
that schema digests will match for older nodes that do not support
all the features yet (including computed columns).
The additional case uses sstables generated after CDC was enabled
and a table with CDC enabled was created,
in order to make sure that the digest computed
including the CDC column does not change spuriously either.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Piotr Jastrzebski
c08e6985cd cdc: allow cluster rolling upgrade
The addition of the cdc column in scylla_tables changes how schema
digests are calculated, and affects the ABI of schema update
messages (adding a column changes other columns' indexes
in frozen_mutation).

To fix this, extend the schema_tables mechanism with support
for the cdc column, and adjust schemas and mutations to remove
that column when sending schemas during upgrade.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Piotr Jastrzebski
caa0a4e154 tests: disable CDC in schema_change_tests
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Piotr Jastrzebski
129af99b94 cdc: Return reference from cluster_supports_cdc
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Piotr Jastrzebski
4639989964 cdc: Add CDC_OPTIONS schema_feature
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-01-05 14:39:23 +02:00
Avi Kivity
c150f2e5d7 schema_tables, cdc: don't store empty cdc columns in scylla_tables
An empty cdc column in scylla_tables is hashed differently from
a missing column. This causes schema mismatch when a schema is
propagated to another node, because the other node will redact
the schema column completely if the cluster feature isn't enabled,
and an empty value is hashed differently from a missing value.

Store a tombstone instead. Tombstones are removed before
digesting, so they don't affect the outcome.

This change also undoes the changes in 386221da84 ("schema_tables:
 handle 'cdc' options") to schema_change_test
test_merging_does_not_alter_tables_which_didnt_change. That change
enshrined the breakage into the test, instead of fixing the root cause,
which was that we added an extra mutation to the schema (for
cdc options, which were disabled).
2020-01-05 14:36:18 +02:00
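Why a tombstone helps in c150f2e5d7 can be shown with a toy digest. This sketch assumes a row is a dict of column values and uses SHA-256 purely for illustration; the TOMBSTONE marker, the digest() function and the serialization are hypothetical and do not reflect the real schema digest code.

```python
import hashlib

TOMBSTONE = object()  # hypothetical marker for a deleted column


def digest(row):
    """Hash a row's columns in name order, skipping tombstones."""
    h = hashlib.sha256()
    for name in sorted(row):
        value = row[name]
        if value is TOMBSTONE:
            continue  # tombstones are dropped before digesting
        h.update(name.encode())
        h.update(b"\0")
        h.update(repr(value).encode())
        h.update(b"\0")
    return h.hexdigest()
```

In this model a row with an empty cdc value digests differently from a row without the column, while a row with a cdc tombstone digests the same as the row without it, which is the property the commit relies on.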
Rafael Ávila de Espíndola
3d641d4062 lua: Use existing cpp_int cast logic
Different versions of boost have different rules for what conversions
from cpp_int to smaller integers are allowed.

We already had a function that worked with all supported versions, but
it was not being used by lua.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200104041028.215153-1-espindola@scylladb.com>
2020-01-05 12:10:54 +02:00
Rafael Ávila de Espíndola
88b5aadb05 tests: cql_test_env: wait for two futures starting internal services
I noticed this while looking at the crashes next is currently
experiencing.

While I have no idea if this fixes the issue, it does avoid broken
future warnings (for no_sharded_instance_exception) in a debug build.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200103201540.65324-1-espindola@scylladb.com>
2020-01-05 12:09:59 +02:00
Avi Kivity
4b8e2f5003 Update seastar submodule
* seastar 0525bbb08...36cf5c5ff (6):
  > memcached: Fix use after free in shutdown
  > Revert "task: stop wrapping tasks with unique_ptr"
  > task: stop wrapping tasks with unique_ptr
  > http: Change exception formating to the generic seastar one
  > Merge "Avoid a few calls to ~exception_ptr" from Rafael
  > tests: fix core generation with asan
2020-01-03 15:48:53 +02:00
Nadav Har'El
44c2a44b54 alternator-test: test for ConditionExpression feature
This patch adds a very comprehensive test for the ConditionExpression
feature, i.e., the newer syntax of conditional writes replacing
the old-style "Expected" - for the UpdateItem, PutItem and DeleteItem
operations.

I wrote these tests while closely following the DynamoDB ConditionExpression
documentation, and attempted to cover all conceivable features, subfeatures
and subcases of the ConditionExpression syntax - to serve as a test for a
future support for this feature in Alternator (see issue #5053).

As usual, all these tests pass on AWS DynamoDB, but because we haven't yet
implemented this feature in Alternator, all but one xfail on Alternator.

Refs #5053.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191229143556.24002-1-nyh@scylladb.com>
2020-01-03 15:48:20 +02:00
Nadav Har'El
aad5eeab51 alternator: better error messages when Alternator port is taken
If Alternator is requested to be enabled on a specific port but the port is
already taken, the boot fails as expected - but the error log is confusing.
It currently looks something like this:

WARN  2019-12-24 11:22:57,303 [shard 0] alternator-server - Failed to set up Alternator HTTP server on 0.0.0.0 port 8000, TLS port 8043: std::system_error (error system:98, posix_listen failed for address 0.0.0.0:8000: Address already in use)
... (many more messages about the server shutting down)
INFO  2019-12-24 11:22:58,008 [shard 0] init - Startup failed: std::system_error (error system:98, posix_listen failed for address 0.0.0.0:8000: Address already in use)

There are two problems here. First, the "WARN" should really be an "ERROR",
because it causes the server to be shut down and the user must see this error.
Second, the final line in the log, which the user is likely to see first,
contains only the ultimate cause of the exception (an address already in use)
but not what this address was needed for.

This patch solves both issues, and the log now looks like:

ERROR 2019-12-24 14:00:54,496 [shard 0] alternator-server - Failed to set up Alternator HTTP server on 0.0.0.0 port 8000, TLS port 8043: std::system_error (error system:98, posix_listen failed for address 0.0.0.0:8000: Address already in use)
...
INFO  2019-12-24 14:00:55,056 [shard 0] init - Startup failed: std::_Nested_exception<std::runtime_error> (Failed to set up Alternator HTTP server on 0.0.0.0 port 8000, TLS port 8043): std::system_error (error system:98, posix_listen failed for address 0.0.0.0:8000: Address already in use)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191224124127.7093-1-nyh@scylladb.com>
2020-01-03 15:48:20 +02:00
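The second fix in aad5eeab51 wraps the low-level error in a higher-level one so that the final log line carries both. The commit does this with a nested C++ exception; the sketch below shows the analogous idea with Python exception chaining. The listen(), start_alternator() and describe() functions and the taken_ports argument are hypothetical.

```python
def listen(port, taken_ports):
    # Simulates posix_listen failing with EADDRINUSE (errno 98 on Linux).
    if port in taken_ports:
        raise OSError(98, f"posix_listen failed for address 0.0.0.0:{port}: "
                          "Address already in use")


def start_alternator(port, taken_ports):
    try:
        listen(port, taken_ports)
    except OSError as err:
        # Add the "what was this address needed for" context, keep the cause.
        raise RuntimeError(
            f"Failed to set up Alternator HTTP server on 0.0.0.0 port {port}"
        ) from err


def describe(exc):
    """Render an exception followed by its chain of causes on one line."""
    parts = []
    while exc is not None:
        parts.append(f"{type(exc).__name__} ({exc})")
        exc = exc.__cause__
    return ": ".join(parts)
```

Logging describe(err) at the top level yields one line with the outer context first and the root cause last, similar in spirit to the "Startup failed" line in the commit message.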
Nadav Har'El
1f64a3bbc9 alternator: error on unsupported ReturnValues option
We don't yet support the ReturnValues option on PutItem, UpdateItem or
DeleteItem operations (see issue #5053), but if a user tries to use such
an option anyway, we silently ignore it. It's better to fail,
reporting the unsupported option.

In this patch we check the ReturnValues option and if it is anything but
the supported default ("NONE"), we report an error.

Also added a test to confirm this fix. The test verifies that "NONE" is
allowed, and something which is unsupported (e.g., "DOG") is not ignored
but rather causes an error.

Refs #5053.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20191216193310.20060-1-nyh@scylladb.com>
2020-01-03 15:48:20 +02:00
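The check described in 1f64a3bbc9 amounts to rejecting any ReturnValues value other than the default. A minimal sketch, assuming the request is a parsed JSON dict; check_return_values() and the choice of ValueError are hypothetical and not Alternator's actual code or error type.

```python
def check_return_values(request):
    """Accept a missing ReturnValues or "NONE"; reject everything else."""
    value = request.get("ReturnValues", "NONE")
    if value != "NONE":
        # Fail loudly instead of silently ignoring the unsupported option.
        raise ValueError(f"Unsupported ReturnValues={value}")
```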
Rafael Ávila de Espíndola
dc93228b66 reloc: Turn the default flags into common flags
These are flags we always want to enable. In particular, we want them
to be used by the bots, but the bots run this script with
--configure-flags, so they were being discarded.

We put the user options later so that they can override the common
options.

Fixes #5505

Reviewed-by: Benny Halevy <bhalevy@scylladb.com>
Reviewed-by: Takuya ASADA <syuu@scylladb.com>
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-01-03 15:48:20 +02:00
Rafael Ávila de Espíndola
d4dfb6ff84 build-id: Handle the binary having multiple PT_NOTE headers
There is no requirement that all notes be placed in a single
PT_NOTE. It looks like recent lld's actually put each section in its
own PT_NOTE.

This change looks for build-id in all PT_NOTE headers.

Fixes #5525

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Reviewed-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20191227000311.421843-1-espindola@scylladb.com>
2020-01-03 15:48:20 +02:00
Avi Kivity
1e9237d814 dist: redhat: use parallel compression for rpm payload
rpm compression uses xz, which is painfully slow. Adjust the
compression settings to run on all threads.

The xz utility documentation suggests that 0 threads is
equivalent to all CPUs, but apparently the library interface
(which rpmbuild uses) doesn't think the same way.

Message-Id: <20200101141544.1054176-1-avi@scylladb.com>
2020-01-03 15:48:20 +02:00
Nadav Har'El
de1171181c user defined types: fix support for case-sensitive type names
In the current code, support for case-sensitive (quoted) user-defined type
names is broken. For example, a test doing:

    CREATE TYPE "PHone" (country_code int, number text)
    CREATE TABLE cf (pk blob, pn "PHone", PRIMARY KEY (pk))

Fails - the first line creates the type with the case-sensitive name PHone,
but the second line wrongly ends up looking for the lowercased name phone,
and fails with an exception "Unknown type ks.phone".

The problem is in cql3_type_name_impl. This class is used to convert a
type object into its proper CQL syntax - for example frozen<list<int>>.
The problem is that for a user-defined type, we forgot to quote its name
if not lowercase, and the result is wrong CQL. For example, a list of
PHone will be written as list<PHone> - but this is wrong because the CQL
parser, when it sees this expression, lowercases the unquoted type name
PHone and it becomes just phone. It should be list<"PHone">, not list<PHone>.

The solution is for cql3_type_name_impl to use for a user-defined type
its get_name_as_cql_string() method instead of get_name_as_string().

get_name_as_cql_string() is a new method which prints the name of the
user type as it should be in a CQL expression, i.e., quoted if necessary.

The bug in the above test was apparently caused when our code serialized
the type name to disk as the string PHone (without any quoting), and then
later deserialized it using the CQL type parser, which converted it into
a lowercase phone. With this patch, the type's name is serialized as
"PHone", with the quotes, and deserialized properly as the type PHone.
While the extra quotes may seem excessive, they are necessary for the
correct CQL type expression - remember that the type expression may be
significantly more complex, e.g., frozen<list<"PHone">> and all of this,
including the quotes, is necessary for our parser to be able to translate
this string back into a type object.

This patch may cause breakage to existing databases which used case-
sensitive user-defined types, but I argue that these use cases were
already broken (as demonstrated by this test) so we won't break anything
that actually worked before.

Fixes #5544

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200101160805.15847-1-nyh@scylladb.com>
2020-01-03 15:48:20 +02:00
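The quoting rule behind get_name_as_cql_string() in de1171181c can be sketched as follows. This is a simplified Python approximation: name_as_cql_string() and list_of() are hypothetical helpers, and the rule used here (quote anything that is not a plain lowercase identifier) ignores details such as reserved keywords.

```python
import re

# Names that the CQL parser would leave unchanged when written unquoted.
_UNQUOTED = re.compile(r"[a-z][a-z0-9_]*\Z")


def name_as_cql_string(name):
    """Return the name as it should appear in a CQL expression."""
    if _UNQUOTED.match(name):
        return name
    # Quote the name and double any embedded quotes.
    return '"' + name.replace('"', '""') + '"'


def list_of(type_name):
    return f"list<{name_as_cql_string(type_name)}>"
```

With this rule list_of("PHone") gives list<"PHone">, which survives a round trip through a parser that lowercases unquoted names, while list_of("phone") stays list<phone>.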
Pavel Emelyanov
34f8762c4d storage_service: Drop _update_jobs
This field is write-only.
Leftover from 83ffae1 (storage_service: Drop block_until_update_pending_ranges_finished)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20191226091210.20966-1-xemul@scylladb.com>
2020-01-03 15:48:20 +02:00
Pavel Emelyanov
f2b20e7083 cache_hitrate_calculator: Do not reinvent the peering_sharded_service
The class in question wants to run its own instances on different
shards; for this purpose it keeps a reference to its sharded self to call
invoke_on() on. There's a handy peering_sharded_service<> in seastar
for the same purpose; using it makes the code nicer and shorter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20191226112401.23960-1-xemul@scylladb.com>
2020-01-03 15:48:19 +02:00
Rafael Ávila de Espíndola
bbed9cac35 cql3: move function creation to a .cc file
We had a lot of code in a .hh file that, while using templates, was
only used for creating functions during startup.

This moves it to a new .cc file.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200101002158.246736-1-espindola@scylladb.com>
2020-01-03 15:48:19 +02:00
Benny Halevy
c0883407fe scripts: Add cpp-name-format: pretty printer
Pretty-print cpp-names, useful for deciphering complex backtraces.

For example, the following line:
    service::storage_proxy::init_messaging_service()::{lambda(seastar::rpc::client_info const&, seastar::rpc::opt_time_point, std::vector<frozen_mutation, std::allocator<frozen_mutation> >, db::consistency_level, std::optional<tracing::trace_info>)#1}::operator()(seastar::rpc::client_info const&, seastar::rpc::opt_time_point, std::vector<frozen_mutation, std::allocator<frozen_mutation> >, db::consistency_level, std::optional<tracing::trace_info>) const at /local/home/bhalevy/dev/scylla/service/storage_proxy.cc:4360

Is formatted as:
    service::storage_proxy::init_messaging_service()::{
      lambda(
        seastar::rpc::client_info const&,
        seastar::rpc::opt_time_point,
        std::vector<
          frozen_mutation,
          std::allocator<frozen_mutation>
        >,
        db::consistency_level,
        std::optional<tracing::trace_info>
      )#1
    }::operator()(
      seastar::rpc::client_info const&,
      seastar::rpc::opt_time_point,
      std::vector<
        frozen_mutation,
        std::allocator<frozen_mutation>
      >,
      db::consistency_level,
      std::optional<tracing::trace_info>
    ) const at /local/home/bhalevy/dev/scylla/service/storage_proxy.cc:4360

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20191226142212.37260-1-bhalevy@scylladb.com>
2020-01-01 12:08:12 +02:00
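The formatting shown in c0883407fe can be approximated by a short recursive routine. The rule used here is inferred from the example output only and is an assumption: a bracket group is expanded onto separate lines when it contains a comma, and left inline otherwise. The pretty() and _match() functions are hypothetical and do not reproduce the actual script; names containing operator< or -> would confuse this sketch.

```python
OPEN = "(<{"
CLOSE = ")>}"


def _match(text, start):
    """Return the index of the bracket that closes the one at text[start]."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] in OPEN:
            depth += 1
        elif text[i] in CLOSE:
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced brackets")


def pretty(text, indent=0):
    pad = "  " * indent
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in OPEN:
            end = _match(text, i)
            inner = text[i + 1:end]
            if "," in inner:
                # Expand: one argument per line, closing bracket on its own line.
                out.append(ch + "\n" + "  " * (indent + 1)
                           + pretty(inner.strip(), indent + 1)
                           + "\n" + pad + text[end])
            else:
                out.append(text[i:end + 1])  # short group stays inline
            i = end + 1
        elif ch == ",":
            out.append(",\n" + pad)
            i += 1
            while i < len(text) and text[i] == " ":
                i += 1  # drop the space that followed the comma
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, pretty("f(a, g<b, c>)") puts a and g< on their own indented lines and nests b and c one level deeper, while pretty("h<x>()") is returned unchanged.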