Commit Graph

28917 Commits

Author SHA1 Message Date
Avi Kivity
a19d00ef9b dist: scylla_raid_setup: mount XFS with online discard
Online discard asks the disk to erase flash memory cells as soon
as files are deleted. This gives the disk more freedom to choose
where to place new files, so it improves performance.

On older kernel versions, and on really bad disks, this can reduce
performance so we add an option to disable it.

Since fstrim is pointless when online discard is enabled, we
don't configure it if online discard is selected.

I tested it on an AWS i3.large instance; the flag showed up in
`mount` after configuration.

Closes #9608
2021-11-15 14:16:08 +02:00
Avi Kivity
c17101604f Merge 'Revert "scylla_util.py: return bool value on systemd_unit.is_active()"' from Takuya ASADA
In scylla_util.py, we provide `systemd_unit.is_active()` to return the `systemctl is-active` output.
When we introduced the systemd_unit class, we just returned the `systemctl is-active` output as a string, but we later changed the return value to bool (2545d7fd43).
This was because `if unit.is_active():` always evaluates to True even if it returns "failed" or "inactive", and we wanted to avoid such scripting bugs.
However, this was probably a mistake,
because a systemd unit does not have just two states like "start" / "stop"; there are many states.

We are already using multiple unit states ("activating", "failed", "inactive", "active") in our Cloud image login prompt:
https://github.com/scylladb/scylla-machine-image/blob/next/common/scylla_login#L135
After we merged 2545d7fd43, the login prompt broke, because is_active() no longer returns a string as the script expects (https://github.com/scylladb/scylla-machine-image/issues/241).

I think we should revert 2545d7fd43; it should return exactly the same value as `systemctl is-active` reports.
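
To illustrate why a boolean is too coarse here, the following is a minimal C++ sketch (the scripts themselves are Python). The names `unit_state`, `parse_unit_state` and `is_running` are hypothetical and only model the four states mentioned above; `systemctl is-active` can report others as well.

```cpp
#include <optional>
#include <string_view>

// Hypothetical model of the states named in the message above.
enum class unit_state { active, activating, inactive, failed };

// Map the textual output to a state; unknown strings yield std::nullopt.
inline std::optional<unit_state> parse_unit_state(std::string_view s) {
    if (s == "active")     return unit_state::active;
    if (s == "activating") return unit_state::activating;
    if (s == "inactive")   return unit_state::inactive;
    if (s == "failed")     return unit_state::failed;
    return std::nullopt;
}

// Callers compare against one specific state instead of testing truthiness.
inline bool is_running(std::string_view output) {
    return parse_unit_state(output) == unit_state::active;
}
```

Keeping the full state lets a caller such as a login prompt distinguish "activating" from "failed", which a single bool cannot express.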

Fixes #9627
Fixes scylladb/scylla-machine-image#241

Closes #9628

* github.com:scylladb/scylla:
  scylla_ntp_setup: use string in systemd_unit.is_active()
  Revert "scylla_util.py: return bool value on systemd_unit.is_active()"
2021-11-15 13:56:28 +02:00
Takuya ASADA
279fabe9b4 scylla_ntp_setup: use string in systemd_unit.is_active()
Since we reverted 2545d7fd43, we need to
use string instead of bool value.
2021-11-15 19:50:31 +09:00
Takuya ASADA
d646673705 Revert "scylla_util.py: return bool value on systemd_unit.is_active()"
This reverts commit 2545d7fd43.

Fixes #9627
Fixes scylladb/scylla-machine-image#241
2021-11-15 19:50:31 +09:00
Pavel Emelyanov
4e86936850 redis: Remove stop_server deferred action from main
Commit 3f56c49a9e put redis into protocol_servers list of storage
service. Since then there's no need for an explicit stop_server call
on shutdown -- the protocol_servers thing will do it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20211109154259.1196-1-xemul@scylladb.com>
2021-11-15 11:58:44 +02:00
Avi Kivity
4d7a013e94 sstables: mx: writer: make large partition stats accounting branch-free
It is bad form to introduce branches just for statistics, since branches
can be expensive (even when perfectly predictable, they consume branch
history resources). Switch to simple addition instead; this should
not cause any cache misses since we already touch other statistics
earlier.

The inputs are already boolean, but cast them to boolean just so it
is clear we're adding 0/1, not a count.
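
A minimal sketch of the idea, assuming a hypothetical `large_data_stats` struct (the writer's real statistics fields may be named differently): both functions produce the same counts, but the second adds 0 or 1 unconditionally instead of branching.

```cpp
#include <cstdint>

// Hypothetical counters, for illustration only.
struct large_data_stats {
    std::uint64_t partitions_too_large = 0;
    std::uint64_t partitions_too_many_rows = 0;
};

// Branching form: one conditional per statistic.
inline void account_branchy(large_data_stats& s, bool too_large, bool too_many_rows) {
    if (too_large) { ++s.partitions_too_large; }
    if (too_many_rows) { ++s.partitions_too_many_rows; }
}

// Branch-free form: the explicit bool cast documents that 0/1 is added, not a count.
inline void account_branch_free(large_data_stats& s, bool too_large, bool too_many_rows) {
    s.partitions_too_large += static_cast<std::uint64_t>(bool(too_large));
    s.partitions_too_many_rows += static_cast<std::uint64_t>(bool(too_many_rows));
}
```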

Closes #9626
2021-11-15 11:28:48 +02:00
Benny Halevy
9d4262e264 protocol_server: add per-protocol is_server_running method
Change b0a2a9771f broke
the generic api implementation of
is_native_transport_running that relied on
the addresses list being empty after the server is stopped.

To fix that, this change introduces a pure virtual method:
protocol_server::is_server_running that can be implemented
by each derived class.
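
A simplified sketch of such an interface follows. Only `protocol_server::is_server_running` is named in this message; `listen_addresses()`, `example_server` and its members are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Sketch of the base interface; the real class has more members.
class protocol_server {
public:
    virtual ~protocol_server() = default;
    virtual std::vector<std::string> listen_addresses() const = 0;
    virtual bool is_server_running() const = 0;
};

// Hypothetical derived server that tracks its own running flag,
// so the answer does not depend on the address list being empty.
class example_server final : public protocol_server {
    bool _running = false;
    std::vector<std::string> _addresses{"127.0.0.1:9042"};
public:
    void start() { _running = true; }
    void stop() { _running = false; }
    std::vector<std::string> listen_addresses() const override { return _addresses; }
    bool is_server_running() const override { return _running; }
};
```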

Test: unit(dev)
DTest: nodetool_additional_test.py:TestNodetool.binary_test

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211114135248.588798-1-bhalevy@scylladb.com>
2021-11-14 16:01:31 +02:00
Avi Kivity
c9b8b84411 build: replace yum with dnf
dnf has replaced yum on Fedora and CentOS. On modern versions of Fedora,
you have to install an extra package to get the old name working, so
avoid that inconvenience and use dnf directly.

Closes #9622
2021-11-14 14:41:47 +02:00
Michael Livshin
a7511cf600 system keyspace: record partitions with too many rows
Add "rows" field to system.large_partitions.  Add partitions to the
table when they are too large or have too many rows.

Fixes #9506

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>

Closes #9577
2021-11-14 14:25:18 +02:00
Avi Kivity
98ec98ba36 Update seastar submodule
* seastar 04c6787b35...f8a038a0a2 (1):
  > http: disable Nagle's algorithm for the http server

Fixes #9619.
2021-11-14 13:21:06 +02:00
Avi Kivity
6cb3caaf39 Update seastar submodule
* seastar a189cdc45...04c6787b3 (12):
  > Convert std::result_of to std::invoke_result
  > Merge "IO queue full-duplex mode" from Pavel E
  > Merge "Report bytes/ops for R and W separately" from Pavel E
  > websocket: override std::exception::what() correctly
  > tests: websocket_test: remove unused lambda capture
  > Merge "Improve IO classes preemption" from Pavel E
  > Revert "Merge "Improve IO classes preemption" from Pavel E"
  > Merge "Add skeleton implementation of a WebSocket server" from Piotr S
  > Merge "Improve IO classes preemption" from Pavel E
  > io_queue: Add starvation time metrics (per-class)
  > Revert "Merge "Add skeleton implementation of a WebSocket server" from Piotr S"
  > Merge "Add skeleton implementation of a WebSocket server" from Piotr S
2021-11-13 11:56:28 +02:00
Piotr Sarna
cc544ba117 service: coroutinize client_state.cc
No functional changes, but makes the code shorter and gets rid
of a few allocations.
Coroutinizing has_column_family_access is deliberately skipped and
commented, since some callers expect this function to throw instead
of returning an exceptional future.

Message-Id: <958848a1eeeef490b162d2d2b805c8a14fc9082b.1636704996.git.sarna@scylladb.com>
2021-11-12 21:52:29 +02:00
Tomasz Grabiec
4e3b54d9fe Merge "Teach scylla-gdb.py duplex IO queues" from Pavel Emelyanov
Fresh seastar has duplex IO queues (and some more goodies). The
former one needs respective changes in scylla-gdb.py

* xemul/br-gdb-duplex-ioqueues:
  scylla-gdb: Support new fair_{queue|group}s layout
  scylla-gdb: Add boost::container::small_vector wrapper
  scylla-gdb: Fix indentation aft^w before next patch
2021-11-12 19:43:22 +01:00
Pavel Emelyanov
123286d5cd database: Remove infinite_bound_range_deletion bits
Have been unused for quite a while already

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20211112150837.24125-1-xemul@scylladb.com>
2021-11-12 19:40:17 +01:00
Pavel Emelyanov
5877b84a1a range_streamer: Remove stream_plan from
The streamer creates stream_plan "on demand" and doesn't use the on-board one

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20211112180335.27831-1-xemul@scylladb.com>
2021-11-12 19:38:45 +01:00
Pavel Emelyanov
29892af828 scylla-gdb: Support new fair_{queue|group}s layout
In the recent seastar io_queues carry several fair_queues on board,
so do the io_groups. The queues are in boost small_vector, the groups
are in a vector of unique_ptrs. This patch adds this knowledge to
scylla-gdb script.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-12 16:16:25 +03:00
Pavel Emelyanov
c032794556 scylla-gdb: Add boost::container::small_vector wrapper
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-12 16:15:51 +03:00
Pavel Emelyanov
b321cccaad scylla-gdb: Fix indentation aft^w before next patch
The upcoming seastar update will turn fair_groups and fair_queues
into arrays. Thus scylla-gdb will need to iterate over them with
some sort of loop. This patch makes the queue/group printing indentation
match this future loop body and prepares the loop variables while
at it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-12 16:11:59 +03:00
Gleb Natapov
123ece611b lwt: co-routinize accept_proposal
Message-Id: <20211111163942.121827-4-gleb@scylladb.com>
2021-11-11 22:13:26 +02:00
Gleb Natapov
588768f4af lwt: co-routinize prepare_ballot
Message-Id: <20211111163942.121827-3-gleb@scylladb.com>
2021-11-11 22:13:26 +02:00
Gleb Natapov
61b2e41a23 lwt: co-routinize begin_and_repair_paxos
Message-Id: <20211111163942.121827-2-gleb@scylladb.com>
2021-11-11 22:13:26 +02:00
Avi Kivity
f74b258928 Merge "Add the system.config virtual table (updateable)" from Pavel E
"
Scylla can be configured via a bunch of config files plus
a bunch of commandline options. Collecting these all together
can be challenging.

The proposed table solves a big portion of this by dumping
the db::config contents as a table. For convenience (and,
maybe, to facilitate Benny's CLI) it's possible to update
the 'value' column of the table with a CQL request.

There exists a PR with a table that exports loglevels in a
form of a table. The updating technique used in this set
is applicable to that table as well.

tests: compilation(dev, release, debug), unit(debug)
"

* 'br-db-config-virtual-table-3' of https://github.com/xemul/scylla:
  tests: Unit test for system.config virtual table
  system_keyspace: Table with config options
  code: Push db::config down to virtual tables
  storage_proxy: Propagate virtual table exceptions messages
  table: Virtual writer hook (mutation applier)
  table: Rewrap table::apply()
  table: Mark virtual reader branch with unlikely
  utils: Add config_src::source_name() method
  utils: Ability to set_value(sstring) for an option
  utils: Internal change of config option
  utils: Mark some config_file methods noexcept
2021-11-11 22:13:26 +02:00
Yaron Kaikov
060a91431d dist/docker/debian/build_docker.sh: debian version fix for rc releases
When building a docker image we rely on the `VERSION` value from
`SCYLLA-VERSION-GEN`. For `rc` releases only, there is a difference
between the configured version (X.X.rcX) and the actual debian package
we generate (X.X~rcX).

Using a similar solution as I did in dcb10374a5

Fixes: #9616

Closes #9617
2021-11-11 22:13:26 +02:00
Pavel Emelyanov
e6ef5e7e43 tests: Unit test for system.config virtual table
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Pavel Emelyanov
4a70e0aa57 system_keyspace: Table with config options
A config option value is reported as 'text' type and contains
a string as it would look in the json config.

The table is UPDATE-able. Only the 'value' column can be set
and the value accepted must be a string. It will be converted into
the option type automatically, however in the current implementation
it's not 100% precise -- conversion is lexicographical cast which
only works for simple types. However, liveupdate-able values are
only of those types, so it works in supported cases.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Pavel Emelyanov
947e4c9a10 code: Push db::config down to virtual tables
The db::config reference is available on the database, which
can be obtained from the virtual_table itself. The problem is that
it's a const reference, while system.config will be updateable
and will need a non-const reference.

Adding non-const get_config() on the database looks wrong. The
database shouldn't be used as config provider, even the const
one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Pavel Emelyanov
1ea301ad07 storage_proxy: Propagate virtual table exceptions messages
The intention is to return some meaningful info to the CQL caller
if a virtual table update fails. Unfortunately the "generic" error
reporting in CQL is not extremely flexible, so the best option
seems to be reporting a regular write failure with a custom message in it.

For now this only works for virtual table errors.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Pavel Emelyanov
5aefc48e28 table: Virtual writer hook (mutation applier)
Symmetrically to virtual reader one, add the virtual writer
callback on a table that will be in charge of applying the
provided mutation.

If a virtual table doesn't override this apply method the
dedicated exception is thrown. Next patch will catch it and
propagate back to caller, so it's a new exception type, not
existing/std one.
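
The hook and its dedicated exception can be sketched roughly as below. This is a synchronous simplification (the real apply path returns futures), and every name in it (`virtual_table_update_error`, `mutation_sketch`, `virtual_table_sketch`, `try_apply`) is hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical dedicated exception type, distinct from existing/std ones,
// so the caller can catch exactly this failure.
struct virtual_table_update_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct mutation_sketch { std::string key, value; };

class virtual_table_sketch {
public:
    virtual ~virtual_table_sketch() = default;
    // Default writer hook: tables that do not override it reject updates.
    virtual void apply(const mutation_sketch&) {
        throw virtual_table_update_error("this virtual table is read-only");
    }
};

class writable_table_sketch : public virtual_table_sketch {
public:
    std::string last_value;
    void apply(const mutation_sketch& m) override { last_value = m.value; }
};

// Caller side: catch only the dedicated type and turn it into a message.
inline std::string try_apply(virtual_table_sketch& t, const mutation_sketch& m) {
    try {
        t.apply(m);
        return "ok";
    } catch (const virtual_table_update_error& e) {
        return std::string("write failure: ") + e.what();
    }
}
```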

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Pavel Emelyanov
80460f66fc table: Rewrap table::apply()
The main motivation is to have future returning apply (to be used
by next patches). As a side effect -- indentation fix and private
dirty_memory_region_group() method.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 16:39:34 +03:00
Pavel Emelyanov
c3d15c3e18 table: Mark virtual reader branch with unlikely
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 15:15:05 +03:00
Pavel Emelyanov
b3fee616ea utils: Add config_src::source_name() method
To get a human-readable string from abstract source type.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 15:15:05 +03:00
Pavel Emelyanov
d513034ca4 utils: Ability to set_value(sstring) for an option
There will soon appear an updateable system.config table that
will push sstrings into named_value-s. Prepare for this change
by adding the respective .set_value() call. Since the update
only works for LiveUpdate-able options, and inability to do it
can be propagated back to the caller, make this method return
true/false whether the update took place or not.
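
The contract described here can be sketched as follows. `named_value_sketch` is a hypothetical, string-only stand-in; the real class is templated and converts the string into the option type.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a config option holding a string.
class named_value_sketch {
    std::string _value;
    bool _live_updateable;
public:
    named_value_sketch(std::string initial, bool live_updateable)
        : _value(std::move(initial)), _live_updateable(live_updateable) {}

    // Returns true if the update took place, false if it was refused.
    bool set_value(std::string v) {
        if (!_live_updateable) {
            return false;
        }
        _value = std::move(v);
        return true;
    }

    const std::string& value() const { return _value; }
};
```

Returning a bool rather than throwing lets the caller decide how to report a refused update.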

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 15:15:05 +03:00
Pavel Emelyanov
c226c0a149 utils: Internal change of config option
When a named_value is .set_value()-d the caller may specify the reason
for this change. If not specified it's set to None, but None means
"it was there by default and wasn't changed" so it's a bit of a lie.

Add an explicit Internal reason. It's actually used by the directories
thing that updates all directories according to the --workdir option.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 15:15:05 +03:00
Pavel Emelyanov
2959ebf393 utils: Mark some config_file methods noexcept
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-11 15:15:05 +03:00
Botond Dénes
b58403fb63 Merge "Flatten database drain" from Pavel E
"
Draining the database is now scattered across the do_drain()
method of the storage_service. Also it tells shutdown drain
from API drain.

This set packs this logic into the database::drain() method.

tests: unit(dev), start-stop-drain(dev)
"

* 'br-database-drain' of https://github.com/xemul/scylla:
  database, storage_service: Pack database::drain() method
  storage_service: Shuffle drain sequence
  storage_service, database: Move flush-on-drain code
  storage_service: Remove bool from do_drain
2021-11-11 08:19:35 +02:00
Tomasz Grabiec
a084c8c10f Merge "raft fixes for bugs found by randomized nemesis testing" from Gleb
The series fixes issues:

server may use the wrong configuration after applying a remote snapshot, causing a split-brain situation

assertion in raft::server_impl::notify_waiters()

snapshot transfer to a server removed from the configuration should be aborted

cluster may become stuck when a follower takes a snapshot after an accepted entry that the leader didn't learn about

* scylla-dev/random-test-fixes-v2:
  raft: rename rpc_configuration to configuration in fsm output
  raft: test: test case for the issue #9552
  raft: fix matching of a snapshotted log on a follower
  raft: abort snapshot transfer to a server that was removed from the configuration
  raft: fix race between snapshot application and committing of new entries
  raft: test: add test for correct last configuration index calculation during snapshot application
  raft: do not maintain _last_conf_idx and _prev_conf_idx past snapshot index
  raft: correctly truncate the log in a persistence module during snapshot application
2021-11-10 20:36:53 +01:00
Avi Kivity
d949202615 Update tools/java submodule (PyYAML dependency removal)
* tools/java fd10821045...cb6c1d07a7 (1):
  > dist: remove unneeded dependency to PyYAML
2021-11-10 14:16:01 +02:00
Raphael S. Carvalho
49863ab11c tests: sstable_compaction_test: Fix test compaction_with_fully_expired_table
column_family_for_tests was missing the schema which contained the
gc_grace_seconds used by the test.

Fixes #8872.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20211109163440.75592-1-raphaelsc@scylladb.com>
2021-11-09 19:21:57 +02:00
Michał Radwański
eff392073c memtable: fix gcc function argument evaluation order induced use after move
clang evaluates function arguments from left to right, while gcc does so
in reverse. Therefore, this code can be correct on clang and incorrect
on gcc:
```
f(x.sth(), std::move(x))
```

This patch fixes one such instance of this bug, in memtable.cc.
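
The usual remedy is to read from the object in its own statement before the call that moves it, since the C++ standard leaves the evaluation order of function arguments unspecified. A minimal sketch with hypothetical names (`consume`, `safe_call`), not the actual memtable.cc change:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical consumer that takes a size and then ownership of the buffer.
// Taking `data` by value is what makes the one-line form order-dependent.
inline std::size_t consume(std::size_t size, std::string data) {
    return size + data.size();
}

inline std::size_t safe_call(std::string x) {
    // Read from x first, in a separate statement...
    const std::size_t n = x.size();
    // ...so the move below cannot be evaluated before the read.
    return consume(n, std::move(x));
}
```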

Fixes #9605.

Closes #9606
2021-11-09 19:21:57 +02:00
Avi Kivity
d2e02ea7aa Merge " Abstract table for compaction layer with table_state" from Raphael
"
table_state is being introduced for compaction subsystem, to remove table dependency
from compaction interface, fix layer violations, and also make unit testing
easier as table_state is an abstraction that can be implemented even with no
actual table backing it.

In this series, compaction strategy interfaces are switching to table_state,
and eventually, we'll make compact_sstables() switch to it too. The idea is
that no compaction code will directly reference a table object, but only work
with the abstraction instead. So compaction subdirectory can stop
including database.hh altogether, which is a great step forward.
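
As a rough illustration of the abstraction, here is a sketch with hypothetical names and methods (`table_state_sketch`, `sstable_sizes`, `min_compaction_threshold`, `needs_compaction`); the real table_state interface is not shown in this message and will differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical minimal view of a table as seen by compaction code.
class table_state_sketch {
public:
    virtual ~table_state_sketch() = default;
    virtual const std::vector<std::uint64_t>& sstable_sizes() const = 0;
    virtual std::size_t min_compaction_threshold() const = 0;
};

// A strategy-side function that depends only on the abstraction.
inline bool needs_compaction(const table_state_sketch& t) {
    return t.sstable_sizes().size() >= t.min_compaction_threshold();
}

// Test double with no actual table backing it.
class fake_table_state final : public table_state_sketch {
    std::vector<std::uint64_t> _sizes;
    std::size_t _threshold;
public:
    fake_table_state(std::vector<std::uint64_t> sizes, std::size_t threshold)
        : _sizes(std::move(sizes)), _threshold(threshold) {}
    const std::vector<std::uint64_t>& sstable_sizes() const override { return _sizes; }
    std::size_t min_compaction_threshold() const override { return _threshold; }
};
```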
"

* 'table_state_v5' of https://github.com/raphaelsc/scylla:
  sstable_compaction_test: switch to table_state
  compaction: stop including database.hh for compaction_strategy
  compaction: switch to table_state in estimated_pending_compactions()
  compaction: switch to table_state in compaction_strategy::get_major_compaction_job()
  compaction: switch to table_state in compaction_strategy::get_sstables_for_compaction()
  DTCS: reduce table dependency for task estimation
  LCS: reduce table dependency for task estimation
  table: Implement table_state
  compaction: make table param of get_fully_expired_sstables() const
  compaction_manager: make table param of has_table_ongoing_compaction() const
  Introduce table_state
2021-11-09 19:21:57 +02:00
Pavel Emelyanov
2005b4c330 Merge branch 'move_disable_compaction_to_manager/v6' from Raphael S. Carvalho
Move run_with_compaction_disabled() into compaction manager

run_with_compaction_disabled() living in table is a layer violation as the
logic of disabling compaction for a table T clearly belongs to manager
and table shouldn't be aware of such implementation details.
This makes things less error prone too as there's no longer a need for
coordination between table and manager.
Manager now takes all the responsibility.
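
A synchronous sketch of what manager-owned disabling could look like, assuming a per-table nesting counter; the real function is asynchronous, and `compaction_manager_sketch` and `compaction_enabled` are hypothetical names.

```cpp
#include <map>
#include <string>

class compaction_manager_sketch {
    // Per-table nesting counter: compaction is enabled while it is zero.
    std::map<std::string, int> _disabled;
public:
    bool compaction_enabled(const std::string& table) const {
        auto it = _disabled.find(table);
        return it == _disabled.end() || it->second == 0;
    }

    // Run func with compaction disabled for `table`, re-enabling it afterwards
    // even if func throws.
    template <typename Func>
    auto run_with_compaction_disabled(const std::string& table, Func func) {
        struct guard {
            int& counter;
            explicit guard(int& c) : counter(c) { ++counter; }
            ~guard() { --counter; }
        } g{_disabled[table]};
        return func();
    }
};
```

Because the counter lives in the manager, the table needs no knowledge of how disabling is implemented.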

* 'move_disable_compaction_to_manager/v6' of https://github.com/raphaelsc/scylla:
  compaction: move run_with_compaction_disabled() from table into compaction_manager
  compaction_manager: switch to coroutine in compaction_manager::remove()
  compaction_manager: add struct for per table compaction state
  compaction_manager: wire stop_ongoing_compactions() into remove()
  compaction_manager: introduce stop_ongoing_compactions() for a table
  compaction_manager: prevent compaction from being postponed when stopping tasks
  compaction_manager: extract "stop tasks" from stop_ongoing_compactions() into new function
2021-11-09 19:21:56 +02:00
Pavel Emelyanov
43f6a13a30 database, storage_service: Pack database::drain() method
The storage_service::do_drain() now ends up with shutting down
compaction manager, flushing CFs and shutting down commitlog.
All three belong to the database and deserve being packed into
a single database::drain() method.

A note -- these steps are cross-shard synchronized, but database
already has a barrier for that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-09 19:17:38 +03:00
Pavel Emelyanov
906cac0f86 storage_service: Shuffle drain sequence
Right now the draining sequence is

 - stop transport (protocol servers, gossiper, streaming)
 - shutdown tracing
 - shutdown compaction manager
 - flush CFs
 - drain batchlog manager
 - stop migration manager
 - shutdown commitlog

This violates the layering -- both batchlog and migration managers
are higher-level services than the database, so they should be
shutdown/drained before it, i.e. -- before shutting down compaction
manager and flushing all CFs.
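
The reordering can be written down as data. The "before" list is taken from the message above; the "after" list is an assumption derived from the description (batchlog and migration managers moved ahead of the compaction manager shutdown) and may not match the patch exactly. The function names are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sequence as listed in the message above.
inline std::vector<std::string> drain_sequence_before() {
    return {"stop transport", "shutdown tracing", "shutdown compaction manager",
            "flush CFs", "drain batchlog manager", "stop migration manager",
            "shutdown commitlog"};
}

// Assumed sequence after the shuffle: higher-level services go first.
inline std::vector<std::string> drain_sequence_after() {
    return {"stop transport", "shutdown tracing", "drain batchlog manager",
            "stop migration manager", "shutdown compaction manager",
            "flush CFs", "shutdown commitlog"};
}

// True if step `a` appears strictly before step `b` in `seq`.
inline bool runs_before(const std::vector<std::string>& seq,
                        const std::string& a, const std::string& b) {
    auto ia = std::find(seq.begin(), seq.end(), a);
    auto ib = std::find(seq.begin(), seq.end(), b);
    return ia != seq.end() && ib != seq.end() && ia < ib;
}
```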

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-09 19:13:56 +03:00
Pavel Emelyanov
82509c9e74 storage_service, database: Move flush-on-drain code
Flushing all CFs on shutdown is now fully managed in storage service
and it looks weird. Some better place for it seems to be the database
itself.

Moving the flushing code also implies moving the drain_progress thing
and patching the relevant API call.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-09 19:11:49 +03:00
Pavel Emelyanov
aba475fe1d storage_service: Remove bool from do_drain
The do_drain() today tells shutdown drain from API drain. The reason
is that compaction manager subscribes on the main's abort signal and
drains itself early. Thus, on regular drain it needs this extra kick
that would crash if called from shutdown drain.

This differentiation should sit in the compaction manager itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-09 19:10:13 +03:00
Raphael S. Carvalho
df4bce03ae sstable_compaction_test: switch to table_state
Let's make compaction tests switch to table_state. All disabled ones
can now be reenabled.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-11-09 11:35:45 -03:00
Raphael S. Carvalho
bb5a8682f3 compaction: stop including database.hh for compaction_strategy
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-11-09 11:29:47 -03:00
Raphael S. Carvalho
e2f6a47999 compaction: switch to table_state in estimated_pending_compactions()
Last method in compaction_strategy using table. From now on,
compaction strategy no longer works directly with table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-11-09 11:25:28 -03:00
Raphael S. Carvalho
93ae9225f7 compaction: switch to table_state in compaction_strategy::get_major_compaction_job()
From now on, get_major_compaction_job() will use table_state instead of
a plain reference to table.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-11-09 11:25:22 -03:00
Raphael S. Carvalho
d881310b52 compaction: switch to table_state in compaction_strategy::get_sstables_for_compaction()
From now on, get_sstables_for_compaction() will use table_state.
With table_state, we avoid layer violations like strategy using
manager and also make testing easier.

Compaction unit tests were temporarily disabled to avoid a giant
commit which is hard to parse.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-11-09 10:52:14 -03:00