Commit Graph

33608 Commits

Author SHA1 Message Date
Anna Stuchlik
d1f7cc99bc doc: fix the external links to the ScyllaDB University lesson about TTL
Closes #11876
2022-11-02 15:05:43 +02:00
Nadav Har'El
59fa8fe903 Merge 'doc: add the information about AArch64 support to Requirements' from Anna Stuchlik
Fix https://github.com/scylladb/scylla-doc-issues/issues/864

This PR:
- updates the introduction to add information about AArch64 and rewrite the content.
- replaces "Scylla" with "ScyllaDB".

Closes #11778

* github.com:scylladb/scylladb:
  Update docs/getting-started/system-requirements.rst
  doc: fix the link to the OS Support page
  doc: replace Scylla with ScyllaDB
  doc: update the info about supported architecture and rewrite the introduction
2022-11-02 11:18:20 +02:00
Anna Stuchlik
ea799ad8fd Update docs/getting-started/system-requirements.rst
Co-authored-by: Tzach Livyatan <tzach.livyatan@gmail.com>
2022-11-02 09:56:56 +01:00
guy9
097a65df9f adding top banner to the Docs website with a link to the ScyllaDB University fall LIVE event
Closes #11873
2022-11-02 10:20:40 +02:00
Nadav Har'El
b9d88a3601 cql/pytest: add reproducer for timestamp column validation issue
This patch adds a reproducing test for issue #11588, which is still open
so the test is expected to fail on Scylla ("xfail), and passes on Cassandra.

The test shows that Scylla allows an out-of-range value to be written to
timestamp column, but then it can't be read back.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11864
2022-11-01 08:11:01 +02:00
Botond Dénes
dc46bfa783 Merge 'Prepare repair for task manager integration' from Aleksandra Martyniuk
The PR prepares repair for task manager integration:
- Creates repair_module
- Keeps repair_module in repair_service
- Moves tracker methods to repair_module
- Changes UUID to task_id in repair module

Closes #11851

* github.com:scylladb/scylladb:
  repair: check shutdown with abort source in repair module
  repair: use generic module gate for repair module operations
  repair: move tracker to repair module
  repair: move next_repair_command to repair_module
  repair: generate repair id in repair module
  repair: keep shard number in repair_uniq_id
  repair: change UUID to task_id
  repair: add task_manager::module to repair_service
  repair: create repair module and task
2022-11-01 08:05:14 +02:00
Aleksandra Martyniuk
f2fe586f03 repair: check shutdown with abort source in repair module
In repair module the shutdown can be checked using abort_source.
Thus, we can get rid of shutdown flag.
2022-10-31 10:57:29 +01:00
Aleksandra Martyniuk
2d878cc9b5 repair: use generic module gate for repair module operations
Repair module uses a gate to prevent starting new tasks on shutdown.
Generic module's gate serves the same purpose, thus we can
use it also in repair specific context.
2022-10-31 10:56:36 +01:00
Aleksandra Martyniuk
4aae7e9026 repair: move tracker to repair module
Since both tracker and repair_module serve similar purpose,
it is confusing where we should seek for methods connected to them.
Thus, to make it more transparent, tracker class is deleted and all
its attributes and methods are moved to repair_module.
2022-10-31 10:55:36 +01:00
Aleksandra Martyniuk
a5c05dcb60 repair: move next_repair_command to repair_module
Number of the repair operation was counted both with
next_repair_command from tracer and sequence number
from task_manager::module.

To get rid of redundancy next_repair_command was deleted and all
methods using its value were moved to repair_module.
2022-10-31 10:54:39 +01:00
Aleksandra Martyniuk
c81260fb8b repair: generate repair id in repair module
repair_uniq_id for repair task can be generated in repair module
and accessed from the task.
2022-10-31 10:54:24 +01:00
Aleksandra Martyniuk
6432a26ccf repair: keep shard number in repair_uniq_id
Execution shard is one of the traits specific to repair tasks.
Child task should freely access shard id of its parent. Thus,
the shard number is kept in a repair_uniq_id struct.
2022-10-31 10:41:17 +01:00
guy9
276ec377c0 removed broken roadmap link
Closes #11854
2022-10-31 11:33:03 +02:00
Aleksandra Martyniuk
e2c7c1495d repair: change UUID to task_id
Change type of repair id from utils::UUID to task_id to distinguish
them from ids of other entities.
2022-10-31 10:07:08 +01:00
Aleksandra Martyniuk
dc80af33bc repair: add task_manager::module to repair_service
repair_service keeps a shared pointer to repair_module.
2022-10-31 10:04:50 +01:00
Aleksandra Martyniuk
576277384a repair: create repair module and task
Create repair_task_impl and repair_module inheriting from respectively
task manager task_impl and module to integrate repair operations with
task manager.
2022-10-31 10:04:48 +01:00
Takuya ASADA
159bc7c7ea install-dependencies.sh: use binary distributions of PIP package
We currently avoid compiling C code in "pip3 install scylla-driver", but
we actually providing portable binary distributions of the package,
so we should use it by "pip3 install --only-binary=:all: scylla-driver".
The binary distribution contains dependency libraries, so we won't have
problem loading it on relocatable python3.

Closes #11852
2022-10-31 10:38:36 +02:00
Botond Dénes
139fbb466e Merge 'Task manager extension' from Aleksandra Martyniuk
The PR adds changes to task manager that allow more convenient integration with modules.

Introduced changes:
- adds internal flag in task::impl that allows user to filter too specific tasks
- renames `parent_data` to more appropriate name `task_info`
- creates `tasks/types.hh` which allows using some types connected with task manager without the necessity to include whole task manager
- adds more flexible version of `make_task` method

Closes #11821

* github.com:scylladb/scylladb:
  tasks: add alternative make_task method
  tasks: rename parent_data to task_info and move it
  tasks: move task_id to tasks/types.hh
  tasks: add internal flag for task_manager::task::impl
2022-10-31 09:57:10 +02:00
Botond Dénes
2c021affd1 Merge 'storage_service, repair: use per-shard abort_source' from Benny Halevy
Prevent copying shared_ptr across shards
in do_sync_data_using_repair by allocating
a shared_ptr<abort_source> per shard in
node_ops_meta_data and respectively in node_ops_info.

Fixes #11826

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11827

* github.com:scylladb/scylladb:
  repair: use sharded abort_source to abort repair_info
  repair: node_ops_info: add start and stop methods
  storage_service: node_ops_abort_thread: abort all node ops on shutdown
  storage_service: node_ops_abort_thread: co_return only after printing log message
  storage_service: node_ops_meta_data: add start and stop methods
  repair: node_ops_info: prevent accidental copy
2022-10-31 09:43:34 +02:00
Tenghuan He
e0948ba199 Add directory change instruction
Add directory change instruction while building scylla

Closes #11717
2022-10-30 23:53:02 +02:00
Pavel Emelyanov
477e0c967a scylla-gdb: Evaluate LSA object sizes dynamically
The lsa-segment command tries to walk LSA segment objects by decoding
their descriptors and (!) object sizes as well. Some objects in LSA have
dynamic sizes, i.e. those depending on the object contents. The script
tries to drill down the object internals to get this size, but bad news
is that nowadays there are many dynamic objects that are not covered.
Once stepped upon unsupported object, scylla-gdb likely stops because
the "next" descriptor happens to be in the middle of the object and its
parsing throws.

This patch fixes this by taking advantage of the virtual size() call of
the migrate_fn_type all LSA objects are linked with (indirectly). It
gets the migrator object, the LSA object itself and calls

  ((migrate_fn_type*)<migrator_ptr>)->size((const void*)<object_ptr>)

with gdb. The evaluated value is the live dynamic size of the object.

fixes: #11792
refs: #2455

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11847
2022-10-28 14:11:30 +03:00
Botond Dénes
74c9aa3a3f Merge 'removenode: allow specifying nodes to ignore using host_id' from Benny Halevy
Currently, when specifying nodes to ignore for replace or removenode,
we support specifying them only using their ip address.

As discussed in https://github.com/scylladb/scylladb/issues/11839 for removenode,
we intentionally require the host uuid for specifying the node to remove,
so the nodes to ignore (that are also done, otherwise we need not ignore them),
should be consistent with that and be specified using their host_id.

The series extends the apis and allows either the nodes ip address or their host_id
to be specified, for backward compatibility.

We should deprecate the ip address method over time and convert the tests and management
software to use the ignored nodes' host_id:s instead.

Closes #11841

* github.com:scylladb/scylladb:
  api: doc: remove_node: improve summary
  api, service: storage_service: removenode: allow passing ignore_nodes as uuid:s
  storage_service: get_ignore_dead_nodes_for_replace: use tm.parse_host_id_and_endpoint
  locator: token_metadata: add parse_host_id_and_endpoint
  api: storage_service: remove_node: validate host_id
2022-10-28 13:35:04 +03:00
Benny Halevy
335a8cc362 api: doc: remove_node: improve summary
The current summary of the operation is obscure.
It refers to a token in the ring and the endpoint associated with it,
while the operation uses a host_id to identify a whole node.

Instead, clarify the summary to refer to a node in the cluster,
consistent with the description for the host_id parameter.
Also, describe the effect the call has on the data the removed node
logically owned.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-28 07:52:37 +03:00
Benny Halevy
9ef2631ec2 api, service: storage_service: removenode: allow passing ignore_nodes as uuid:s
Currently the api is inconsistent: requiring a uuid for the
host_id of the node to be removed, while the ignored nodes list
is given as comma-separated ip addresses.

Instead, support identifying the ignored_nodes either
by their host_id (uuid) or ip address.

Also, require all ignore_nodes to be of the same kind:
either UUIDs or ip addresses, as a mix of the 2 is likely
indicating a user error.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-28 07:49:03 +03:00
Benny Halevy
40cd685371 storage_service: get_ignore_dead_nodes_for_replace: use tm.parse_host_id_and_endpoint
Allow specifying the dead node to ignore either as host_id
or ip address.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-28 07:38:13 +03:00
Benny Halevy
b74807cb8a locator: token_metadata: add parse_host_id_and_endpoint
To be used for specifying nodes either by their
host_id or ip address and using the token_metadata
to resolve the mapping.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-28 07:38:13 +03:00
Benny Halevy
340a5a0c94 api: storage_service: remove_node: validate host_id
The node to be removed must be identified by its host_id.
Validate that at the api layer and pass the parsed host_id
down to storage_service::removenode.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-28 07:38:13 +03:00
Takuya ASADA
464b5de99b scylla_setup: allow symlink to --disks option
Currently, --disks options does not allow symlinks such as
/dev/disk/by-uuid/* or /dev/disk/azure/*.

To allow using them, is_unused_disk() should resolve symlink to
realpath, before evaluating the disk path.

Fixes #11634

Closes #11646
2022-10-28 07:24:11 +03:00
Botond Dénes
b744036840 Merge 'scylla_util.py: on sysconfig_parser, don't use double quote when it's possible' from Takuya ASADA
It seems like distribution original sysconfig files does not use double
quote to set the parameter when the value does not contain space.
Adding function to detect spaces in the value, don't usedouble quote
when it not detected.

Fixes #9149

Closes #9153

* github.com:scylladb/scylladb:
  scylla_util.py: adding unescape for sysconfig_parser
  scylla_util.py: on sysconfig_parser, don't use double quote when it's possible
2022-10-28 07:19:13 +03:00
Benny Halevy
44e1058f63 docs: nodetool/removenode: fix host_id in examples
removenode host_id must specify the host ID as a UUID,
not an ip address.

Fixes #11839

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11840
2022-10-27 14:29:36 +03:00
Benny Halevy
0ea8250e83 repair: use sharded abort_source to abort repair_info
Currently we use a single shared_ptr<abort_source>
that can't be copied across shards.

Instead, use a sharded<abort_source> in node_ops_info so that each
repair_info instance will use an (optional) abort_source*
on its own shard.

Added respective start and stop methodsm plus a local_abort_source
getter to get the shard-local abort_source (if available).

Fixes #11826

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-27 12:18:30 +03:00
Benny Halevy
88f993e5ed repair: node_ops_info: add start and stop methods
Prepare for adding a sharded<abort_source> member.

Wire start/stop in storage_service::node_ops_meta_data.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-27 12:18:30 +03:00
Benny Halevy
c2f384093d storage_service: node_ops_abort_thread: abort all node ops on shutdown
A later patch adds a sharded<abort_source> to node_ops_info.
On shutdown, we must orderly stop it, so use node_ops_abort_thread
shutdown path (where node_ops_singal_abort is called will a nullopt)
to abort (and stop) all outstanding node_ops by passing
a null_uuid to node_ops_abort, and let it iterate over all
node ops to abort and stop them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-27 12:14:06 +03:00
Benny Halevy
0efd290378 storage_service: node_ops_abort_thread: co_return only after printing log message
Currently the function co_returns if (!uuid_opt)
so the log info message indicating it's stopped
is not printed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-27 12:14:03 +03:00
Benny Halevy
47e4761b4e storage_service: node_ops_meta_data: add start and stop methods
Prepare for starting and stopping repair node_ops_info

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-27 12:14:03 +03:00
Benny Halevy
5c25066ea7 repair: node_ops_info: prevent accidental copy
Delete node_ops_info copy and move constructors before
we add a sharded<abort_source> member for the per-shard repairs
in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-10-27 12:14:03 +03:00
Takuya ASADA
cd6030d5df scylla_util.py: adding unescape for sysconfig_parser
Even we have __escape() for escaping " middle of the value to writing
sysconfig file, we didn't unescape for reading from sysconfig file.
So adding __unescape() and call it on get().
2022-10-27 16:39:47 +09:00
Takuya ASADA
de57433bcf scylla_util.py: on sysconfig_parser, don't use double quote when it's possible
It seems like distribution original sysconfig files does not use double
quote to set the parameter when the value does not contain space.
Adding function to detect spaces in the value, don't usedouble quote
when it not detected.

Fixes #9149
2022-10-27 16:36:27 +09:00
Aleksandra Martyniuk
6494de9bb0 tasks: add alternative make_task method
Task manager tasks should be created with make_task method since
it properly sets information about child-parent relationship
between tasks. Though, sometimes we may want to keep additional
task data in classes inheriting from task_manager::task::impl.
Doing it with existing make_task method makes it impossible since
implementation objects are created internally.

The commit adds a new make_task that allows to provide a task
implementation pointer created by caller. All the fields except
for the one connected with children and parent should be set before.
2022-10-26 14:01:05 +02:00
Aleksandra Martyniuk
10d11a7baf tasks: rename parent_data to task_info and move it
parent_data struct contains info that is common	for each task,
not only in parent-child relationship context. To use it this way
without confusion, its name is changed to task_info.

In order to be able to widely and comfortably use task_info,
it is moved from tasks/task_manager.hh to tasks/types.hh
and slightly extended.
2022-10-26 14:01:05 +02:00
Aleksandra Martyniuk
9ecc2047ac tasks: move task_id to tasks/types.hh 2022-10-26 14:01:05 +02:00
Aleksandra Martyniuk
e2e8a286cc tasks: add internal flag for task_manager::task::impl
It is convenient to create many different tasks implementations
representing more and more specific parts of the operation in
a module. Presenting all of them through the api makes it cumbersome
for user to navigate and track, though.

Flag internal is added to task_manager::task::impl so that the tasks
could be filtered before they are sent to user.
2022-10-26 14:01:05 +02:00
Pavel Emelyanov
e245780d56 gossiper: Request topology states in shadow round
When doing shadow round for replacement the bootstrapping node needs to
know the dc/rack info about the node it replaces to configure it on
topology. This topology info is later used by e.g. repair service.

fixes: #11829

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11838
2022-10-25 13:21:20 +03:00
Pavel Emelyanov
64c9359443 storage_proxy: Don't use default-initialized endpoint in get_read_executor()
After calling filter_for_query() the extra_replica to speculate to may
be left default-initialized which is :0 ipv6 address. Later below this
address is used as-is to check if it belongs to the same DC or not which
is not nice, as :0 is not an address of any existing endpoint.

Recent move of dc/rack data onto topology made this place reveal itself
by emitting the internal error due to :0 not being present on the
topology's collection of endpoints. Prior to this move the dc filter
would count :0 as belonging to "default_dc" datacenter which may or may
not match with the dc of the local node.

The fix is to explicitly tell set extra_replica from unset one.

fixes: #11825

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11833
2022-10-25 09:16:50 +03:00
Takuya ASADA
1a11a38add unified: move unified package contents to sub-directory
On most of the software distribution tar.gz, it has sub-directory to contain
everything, to prevent extract contents to current directory.
We should follow this style on our unified package too.

To do this we need to increment relocatable package version to '3.0'.

Fixes #8349

Closes #8867
2022-10-25 08:58:15 +03:00
Takuya ASADA
a938b009ca scylla_raid_setup: run uuidpath existance check only after mount failed
We added UUID device file existance check on #11399, we expect UUID
device file is created before checking, and we wait for the creation by
"udevadm settle" after "mkfs.xfs".

However, we actually getting error which says UUID device file missing,
it probably means "udevadm settle" doesn't guarantee the device file created,
on some condition.

To avoid the error, use var-lib-scylla.mount to wait for UUID device
file is ready, and run the file existance check when the service is
failed.

Fixes #11617

Closes #11666
2022-10-25 08:54:21 +03:00
Yaniv Kaul
cec21d10ed docs: Fix typo (patch -> batch)
See subject.

Closes #11837
2022-10-25 08:50:44 +03:00
Tomasz Grabiec
687df05e28 db: make_forwardable::reader: Do not emit range_tombstone_change with position past the range
Since the end bound is exclusive, the end position should be
before_key(), not after_key().

Affects only tests, as far as I know, only there we can get an end
bound which is a clustering row position.

Would cause failures once row cache is switched to v2 representation
because of violated assumptions about positions.

Introduced in 76ee3f029c

Closes #11823
2022-10-24 17:06:52 +03:00
Avi Kivity
9e34779c53 Update seastar submodule
* seastar 601e0776c0...f32ed00954 (28):
  > Merge 'treewide: more fmt 9 adjustments' from Avi Kivity
  > rpc: Remove nested class friend declaration from connection
  > reactor: advance the head pointer in batch
  > Add git submodule instructions to HACKING.md, resolves #541
  > dns: Handle TCP mode connect failure
  > future: s/make_exception_ptr/std::make_exception_ptr/
  > reactor: implement read_some(fd, buffer, len) in io_uring
  > reactor: remove unneeded "protected"
  > Merge 'reactor: support more network ops in io_uring backend' from Kefu Chai
  > reactor: Indentation fix after previous patch
  > io: Remove --max-io-requests concept
  > future: add concept constraints to handle_exception()
  > future: improve the doxygen document
  > aio_general_context: flush: provide 1 second grace for retries
  > reactor: destroy_scheduling_group: make sure scheduling_group is valid
  > reactor: pass a plain pointer to io_uring_wait_cqes()
  > gate: add move ctor and move assignment operator for gate
  > reactor: drop stale comment
  > reactor_config: update stale doc comments
  > test: alloc_test: Actually prevent dead allocation elimination
  > util/closeable: hold _obj with reference_wrapper<>
  > memory: Fix off-by-one in large allocation detection
  > util/closeable: add move ctor for deferred_stop
  > reactor: Remove some unused friend declarations
  > core/sharded.hh: tweak on comment for better readability
  > Merge 'fmt 9 ostream fix' from longlene
  > program_options: allow configure switch-stytle option programmatically
  > inet_address: Add helper to check for address being lo/any

Closes #11814
2022-10-21 21:30:07 +03:00
Botond Dénes
4aa0b16852 Merge 'distributed_loader: detect highest generation before populating column families' from Benny Halevy
We should scan all sstables in the table directory and its
subdirectories to determine the highest sstable version and generation
before using it for creating new sstables (via reshard or reshape).

Otherwise, the generations of new sstables created when populating staging (via reshard or reshape) may collide with generations in the base directory, leading to https://github.com/scylladb/scylladb/issues/11789

Refs scylladb/scylladb#11789
Fixes scylladb/scylladb#11793

Closes #11795

* github.com:scylladb/scylladb:
  distributed_loader: populate_column_family: reindent
  distributed_loader: coroutinize populate_column_family
  distributed_loader: table_population_metadata: start: reindent
  distributed_loader: table_population_metadata: coroutinize start_subdir
  distributed_loader: table_population_metadata: start_subdir: reindent
  distributed_loader: pre-load all sstables metadata for table before populating it
2022-10-21 14:07:51 +03:00