Commit Graph

34625 Commits

Kamil Braun
2bfe85ce9b db: system_keyspace: improve system.raft_snapshot_config schema
Remove the `ip_addr` column, which was unused. IP addresses are no longer
part of the Raft configuration and they can change dynamically.

Swap the `server_id` and `disposition` columns in the clustering key, so
when querying the configuration, we first obtain all servers with the
current disposition and then all servers with the previous disposition
(note that a server may appear both in current and previous).
2023-01-17 12:28:00 +01:00
Kamil Braun
c3ed82e5fb service: storage_service: better error handling in decommission
Improve the error handling in `decommission` in case `leave_group0`
fails, informing the user what they should do (i.e. call `removenode` to
get rid of the group 0 member), and allowing decommission to finish; it
does not make sense to let the node continue to run after it leaves the
token ring. (And I'm guessing it's also not safe. Or maybe impossible.)
2023-01-17 12:28:00 +01:00
Kamil Braun
beb0eee007 service: storage_service: fix indentation in removenode 2023-01-17 12:28:00 +01:00
Kamil Braun
aba33dd352 service: storage_service: make removenode work for group 0 members which are not token ring members
Due to failures we might end up in a situation where we have a group 0
member which is not a token ring member: a decommission/removenode
that failed after leaving/removing the node from the token ring but
before leaving/removing it from group 0.

There was no way to get rid of such a group 0 member. A node that left
the token ring must not be allowed to run further (or it can cause data
loss, data resurrection and maybe other fun stuff), so we can't run
decommission a second time (even if we tried, it would just say that
"we're not a member of the token ring" and abort). And `removenode`
would also not work, because it proceeds only if the node requested to
be removed is a member of the token ring.

We modify `removenode` so it can run in this situation and remove the
group 0 member. The parts of `removenode` related to token ring
modification are now conditioned on whether the node was a member of the
token ring. The final `remove_from_group0` step is in its own branch. Some
minor refactors were necessary. Some log messages were also modified so
it's easier to understand which messages correspond to the "token movement"
part of the procedure.

The `make_nonvoter` step happens only if token ring removal happens,
otherwise we can skip directly to `remove_from_group0`.

We also move `remove_from_group0` outside the "try...catch",
fixing #11723. The "node ops" part of the procedure is related strictly
to token ring movement, so it makes sense for `remove_from_group0` to
happen outside.

Indentation is broken in this commit for easier reviewability, fixed in
the following commit.

Fixes: #11723
2023-01-17 12:28:00 +01:00
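
For illustration, a minimal sketch of the resulting control flow, with
hypothetical stand-ins for the real (asynchronous) storage_service helpers;
it only shows how the token ring part becomes conditional on ring membership
and how `remove_from_group0` ends up outside the try/catch:

    #include <exception>
    #include <iostream>

    // Hypothetical stand-ins for the real storage_service helpers.
    bool is_token_ring_member(int host) { return false; }   // node already left the ring...
    bool is_group0_member(int host)     { return true; }    // ...but is still a group 0 member
    void make_nonvoter(int host)          { std::cout << "make non-voter\n"; }
    void remove_from_token_ring(int host) { std::cout << "remove tokens, stream data\n"; }
    void remove_from_group0(int host)     { std::cout << "remove from group 0\n"; }

    // Sketch of the reworked removenode: the "node ops" / token ring part is
    // conditional and wrapped in try/catch, while remove_from_group0 runs outside it.
    void removenode(int host) {
        if (is_token_ring_member(host)) {
            try {
                make_nonvoter(host);            // reduce the impact of a later failure
                remove_from_token_ring(host);   // streaming/repair + token removal
            } catch (const std::exception& e) {
                std::cerr << "token ring removal failed: " << e.what() << "\n";
                throw;
            }
        }
        if (is_group0_member(host)) {
            remove_from_group0(host);           // also works for group-0-only members
        }
    }

    int main() { removenode(42); }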
Kamil Braun
ec2cd29e42 service/raft: raft_group0: perform read_barrier in wait_for_raft
Right now `wait_for_raft` is called before performing group 0
configuration changes. We also want to call it before checking for
membership; for that it's desirable to have the most recent information,
hence the `read_barrier` call. In the existing use cases it's not strictly
necessary, but it doesn't hurt.
2023-01-17 12:28:00 +01:00
Kamil Braun
db734cd74f service: storage_service: make leaving node a non-voter before removing it from group 0 in decommission/removenode
removenode currently works roughly like this:
1. stream/repair data so it ends up on new replica sets (calculated
   without the node we want to remove)
2. remove the node from the token ring
3. remove the node from group 0 configuration.

If the procedure fails after step 2 but before step 3 finishes,
we're in trouble: the cluster is left with an additional voting group 0
member, which reduces group 0's availability, and there is no way to
remove this member because `removenode` no longer considers it to be
part of the cluster (it consults the token ring to decide).

Improve this failure scenario by including a new step at the beginning:
make the node a non-voter in group 0 configuration. Then, even if we
fail after removing the node from the token ring but before removing it
from group 0, we'll only be left with a non-voter which doesn't reduce
availability.

We make a similar change for `decommission`: between `unbootstrap()` (which
streams data) and `leave_ring()` (which removes our tokens from the
ring), become a non-voter. The difference here is that we don't become a
non-voter at the beginning, but only after streaming/repair. In
`removenode` it's desirable to make the node a non-voter as soon as
possible because it's already dead. In decommission it may be desirable
for us to remain a voter if we fail during streaming because we're still
alive and functional in that case.

In a later commit we'll also make it possible to retry `removenode` to
remove a node that is only a group 0 member and not a token ring member.
2023-01-17 12:28:00 +01:00
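
A minimal sketch of the new step ordering in `decommission` (hypothetical
function names; the real code is asynchronous and lives in storage_service):

    #include <iostream>

    // Hypothetical stand-ins for the real decommission steps.
    void unbootstrap()     { std::cout << "stream/repair data to new replicas\n"; }
    void become_nonvoter() { std::cout << "become a group 0 non-voter\n"; }
    void leave_ring()      { std::cout << "remove own tokens from the ring\n"; }
    void leave_group0()    { std::cout << "leave the group 0 configuration\n"; }

    // Sketch of the ordering: become a non-voter only after streaming succeeds,
    // but before tokens are removed, so a failure between leave_ring() and
    // leave_group0() leaves behind only a non-voting group 0 member.
    void decommission() {
        unbootstrap();
        become_nonvoter();
        leave_ring();
        leave_group0();
    }

    int main() { decommission(); }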
Kamil Braun
1eee349a17 test: test_raft_upgrade: remove test_raft_upgrade_with_node_remove
The test would create a scenario where one node was down while the others
started the Raft upgrade procedure. The procedure would get stuck, but
it was possible to `removenode` the downed node using one of the alive
nodes, which would unblock the Raft upgrade procedure.

This worked because:
1. the upgrade procedure starts by ensuring that all peers can be
   contacted,
2. `removenode` starts by removing the node from the token ring.

After removing the node from the token ring, the upgrade procedure
becomes able to contact all peers (the peers set no longer contains the
down node). At the end, after removing the node from the token ring,
`removenode` would actually get stuck for a while, waiting for the
upgrade procedure to finish before removing the peer from group 0.
After the upgrade procedure finished, `removenode` would also finish.
(so: first the upgrade procedure waited for removenode, then removenode
waited for the upgrade procedure).

We want to modify the `removenode` procedure and include a new step
before removing the node from the token ring: making the node a
non-voter. The purpose is to improve the possible failure scenarios.
Previously, if the `removenode` procedure failed after removing the node
from the token ring but before removing it from group 0, the cluster
would contain a 'garbage' group 0 member which is a voter - reducing
group 0's availability. If the node is made a non-voter first, then this
failure will not be as big of a problem, because the leftover group 0
member will be a non-voter.

However, to correctly perform group 0 operations including making
someone a nonvoter, we must first wait for the Raft upgrade procedure to
finish (or at least wait until everyone joins group 0). Therefore by
including this 'make the node a non-voter' step at the beginning of
`removenode`, we make it impossible to remove a token ring member in the
middle of the upgrade procedure, on which the test case relied. The test
case would get stuck waiting for the `removenode` operation to finish,
which would never finish because it would wait for the upgrade procedure
to finish, which would not finish because of the dead peer.

We remove the test case; it was "lucky" to pass in the first place. We
have a dedicated mechanism for handling dead peers during the Raft upgrade
procedure: the manual Raft group 0 RECOVERY procedure. There are other
test cases in this file which use that procedure.
2023-01-17 12:28:00 +01:00
Kamil Braun
4f0801406e service/raft: raft_group0: link to Raft docs where appropriate
Resolve some TODOs.
2023-01-17 12:28:00 +01:00
Kamil Braun
2befbaa341 service/raft: raft_group0: more logging
Make the logs in leave_group0 consistent with logs in
remove_from_group0.
2023-01-17 12:28:00 +01:00
Kamil Braun
77dc1c4c70 service/raft: raft_group0: separate function for checking and waiting for Raft
leave_group0 and remove_from_group0 functions both start with the
following steps:
- if Raft is disabled or in RECOVERY mode, print a simple log message
  and abort
- if Raft cluster feature flag is not yet enabled, print a complex log
  message and abort
- wait for Raft upgrade procedure to finish
- then perform the actual group 0 reconfiguration.

Refactor these preparation steps to a separate function,
`wait_for_raft`. This reduces code duplication; the function will also
be used in more operations later (becoming a nonvoter or turning another
server into a nonvoter).

We also change the API so that the preparation function is called by the
caller before they call the reconfiguration function. This is because in
later commits, some of the call sites (mainly `removenode`) will want to
check explicitly whether Raft is enabled and wait for Raft's availability,
then perform a sequence of steps related to group 0 configuration
depending on the result.

Also add a private function `raft_upgrade_complete()` which we use to
assert that Raft is ready to be used.
2023-01-17 12:27:58 +01:00
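
A rough sketch of the factored-out preparation step, with hypothetical names
for the state checks described above (not the actual raft_group0 code):

    #include <iostream>
    #include <stdexcept>

    // Hypothetical stand-ins for the real raft_group0 state checks.
    enum class raft_mode { disabled, recovery, normal };
    raft_mode mode()                   { return raft_mode::normal; }
    bool raft_cluster_feature_on()     { return true; }
    bool raft_upgrade_complete()       { return true; }
    void wait_until_upgrade_complete() { /* wait until every node joined group 0 */ }

    // Shared preparation step used by leave_group0 and remove_from_group0;
    // callers invoke it before any group 0 reconfiguration.
    bool wait_for_raft() {
        if (mode() == raft_mode::disabled || mode() == raft_mode::recovery) {
            std::cout << "Raft disabled or in RECOVERY mode, skipping group 0 step\n";
            return false;
        }
        if (!raft_cluster_feature_on()) {
            std::cout << "Raft cluster feature not yet enabled everywhere, skipping\n";
            return false;
        }
        wait_until_upgrade_complete();   // wait for the Raft upgrade procedure
        if (!raft_upgrade_complete()) {  // sanity check before reconfiguration
            throw std::runtime_error("Raft upgrade did not complete");
        }
        return true;
    }

    int main() {
        if (wait_for_raft()) {
            std::cout << "ready for group 0 reconfiguration\n";
        }
    }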
Gleb Natapov
15ebd59071 lwt: upgrade stored mutations to the latest schema during prepare
Currently they are upgraded during learn on a replica. There are two
problems with this. First, the column mapping may not exist on a replica
if it missed this particular schema (because it was down, for instance)
and the mapping history is not part of the schema. In this case "Failed
to look up column mapping for schema version" will be thrown. Second, the
LWT request coordinator may not have the schema for the mutation either
(because it was freed from the registry already), and when a replica
tries to retrieve the schema from the coordinator the retrieval will fail,
causing the whole request to fail with "Schema version XXXX not found".

Both of those problems can be fixed by upgrading stored mutations
during prepare, on the node they are stored at. To upgrade a mutation its
column mapping is needed, and it is guaranteed to be present at the node
the mutation is stored at, since having the corresponding schema available
is a prerequisite for storing it. After that the mutation is processed
using the latest schema, which will be available on all nodes.

Fixes #10770

Message-Id: <Y7/ifraPJghCWTsq@scylladb.com>
2023-01-17 11:14:46 +01:00
Raphael S. Carvalho
f2f839b9cc compaction: LCS: don't reshape all levels if only a single breaks disjointness
LCS reshape compacts all levels if a single one breaks
disjointness. That's unnecessary work, because rewriting that single
level is enough to restore disjointness. If multiple levels break
disjointness, each will be reshaped in its own iteration,
reducing per-step operation time and disk space requirements,
as input files can be released incrementally.
Incremental compaction is not applied to reshape yet, so we need to
avoid a "major compaction" to avoid the space overhead.
But space overhead is not the only problem; the inefficiency of
deciding what to reshape when overlap is detected also motivated
this patch.

Fixes #12495.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #12496
2023-01-17 09:55:15 +02:00
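
The idea can be illustrated with a simplified, hypothetical model of the
level-selection logic (not the actual compaction code):

    #include <algorithm>
    #include <iostream>
    #include <optional>
    #include <string>
    #include <vector>

    // Simplified stand-in for an sstable: just a key range per file.
    struct sstable { std::string first_key, last_key; };

    // A level is disjoint if, after sorting by first key, no sstable
    // overlaps the previous one.
    bool level_is_disjoint(std::vector<sstable> level) {
        std::sort(level.begin(), level.end(),
                  [](const sstable& a, const sstable& b) { return a.first_key < b.first_key; });
        for (size_t i = 1; i < level.size(); ++i) {
            if (level[i].first_key <= level[i - 1].last_key) {
                return false;
            }
        }
        return true;
    }

    // Sketch of the reshape decision: instead of reshaping every level,
    // pick only the first level (L1 and above) that breaks disjointness;
    // remaining levels are handled in later reshape iterations.
    std::optional<size_t> pick_level_to_reshape(const std::vector<std::vector<sstable>>& levels) {
        for (size_t l = 1; l < levels.size(); ++l) {
            if (!level_is_disjoint(levels[l])) {
                return l;
            }
        }
        return std::nullopt;
    }

    int main() {
        std::vector<std::vector<sstable>> levels = {
            {},                               // L0 is allowed to overlap
            {{"a", "f"}, {"g", "m"}},         // L1: disjoint
            {{"a", "k"}, {"h", "z"}},         // L2: overlapping -> reshape only this one
        };
        if (auto l = pick_level_to_reshape(levels)) {
            std::cout << "reshape level " << *l << "\n";
        }
    }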
Michał Chojnowski
9e17564c70 types: add some missing explicit instantiations
Some functions defined by a template in types.cc are used in other
translation units (via `cql3/untyped_result_set.hh`), but aren't
explicitly instantiated. Therefore their linking can fail, depending
on inlining decisions. (I experienced this when playing with compiler
options).
Fix that.

Closes #12539
2023-01-17 10:46:01 +02:00
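
A generic illustration of the pattern (not the actual functions in types.cc):
the template definition lives in one translation unit and must be explicitly
instantiated there for every type used by other translation units, otherwise
linking depends on whether the compiler happened to inline the calls:

    #include <iostream>
    #include <string>
    #include <type_traits>

    // --- "types.hh": declaration only, plus a promise that instantiations exist ---
    template <typename T>
    T parse_value(const std::string& s);

    extern template int parse_value<int>(const std::string&);       // suppress implicit instantiation
    extern template double parse_value<double>(const std::string&);

    // --- "types.cc": the definition and the explicit instantiations ---
    template <typename T>
    T parse_value(const std::string& s) {
        if constexpr (std::is_same_v<T, int>) { return std::stoi(s); }
        else                                  { return std::stod(s); }
    }

    template int parse_value<int>(const std::string&);       // emitted symbols other TUs link against
    template double parse_value<double>(const std::string&);

    // --- another TU (e.g. a user of cql3/untyped_result_set.hh): relies on the
    // explicit instantiations above instead of a local definition.
    int main() {
        std::cout << parse_value<int>("42") << " " << parse_value<double>("2.5") << "\n";
    }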
Nadav Har'El
5bf94ae220 cql: allow disabling of USING TIMESTAMP sanity checking
As requested by issue #5619, commit 2150c0f7a2
added a sanity check for USING TIMESTAMP - the number specified in the
timestamp must not be more than 3 days into the future (when viewed as
a number of microseconds since the epoch).

This sanity checking helps avoid some annoying client-side bugs and
mis-configurations, but some users genuinely want to use arbitrary
or futuristic-looking timestamps and are hindered by this sanity check
(which Cassandra doesn't have, by the way).

So in this patch we add a new configuration option, restrict_future_timestamp.
If set to "true", futuristic timestamps (more than 3 days into the future)
are forbidden. The "true" setting is the default (as has been the case
since #5619). Setting this option to "false" will allow using any 64-bit
integer as a timestamp, as is allowed in Cassandra (and was allowed in
Scylla prior to #5619).

The error message in the case where a futuristic timestamp is rejected
now mentions the configuration parameter that can be used to disable this
check (this, and the option's name "restrict_*", is similar to other
so-called "safe mode" options).

This patch also includes a test, which works in Scylla and Cassandra,
with either setting of restrict_future_timestamp, checking the right
thing in all these cases (the futuristic timestamp can either be written
and read, or can't be written). I used this test to manually verify that
the new option works, defaults to "true", and when set to "false" Scylla
behaves like Cassandra.

Fixes #12527

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12537
2023-01-16 23:18:56 +02:00
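
A hedged sketch of the check described above (timestamps are microseconds
since the epoch; function and message names are illustrative, not the actual
Scylla code):

    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <stdexcept>

    // When restrict_future_timestamp is true, a USING TIMESTAMP value must not
    // lie more than 3 days in the future; otherwise any 64-bit value is allowed.
    void validate_user_timestamp(int64_t ts_micros, bool restrict_future_timestamp) {
        using namespace std::chrono;
        if (!restrict_future_timestamp) {
            return;                                               // any 64-bit value is accepted
        }
        auto now = duration_cast<microseconds>(system_clock::now().time_since_epoch());
        auto limit = now + duration_cast<microseconds>(hours(72));   // 3 days of slack
        if (microseconds(ts_micros) > limit) {
            throw std::invalid_argument(
                "timestamp too far in the future; set restrict_future_timestamp: false to allow it");
        }
    }

    int main() {
        validate_user_timestamp(1673913600000000, true);   // past timestamp: accepted
        try {
            validate_user_timestamp(INT64_MAX, true);       // futuristic: rejected by default
        } catch (const std::invalid_argument& e) {
            std::cout << e.what() << "\n";
        }
        validate_user_timestamp(INT64_MAX, false);          // allowed when the check is disabled
    }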
Kefu Chai
114f30016a main: use std::shift_left() to consume tool name
for better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12536
2023-01-16 21:01:34 +02:00
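
For reference, the idiom looks roughly like this (an illustrative stand-alone
example, not the actual main.cc):

    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    // After dispatching on the tool name in argv[1], shift the remaining
    // arguments left by one so the tool sees a conventional argv starting
    // with its own name.
    int main() {
        std::vector<std::string> args = {"scylla", "sstable", "dump-data", "file.db"};

        // std::shift_left (C++20) moves elements [first + n, last) to the front
        // and returns the new logical end; here it drops the leading "scylla".
        auto new_end = std::shift_left(args.begin(), args.end(), 1);
        args.erase(new_end, args.end());

        for (const auto& a : args) {
            std::cout << a << " ";
        }
        std::cout << "\n";   // prints: sstable dump-data file.db
    }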
Nadav Har'El
feef3f9dda test/cql-pytest: test more than one restriction on same clustering column
Cassandra refuses a request with more than one relation to the same
clustering column, for example

    DELETE FROM tbl WHERE p = ? and c = ? AND c > ?

complains that

    c cannot be restricted by more than one relation if it includes an Equal

But it produces different error messages for different operators and
even for different orderings.

Currently, Scylla doesn't consider such requests an error. Whether or
not we should be compatible with Cassandra here is discussed in
issue #12472. But as long as we do accept these queries, we should be
sure we do the right thing: "WHERE c = 1 AND c > 2" should match
nothing, "WHERE c = 1 AND c > 0" should match the matches of c = 1,
and so on. This patch adds a test to verify that these requests indeed
yield correct results. The test is scylla_only because, as explained
above, Cassandra doesn't support these requests at all.

Refs #12472

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12498
2023-01-16 20:41:16 +02:00
Kefu Chai
86b451d45c SCYLLA-VERSION-GEN: remove unnecessary bashism
remove unnecessary bashism, so that this script can be interpreted
by a POSIX shell.

/bin/sh is specified in the shebang line. on debian derivatives,
/bin/sh is dash, which is POSIX compliant. but this script is
written in the bash dialect.

before this change, we could run into the following build failure
when building the tree on Debian:

[7/904] ./SCYLLA-VERSION-GEN
./SCYLLA-VERSION-GEN: 37: [[: not found

after this change, the build is able to proceed.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12530
2023-01-16 20:34:01 +02:00
Avi Kivity
0b418fa7cf cql3, transport, tests: remove "unset" from value type system
The CQL binary protocol introduced "unset" values in version 4
of the protocol. Unset values can be bound to variables, which
cause certain CQL fragments to be skipped. For example, the
fragment `SET a = :var` will not change the value of `a` if `:var`
is bound to an unset value.

Unsets, however, are very limited in where they can appear. They
can only appear at the top-level of an expression, and any computation
done with them is invalid. For example, `SET list_column = [3, :var]`
is invalid if `:var` is bound to unset.

This causes the code to be littered with checks for unset, and there
are plenty of tests dedicated to catching unsets. However, a simpler
way is possible - prevent the infiltration of unsets at the point of
entry (when evaluating a bind variable expression), and introduce
guards to check for the few cases where unsets are allowed.

This is what this long patch does. It performs the following:

(general)

1. unset is removed from the possible values of cql3::raw_value and
   cql3::raw_value_view.

(external->cql3)

2. query_options is fortified with a vector of booleans,
   unset_bind_variable_vector, where each boolean corresponds to a bind
   variable index and is true when it is unset.
3. To avoid churn, two compatibility structs are introduced:
   cql3::raw_value{,_view}_vector_with_unset, which can be constructed
   from a std::vector<raw_value{,_view}>, which is what most callers
   have. They can also be constructed with explicit unset vectors, for
   the few cases they are needed.

(cql3->variables)

4. query_options::get_value_at() now throws if the requested bind variable
   is unset. This replaces all the throwing checks in expression evaluation
   and statement execution, which are removed.
5. A new query_options::is_unset() is added for the users that can tolerate
   unset; though it is not used directly.
6. A new cql3::unset_operation_guard class guards against unsets. It accepts
   an expression, and can be queried whether an unset is present. Two
   conditions are checked: the expression must be a singleton bind
   variable, and at runtime it must be bound to an unset value.
7. The modification_statement operations are split into two, via two
   new subclasses of cql3::operation. cql3::operation_no_unset_support
   ignores unsets completely. cql3::operation_skip_if_unset checks if
   an operand is unset (luckily all operations have at most one operand that
   tolerates unset) and applies unset_operation_guard to it.
8. The various sites that accept expressions or operations are modified
   to check for should_skip_operation(). These are the loops around
   operations in update_statement and delete_statement, and the checks
   for unset in attributes (LIMIT and PER PARTITION LIMIT).

(tests)

9. Many unset tests are removed. It's now impossible to enter an
   unset value into the expression evaluation machinery (there's
   just no unset value), so it's impossible to test for it.
10. Other unset tests now have to be invoked via bind variables,
   since there's no way to create an unset cql3::expr::constant.
11. Many tests have their exception message match strings relaxed.
   Since unsets are now checked very early, we don't know the context
   where they happen. It would be possible to reintroduce it (by adding
   a format string parameter to cql3::unset_operation_guard), but it
   seems not to be worth the effort. Usage of unsets is rare, and it is
   explicit (at least with the Python driver, an unset cannot be
   introduced by omission).

I tried as an alternative to wrap cql3::raw_value{,_view} (that doesn't
recognize unsets) with cql3::maybe_unset_value (that does), but that
caused huge amounts of churn, so I abandoned that in favor of the
current approach.

Closes #12517
2023-01-16 21:10:56 +02:00
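
A purely illustrative model of the approach, with hypothetical minimal types
standing in for the real cql3/query_options classes: unset-ness is tracked per
bind-variable index, rejected at the single point of entry, and only
explicitly guarded operations may skip:

    #include <iostream>
    #include <optional>
    #include <stdexcept>
    #include <string>
    #include <vector>

    struct query_options {
        std::vector<std::string> values;
        std::vector<bool> unset;                  // parallel to values, true if unset

        bool is_unset(size_t i) const { return unset.at(i); }

        const std::string& get_value_at(size_t i) const {
            if (is_unset(i)) {                    // single throwing check at the entry point
                throw std::invalid_argument("unset value is not allowed here");
            }
            return values.at(i);
        }
    };

    // Guard used by the few operations that tolerate an unset operand
    // (e.g. `SET a = :var`): skip the operation instead of evaluating it.
    bool should_skip_operation(const query_options& qo, std::optional<size_t> single_bind_var) {
        return single_bind_var && qo.is_unset(*single_bind_var);
    }

    int main() {
        query_options qo{{"1", ""}, {false, true}};
        std::cout << "a = " << qo.get_value_at(0) << "\n";                      // normal value
        std::cout << std::boolalpha
                  << "skip SET b = :var? " << should_skip_operation(qo, 1) << "\n";
        try {
            qo.get_value_at(1);                    // an unset leaking into evaluation
        } catch (const std::invalid_argument& e) {
            std::cout << "rejected: " << e.what() << "\n";
        }
    }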
Kamil Braun
7510144fba Merge 'Add replace-node-first-boot option' from Benny Halevy
Allow replacing a node given its Host ID rather than its IP address.

This series adds a replace_node_first_boot option to db/config
and makes use of it in storage_service.

The new option takes priority over the legacy replace_address* options.
When the latter are used, a deprecation warning is printed.

Documentation is updated accordingly.

And a cql unit_test is added.

Ref #12277

Closes #12316

* github.com:scylladb/scylladb:
  docs: document the new replace_node_first_boot option
  dist/docker: support --replace-node-first-boot
  db: config: describe replace_address* options as deprecated
  test: test_topology: test replace using host_id
  test: pylib: ServerInfo: add host_id
  storage_service: get rid of get_replace_address
  storage_service: is_replacing: rely directly on config options
  storage_service: pass replacement_info to run_replace_ops
  storage_service: pass replacement_info to booststrap
  storage_service: join_token_ring: reuse replacement_info.address
  storage_service: replacement_info: add replace address
  init: do not allow cfg.replace_node_first_boot of seed node
  db: config: add replace_node_first_boot option
2023-01-16 15:08:31 +01:00
Michał Sala
bbbe12af43 forward_service: fix timeout support in parallel aggregates
`forward_request` verb carried information about timeouts using
`lowres_clock::time_point` (which came from the local steady clock
`seastar::lowres_clock`). The time point was produced on one node and
later compared against another node's `lowres_clock`. That behavior
was wrong (`lowres_clock::time_point`s produced by different
`lowres_clock`s cannot be compared) and could lead to delayed or
premature timeouts.

To fix this issue, `lowres_clock::time_point` was replaced with
`lowres_system_clock::time_point` in the `forward_request` verb.
The representation to which both time point types serialize is the same
(a 64-bit integer denoting the count of elapsed nanoseconds), so it was
possible to do an in-place switch of those types, using the logic suggested
by @avikivity:
    - using steady_clock is just broken, so we aren't taking anything
      from users by breaking it further
    - once all nodes are upgraded, it magically starts to work

Closes #12529
2023-01-16 12:08:13 +02:00
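
A generic illustration of the bug class using std::chrono stand-ins for the
seastar clocks: a steady clock has an arbitrary per-process epoch, so its time
points are meaningless on another node, while wall-clock time points share an
epoch and survive serialization:

    #include <chrono>
    #include <cstdint>
    #include <iostream>

    // Serialize a wall-clock deadline as nanoseconds since the epoch.
    int64_t serialize(std::chrono::system_clock::time_point tp) {
        using namespace std::chrono;
        return duration_cast<nanoseconds>(tp.time_since_epoch()).count();
    }

    std::chrono::system_clock::time_point deserialize(int64_t ns) {
        using namespace std::chrono;
        return system_clock::time_point(duration_cast<system_clock::duration>(nanoseconds(ns)));
    }

    int main() {
        using namespace std::chrono;

        // Coordinator side: compute a timeout 100ms from now and serialize it.
        auto deadline = system_clock::now() + milliseconds(100);
        int64_t wire = serialize(deadline);

        // Replica side: deserialize and compare against its own wall clock;
        // this is meaningful because both nodes share the system clock epoch.
        // The same comparison with a steady/monotonic clock would be garbage.
        auto remote_deadline = deserialize(wire);
        std::cout << "timed out: " << std::boolalpha
                  << (system_clock::now() > remote_deadline) << "\n";
    }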
Botond Dénes
3d9ab1d9eb Merge 'Get recursive tasks' statuses with task manager api call' from Aleksandra Martyniuk
The PR adds an API call that allows getting the statuses of a given
task and all of its descendants.

The parent-child tree is traversed in BFS order and the list of
statuses is returned to the user.
Closes #12317

* github.com:scylladb/scylladb:
  test: add test checking recursive task status
  api: get task statuses recursively
  api: change retrieve_status signature
2023-01-16 11:44:50 +02:00
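
A generic sketch of the traversal (hypothetical types, not the actual task
manager API):

    #include <deque>
    #include <iostream>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Collect the status of a task and of all its descendants by walking
    // the parent-child tree in BFS order.
    struct task_status {
        std::string id;
        std::string state;
        std::vector<std::string> children;   // ids of child tasks
    };

    std::vector<task_status> get_statuses_recursively(
            const std::unordered_map<std::string, task_status>& tasks,
            const std::string& root) {
        std::vector<task_status> out;
        std::deque<std::string> queue{root};
        while (!queue.empty()) {
            const auto& t = tasks.at(queue.front());
            queue.pop_front();
            out.push_back(t);
            for (const auto& child : t.children) {
                queue.push_back(child);      // children are visited level by level
            }
        }
        return out;
    }

    int main() {
        std::unordered_map<std::string, task_status> tasks = {
            {"repair",  {"repair",  "running", {"shard-0", "shard-1"}}},
            {"shard-0", {"shard-0", "done",    {}}},
            {"shard-1", {"shard-1", "running", {}}},
        };
        for (const auto& s : get_statuses_recursively(tasks, "repair")) {
            std::cout << s.id << ": " << s.state << "\n";
        }
    }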
Tzach Livyatan
073f0f00c6 Add Scylla Summit 2023 in the top banner
Closes #12519
2023-01-16 08:05:20 +02:00
Avi Kivity
5a07641b95 Update python3 submodule (license file fix)
* tools/python3 548e860...279b6c1 (1):
  > create-relocatable-package: s/pyhton3-libs/python3-libs/
2023-01-15 17:59:27 +02:00
Benny Halevy
de3142e540 docs: document the new replace_node_first_boot option
And mention that replacing a node using the legacy
replace_addr* options is deprecated.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:41:44 +02:00
Benny Halevy
d4f1563369 dist/docker: support --replace-node-first-boot
And mention that replace_address_first_boot is deprecated

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:36:09 +02:00
Benny Halevy
1577aa8098 db: config: describe replace_address* options as deprecated
The replace_address options are still supported, but mention in their
description that they are now deprecated
and that the user should use replace_node_first_boot instead.

While at it, fix a typo in ignore_dead_nodes_for_replace.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:36:09 +02:00
Benny Halevy
90faeedb77 test: test_topology: test replace using host_id
Add test cases exercising the --replace-node-first-boot option
by replacing nodes using their host_id rather
than ip address.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:36:09 +02:00
Benny Halevy
7d0d9e28f1 test: pylib: ServerInfo: add host_id
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:36:07 +02:00
Benny Halevy
db2b76beb5 storage_service: get rid of get_replace_address
It is unused now.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:34:29 +02:00
Benny Halevy
17f70e4619 storage_service: is_replacing: rely directly on config options
Rather than on get_replace_address, before we remove the latter.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:34:29 +02:00
Benny Halevy
7282d58d11 storage_service: pass replacement_info to run_replace_ops
So it won't need to call get_replace_address.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:34:09 +02:00
Benny Halevy
08598e4f64 storage_service: pass replacement_info to booststrap
So it won't need to call get_replace_address.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:30:48 +02:00
Benny Halevy
b863f7a75f storage_service: join_token_ring: reuse replacement_info.address
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:30:48 +02:00
Benny Halevy
add2f209b8 storage_service: replacement_info: add replace address
Populate replacement_info.address in prepare_replacement_info
as a first step towards getting rid of get_replace_address().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:30:48 +02:00
Benny Halevy
75c8a5addc init: do not allow cfg.replace_node_first_boot of seed node
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:30:48 +02:00
Benny Halevy
32e79185d4 db: config: add replace_node_first_boot option
For replacing a node given its (now unique) Host ID.

The existing options for replace_address*
will be deprecated in the following patches
and eventually we will stop supporting them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-01-13 18:30:48 +02:00
Tomasz Grabiec
abc43f97c9 Merge 'Simplify some Raft tables' from Kamil Braun
Rename `system.raft_config` to `system.raft_snapshot_config` to make it clearer
what the table stores.

Remove the `my_server_id` partition key column from
`system.raft_snapshot_config` and a corresponding column from
`system.raft_snapshots` which would store the Raft server ID of the local node.
It's unnecessary: all servers running on a given node in different groups will
use the same ID - the Raft ID of the node, which is equal to its Host ID. There
will never be multiple servers from a single Raft group running on the same node.

Closes #12513

* github.com:scylladb/scylladb:
  db: system_keyspace: remove (my_)server_id column from RAFT_SNAPSHOTS and RAFT_SNAPSHOT_CONFIG
  db: system_keyspace: rename 'raft_config' to 'raft_snapshot_config'
2023-01-13 00:23:21 +01:00
Botond Dénes
4e41e7531c docs/dev/debugging.md: recommend open-coredump.sh for opening coredumps
Leave the guide for manual opening in, though; the script might not work
in all cases.
Also update the version example; we changed what development versions
look like.

Closes #12511
2023-01-12 19:30:59 +02:00
Botond Dénes
ab8171ffd5 open-coredump.sh: handle dev versions
Like 5.2.0~dev, which really means master. Don't try to check out
branch-5.2 in this case; it doesn't exist yet. Check out master instead.

Closes #12510
2023-01-12 19:28:58 +02:00
Kamil Braun
be390285b6 db: system_keyspace: remove (my_)server_id column from RAFT_SNAPSHOTS and RAFT_SNAPSHOT_CONFIG
A single node will run a single Raft server in any given Raft group,
so this column is not necessary.
2023-01-12 16:48:50 +01:00
Kamil Braun
bed555d1e5 db: system_keyspace: rename 'raft_config' to 'raft_snapshot_config'
Make it clear that the table stores the snapshot configuration, which is
not necessarily the currently operating configuration (the last one
appended to the log).

In the future we plan to have a separate virtual table for showing the
currently operating configuration, perhaps we will call it
`system.raft_config`.
2023-01-12 16:21:26 +01:00
Botond Dénes
f87e3993ef Merge 'configure.py: a bunch of clean-up changes' from Michał Chojnowski
The planned integration of cross-module optimizations in scylladb/scylladb-enterprise requires several changes to `configure.py`. To minimize the divergence between the `configure.py`s of both repositories, this series upstreams some of these changes to scylladb/scylladb.

The changes mostly remove dead code and fix some traps for the unaware.

Closes #12431

* github.com:scylladb/scylladb:
  configure.py: prevent deduplication of seastar compile options
  configure.py: rename clang_inline_threshold()
  configure.py: rework the seastar_cflags variable
  configure.py: hoist the pkg_config() call for seastar-testing.pc
  configure.py: unify the libs variable for tests and non-tests
  configure.py: fix indentation
  configure.py: remove a stale code path for .a artifacts
2023-01-12 16:40:02 +02:00
Wojciech Mitros
082bfea187 rust: use depfile and Cargo.lock to avoid building rust when unnecessary
Currently, we call cargo build every time we build scylla, even
when no rust files have been changed.
This is avoided by adding a depfile to the ninja rule for the rust
library.
The depfile is generated by default during cargo build,
but it uses the full paths of all dependencies that it includes,
while we use relative paths. This is fixed by specifying
CARGO_BUILD_DEP_INFO_BASEDIR='.', which makes the current
path get subtracted from all generated paths.
Instead of using 'always' to decide when to run the cargo
build, a dependency on Cargo.lock is added in addition to the
depfile. As a result, the rust files are recompiled not only
when the source files included in the depfile are modified,
but also when some rust dependency is updated.
Cargo may produce an old cached file as the result of the build even
when Cargo.lock was recently updated. Because of that, the
build result may be older than the Cargo.lock file even
if the build was just performed. This may cause ninja to rebuild
the file every following time. To avoid this, we 'touch' the
build result, so that its last modification time is up to date.
Because the dependency on Cargo.lock was added, the new build
command does not modify it. Instead, the developer must
update it when modifying the dependencies - the docs are updated
to reflect that.

Closes #12489

Fixes #12508
2023-01-12 14:44:11 +02:00
Kefu Chai
77baea2add docs/architecture: fix typo of SyllaDB
s/SyllaDB/ScyllaDB/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #12505
2023-01-12 12:25:53 +02:00
Michał Chojnowski
1ff4abef4a configure.py: prevent deduplication of seastar compile options
In its infinite wisdom, CMake deduplicates the options passed
to `target_compile_options`, making it impossible to pass options which require
duplication, such as -mllvm.
Passing e.g.
`-mllvm;-pgso=false;-mllvm;-inline-threshold=2500` invokes the compiler
with `-mllvm -pgso=false -inline-threshold=2500`, breaking the options.

As a workaround, CMake added the `SHELL:` syntax, which makes it possible to
pass the list of options not as a CMake list, but as a shell-quoted string.
Let's use it, so we can pass multiple -mllvm options.
2023-01-12 11:24:10 +01:00
Michał Chojnowski
85facefe45 configure.py: rename clang_inline_threshold()
There's a global variable (the CLI argument) with the same name.
Rename one of the two to avoid accidental mixups.
2023-01-12 11:24:10 +01:00
Michał Chojnowski
d9de78f6d3 configure.py: rework the seastar_cflags variable
The name of this variable is misleading. What it really does is pass flags to
static libraries compiled by us, not just to seastar.
We will need this capability to implement cross-artifact optimizations in our
build.
We will also need to pass linker flags, and we will need to vary those flags
depending on the build mode.

This patch splits the seastar_cflags variable into per-mode lib_cflags and
lib_ldflags variables. It shouldn't change the resulting build.ninja for now,
but will be needed by later planned patches.
2023-01-12 11:24:10 +01:00
Michał Chojnowski
ee462a9d3c configure.py: hoist the pkg_config() call for seastar-testing.pc
Put the pkg_config() for seastar-testing.pc in the same area as the call
for seastar.pc, outside of the loop.
This is a cosmetic change aimed at making following commits cleaner.
2023-01-12 11:24:10 +01:00
Michał Chojnowski
c9aeeeae11 configure.py: unify the libs variable for tests and non-tests
This is a cosmetic change aimed at making the following commits in the same
area cleaner.
2023-01-12 11:24:09 +01:00
Michał Chojnowski
10ac881ef1 configure.py: fix indentation
Fix indentation after the preceding commit.
2023-01-12 11:23:32 +01:00