Commit Graph

45615 Commits

Gleb Natapov
0ca14ef8b7 hints: use host id to send hints
Drop address translation that is no longer needed. Templates here are used
temporarily until another user of the function (MV) is converted as
well.
2024-12-02 10:31:12 +02:00
Gleb Natapov
5b9e4c2f07 storage_proxy: remove id_vector_to_addr since it is no longer used
Was needed during transition period only.
2024-12-02 10:31:12 +02:00
Gleb Natapov
6116751e44 db: consistency_level: change is_sufficient_live_nodes to work on host ids
It is called from storage proxy which works on host ids now.
2024-12-02 10:31:12 +02:00
Gleb Natapov
eb3d2307ce replication_strategy: move sanity_check_read_replicas to host id
It is called from storage proxy which works on host ids now.
2024-12-02 10:31:12 +02:00
Gleb Natapov
ccbfabb858 db: consistency_level: move filter_for_query to host id
It is called from storage proxy which works on host ids now.
2024-12-02 10:31:12 +02:00
Gleb Natapov
474b47ed22 database: move hit rates handling to host ids
The hit rates map is currently indexed by IP. Change it to be indexed by
host id, since that is what the storage proxy uses now.
2024-12-02 10:31:12 +02:00
Gleb Natapov
d2cf5ca030 messaging_service: pass host id to connection_dropped handler if available
RPC clients which are host id aware may pass the id to the
connection_dropped callback and save the need for translation.
2024-12-02 10:31:12 +02:00
Gleb Natapov
9f7183286a storage_proxy: change batchlog to work on host ids
It was not translated in the first pass.
2024-12-02 10:31:12 +02:00
Gleb Natapov
a1fdc8c847 storage_proxy: change mutation rpcs to send forward and reply addresses as host ids
RPCs from old nodes will still use old format so translation will be
used in this case. The change is backwards compatible thanks to RPC
extensibility.
2024-12-02 10:31:12 +02:00
Gleb Natapov
cd9b349886 migration_manager: move to use host ids instead of ips
Callers are also amended to pass ids instead of ips.
2024-12-02 10:31:12 +02:00
Gleb Natapov
2f23a21a23 raft: raft_group_registry: do not insert entry into raft address map on incoming message
Raft map is no longer used to send raft messages. We rely on gossiper
address propagation now.
2024-12-02 10:31:12 +02:00
Gleb Natapov
1f302577d0 group0: move transfer_snapshot to use host ids
No need to translate id to ip any longer.
2024-12-02 10:31:12 +02:00
Gleb Natapov
e695cb1054 topology: use gossiper address map instead of raft one in storage service
Also remove the forcing of the replacing node to be alive, which is no
longer needed since the gossiper no longer inhibits replacing nodes from
advertising themselves.
2024-12-02 10:31:12 +02:00
Gleb Natapov
b6425446c6 gossiper: fix indentation after previous patch 2024-12-02 10:31:11 +02:00
Gleb Natapov
a64b079b5c gossiper: drop advertise_myself parameter to gossiper
The parameter was needed when nodes were addressed by IP, so during a
replace with the same IP a new node had to "hide" itself from the
cluster to not get accidentally confused with the old node. Now that
nodes are addressed by host id, this situation is impossible.
2024-12-02 10:31:11 +02:00
Gleb Natapov
12937aeb7f storage_proxy: move to addressing nodes by host ids instead of ips
In this rather large patch we move to addressing nodes in the storage
proxy by host ids instead of IPs. Some subsystems that the storage proxy
calls into are not yet converted to host ids, so we translate back and
forth when we interact with them.
2024-12-02 10:31:11 +02:00
Gleb Natapov
b7402af872 locator: topology: add sort_by_proximity function that works on host ids 2024-12-02 10:31:11 +02:00
Gleb Natapov
0882f2024c locator: topology: make topology object always contain local node
Currently the locator::topology object, when created, does not contain
the local node, but it has started to be used to access the local
database. It sort of works now because there are explicit checks in the
code to handle this special case, like in topology::get_location for
instance. We do not want to hack around it and instead rely on an
invariant that the local node is always there. To do that we add the
local node during locator::topology creation. There is a catch though.
Unlike the IP, the host ID is not known during startup. We actually need
to read from the database to know it, so the topology starts with host
ID zero and then it changes once to the real one. This is not a problem
though. As long as the (one node) topology is consistent
(_cfg.this_host_id is equal to the node's id) local access will work.
2024-12-02 10:31:11 +02:00
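
The zero-then-real host id invariant described above can be sketched as follows; all names here are hypothetical illustrations, not the actual locator::topology code:

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: the topology starts with a zero (empty) host id for
// the local node and is updated exactly once when the real id is read from
// the database. Local lookups work as long as this_host_id matches the
// local node's id, even while both are still zero.
using host_id = std::string;
static const host_id zero_id{};

struct topology_sketch {
    host_id this_host_id = zero_id;   // what local lookups compare against
    host_id local_node_id = zero_id;  // id recorded for the local node

    // Called once the real host id has been read from the database.
    void set_real_host_id(const host_id& id) {
        this_host_id = id;
        local_node_id = id; // both change together, keeping the invariant
    }

    bool is_local(const host_id& id) const { return id == this_host_id; }
};
```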
Gleb Natapov
9cda32af92 locator: put real host id into the replication map for local replication strategy
The local replication strategy returns a zero host id in the replica set
instead of the real one. It mostly works now because the code that
translates ids to ips knows that the zero host id is a special one. But
we want to use host ids directly, so we need to return the real one (or
handle the zero special case everywhere).
2024-12-02 10:31:11 +02:00
Gleb Natapov
e7f869591d gossiper: add address map getters 2024-12-02 10:31:11 +02:00
Gleb Natapov
faef04e688 replication_strategy: add host id versions of get_natural_endpoints/get_pending_endpoints/get_endpoints_for_reading functions
Those functions will return host ids instead of ips.
2024-12-02 10:31:11 +02:00
Gleb Natapov
1c5a7826dc storage_service: pass gossip_address_map
It will be used in the following patches.
2024-12-02 10:31:11 +02:00
Gleb Natapov
79358278f2 service: raft: move raft pinger to sending messages by host id
This allows us to drop dependency on raft_address_map from direct_fd_pinger.
2024-12-02 10:31:11 +02:00
Gleb Natapov
0e045181d8 raft_rpc: use host ids to send raft rpcs
Address translation is no longer needed since host id can be used
directly.
2024-12-02 10:31:11 +02:00
Gleb Natapov
2c17fa6370 topology coordinator: drop raft_address_map dependency
raft_address_map is not used by the coordinator code any longer.
2024-12-02 10:31:11 +02:00
Gleb Natapov
aba4ae0ca1 topology coordinator: rename wait_for_ip to wait_for_gossiper and drop raft address map usage
What wait_for_ip actually does is wait for a node to appear in the
gossiper, since this is when it is added to the raft address map. Drop
the usage of the address map and check the gossiper directly.
2024-12-02 10:31:11 +02:00
Gleb Natapov
414ec6d5bb topology coordinator: get rid of host id to ip translations
Now we have enough functionality in the gossiper and messaging service
to get rid of the ip2id function in the topology coordinator. We can use
host ids directly.
2024-12-02 10:31:11 +02:00
Gleb Natapov
15145c16d1 gossiper: provide wait_alive that works on host ids
We have a wait_alive function that takes an array of IP addresses and
waits for all of them to be alive. Provide a similar one that works on
host ids.
2024-12-02 10:31:10 +02:00
Gleb Natapov
84c7aa8f48 gossiper: send up notifications by host ids 2024-12-02 10:31:10 +02:00
Gleb Natapov
609cb2dee9 gossiper: send failure detection ping to a host id instead of ip
This way a wrong host will not answer it.
2024-12-02 10:31:10 +02:00
Gleb Natapov
c51263d085 messaging_service: add a separate map for clients created with host id available
We want to use different clients to send messages based on ids and ips,
so provide a separate map to hold them.
2024-12-02 10:31:10 +02:00
Gleb Natapov
83cde134d0 messaging_service: add dst host id to CLIENT_ID RPC and send it if provided
If an RPC client creation was triggered by a send function that has a
host id as its dst, send it as part of the CLIENT_ID RPC, which is
always the first RPC on each connection. If the receiver's host id does
not match, it will drop the connection.
2024-12-02 10:30:59 +02:00
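
The receiver-side check this commit describes can be sketched as follows; this is an illustration with made-up names, not ScyllaDB's actual RPC code:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical sketch: if the CLIENT_ID RPC carries a destination host id
// and it does not match the local node's id, the connection is dropped.
using host_id = std::string;

// Returns true if the connection should be accepted.
bool accept_connection(const std::optional<host_id>& dst_in_client_id,
                       const host_id& local_id) {
    // Old clients do not send a destination host id at all; accept them.
    if (!dst_in_client_id) {
        return true;
    }
    // A mismatch means the peer intended to reach a different node
    // (e.g. one that used to own this IP); drop the connection.
    return *dst_in_client_id == local_id;
}
```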
Gleb Natapov
aa87fecce2 gossiper: add is_alive that works on host_id
The function checks whether a node with the provided id is alive. If it
fails to map the id to an ip, or there is no state for the ip found, the
node is considered to be dead.
2024-12-01 12:12:30 +02:00
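
A sketch of the lookup chain this commit describes (host id to ip to liveness state); the types and names are illustrative, not the real gossiper API:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch: any missing step in the id -> ip -> state chain
// makes the node count as dead.
using host_id = std::string;
using inet_address = std::string;

struct gossiper_sketch {
    std::map<host_id, inet_address> id_to_ip;   // address map
    std::set<inet_address> live_ips;            // endpoints with live state

    bool is_alive(const host_id& id) const {
        auto it = id_to_ip.find(id);
        if (it == id_to_ip.end()) {
            return false;                       // cannot map id to ip: dead
        }
        return live_ips.count(it->second) > 0;  // no state for the ip: dead
    }
};
```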
Gleb Natapov
76aa41dfcf messaging_service: pass gossip_address_map to the mm and introduce send by id functions
The function looks up the provided host id in gossip_address_map and
throws unknown_address if the mapping is not available. Otherwise it
sends the message to the IP found.
2024-12-01 12:12:30 +02:00
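
The lookup-then-send path can be sketched like this; `unknown_address` here is a stand-in for the real exception type and the other names are made up:

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

using host_id = std::string;
using inet_address = std::string;

// Stand-in for the real exception type mentioned in the commit.
struct unknown_address : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Resolve the destination host id via the address map, or throw.
inet_address resolve_or_throw(const std::map<host_id, inet_address>& addr_map,
                              const host_id& id) {
    auto it = addr_map.find(id);
    if (it == addr_map.end()) {
        throw unknown_address("no mapping for host id " + id);
    }
    return it->second; // the message is then sent to this IP
}
```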
Gleb Natapov
0e264ccba9 gossiper: populate gossip_address_map
Add a non-expiring entry into the address map for each host in the
gossiper state, and change it to expiring when the state is deleted.
2024-12-01 12:12:30 +02:00
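
A sketch, under assumed names, of the pin/expire policy described above: entries are pinned while the host has gossiper state and only become expirable when that state goes away.

```cpp
#include <cassert>
#include <map>
#include <string>

using host_id = std::string;
using inet_address = std::string;

struct entry {
    inet_address ip;
    bool expiring;
};

// Hypothetical sketch of the gossiper-driven address map.
struct address_map_sketch {
    std::map<host_id, entry> entries;

    void on_gossip_state_added(const host_id& id, const inet_address& ip) {
        entries[id] = entry{ip, /*expiring=*/false};  // pinned while state exists
    }
    void on_gossip_state_removed(const host_id& id) {
        auto it = entries.find(id);
        if (it != entries.end()) {
            it->second.expiring = true;  // now eligible for eventual expiry
        }
    }
};
```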
Gleb Natapov
ca2544e57e gossiper: introduce gossip address map
Introduce a new address map that will be populated by the gossiper.
Create it during initialization and pass it to the gossiper.
2024-12-01 12:12:29 +02:00
Gleb Natapov
be5caec54e service: make address_map raft independent
We want to start using the address map class outside of raft, so let's
make it work on host_id instead of raft::server_id and move it outside
of raft.
2024-12-01 12:12:29 +02:00
Gleb Natapov
cc1b5aaf51 idl: generate host_id variant of send functions as well
We want to be able to address nodes by host ids. For that, let's
generate send functions that take host_id as a dst parameter.

Changes to raft_rpc are needed because otherwise the compiler cannot
select the correct overload.
2024-12-01 12:12:29 +02:00
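
The generated overload pair can be illustrated with distinct strong types for the destination (names made up); distinct parameter types are what lets the compiler pick the right overload, which is also why ambiguous call sites such as those in raft_rpc needed adjusting:

```cpp
#include <cassert>
#include <string>

// Two strong types, so the two send overloads never collide.
struct inet_address { std::string value; };
struct host_id { std::string value; };

// Hypothetical pair of generated send functions: one addressed by IP,
// one by host id.
std::string send_ping(const inet_address& dst) { return "ip:" + dst.value; }
std::string send_ping(const host_id& dst) { return "id:" + dst.value; }
```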
Kefu Chai
65949ce607 test: topology_custom: ensure node visibility before keyspace creation
Building upon commit 69b47694, this change addresses a subtle synchronization
weakness in node visibility checks during recovery mode testing.

Previous Approach:
- Waited only for the first node to see its peers
- Insufficient to guarantee full cluster consistency

Current Solution:
1. Implement comprehensive node visibility verification
2. Ensure all nodes mutually recognize each other
3. Prevent potential schema propagation race conditions

Key Improvements:
- Robust cluster state validation before keyspace creation
- Eliminate partial visibility scenarios

Fixes scylladb/scylladb#21724

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21726
2024-11-29 17:13:21 +01:00
Kefu Chai
afeff0a792 docs: explain task status retention and one-time query behavior
Task status information from nodetool commands is not retained permanently:

- Status of completed tasks is only kept for `task_ttl_in_seconds`
- Status is removed after being queried, making it a one-time operation

This behavior is important for users to understand since subsequent
queries for the same completed task will not return any information.
Add documentation to make this clear to users.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21386
2024-11-29 16:36:27 +01:00
Pavel Emelyanov
f2509d90a5 Merge 'mutation: remove unused "#include"s' from Kefu Chai
these unused includes are identified by clang-include-cleaner. after auditing the source files, all of the reports have been confirmed.

please note, because `mutation/mutation.hh` does not include `seastar/coroutine/maybe_yield.hh` anymore, and quite a few source files were relying on this header to bring in the declaration of `maybe_yield()`, we have to include this header in the places where this symbol is used. the same applies to `seastar/core/when_all.hh`.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#21727

* github.com:scylladb/scylladb:
  .github: add "mutation" to CLEANER_DIR
  mutation: remove unused "#include"s
2024-11-29 13:01:53 +03:00
Kefu Chai
efbf6e5526 .github: add "mutation" to CLEANER_DIR
in order to prevent future inclusion of unused headers, let's include
"mutation" subdirectory to CLEANER_DIR, so that this workflow can
identify the regressions in future.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-29 14:01:44 +08:00
Kefu Chai
f436edfa22 mutation: remove unused "#include"s
these unused includes are identified by clang-include-cleaner. after
auditing the source files, all of the reports have been confirmed.

please note, because `mutation/mutation.hh` does not include
`seastar/coroutine/maybe_yield.hh` anymore, and quite a few source
files were relying on this header to bring in the declaration of
`maybe_yield()`, we have to include this header in the places where
this symbol is used. the same applies to `seastar/core/when_all.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-29 14:01:44 +08:00
Botond Dénes
055a36ae55 main: dump diagnostics on SIGQUIT
Dump a diagnostics report on each shard when receiving a SIGQUIT. The
report is logged with a dedicated logger, called diagnostics.
The report has multiple parts:
* seastar memory diagnostics, similar to that printed by the scylla
  memory command (from scylla-gdb.py).
* reader concurrency semaphore diagnostics for each semaphore.

Example report:

    INFO  2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
    Dumping seastar memory diagnostics
    Used memory:   3988M
    Free memory:   58M
    Total memory:  4G
    Hard failures: 0

    LSA
      allocated: 4M
      used:      16
      free:      4G

    Cache:
      total: 1M
      used:  642K
      free:  398K

    Memtables:
     total: 3M
     Regular:
      real dirty: 0B
      virt dirty: 0B
     System:
      real dirty: 3M
      virt dirty: 3M

    Replica:
      Read Concurrency Semaphores:
        user: 0/100, 0B/81M, queued: 0
        streaming: 0/10, 0B/81M, queued: 0
        system: 0/10, 0B/81M, queued: 0
        compaction: 0/unlimited, 0B/unlimited
        view update: 0/50, 0B/40M, queued: 0
      Execution Stages:
        apply stage:
             Total: 0
      Tables - Ongoing Operations:
        Pending writes (top 10):
          0 Total (all)
        Pending reads (top 10):
          0 Total (all)
        Pending streams (top 10):
          0 Total (all)

    Small pools:
    objsz spansz usedobj memory unused wst%
        8     4K     858    16K     9K   58
       10     4K       5     8K     8K   99
       12     4K       5     8K     8K   99
       14     4K       0     0B     0B    0
       16     4K      2k    44K    15K   35
       32     4K      4k   136K    16K   11
       32     4K      8k   280K    24K    8
       32     4K      3k    92K     6K    6
       32     4K      4k   140K    21K   14
       48     4K      3k   180K    25K   14
       48     4K      2k   120K    27K   22
       64     4K      2k   156K    18K   11
       64     4K     19k     1M    11K    0
       80     4K      3k   236K    16K    6
       96     4K      6k   572K    49K    8
      112     4K      2k   276K    72K   25
      128     4K     477    80K    20K   25
      160     4K     194    60K    30K   49
      192     4K      1k   232K    39K   16
      224     4K      2k   468K    15K    3
      256     4K     182   100K    55K   54
      320     8K     349   152K    43K   28
      384     8K     332   288K   164K   56
      448     4K     243   180K    74K   40
      512     4K     256   244K   116K   47
      640    16K     185   192K    76K   39
      768    16K     394   432K   137K   31
      896     8K      54   192K   144K   75
     1024     4K     288   432K   144K   33
     1280    32K      92   256K   140K   54
     1536    32K      11   128K   111K   86
     1792    16K      10   144K   126K   87
     2048     8K     487     1M    90K    8
     2560    64K     113   384K   100K   26
     3072    64K       9   256K   228K   89
     3584    32K       3   288K   277K   96
     4096    16K     129   912K   396K   43
     5120   128K      21   384K   275K   71
     6144   128K       4   512K   486K   94
     7168    64K       3   576K   553K   96
     8192    32K     373     3M    56K    1
    10240    64K       6   832K   770K   92
    12288    64K      17   960K   756K   78
    14336   128K       2     1M     1M   97
    16384    64K      14     1M   992K   81

    Page spans:
    index  size  free  used spans
        0    4K    4K    5M    1k
        1    8K    8K    2M   213
        2   16K   16K    2M   106
        3   32K   64K    6M   200
        4   64K   64K    4M    71
        5  128K  384K 3934M   31k
        6  256K    1M  256K     5
        7  512K  512K  512K     2
        8    1M    2M    0B     2
        9    2M    2M    2M     2
       10    4M    4M    0B     1
       11    8M   16M    0B     2
       12   16M   32M    0B     2
       13   32M    0B   32M     1
       14   64M    0B    0B     0
       15  128M    0B    0B     0
       16  256M    0B    0B     0
       17  512M    0B    0B     0
       18    1G    0B    0B     0
       19    2G    0B    0B     0
       20    4G    0B    0B     0
       21    8G    0B    0B     0
       22   16G    0B    0B     0
       23   32G    0B    0B     0
       24   64G    0B    0B     0
       25  128G    0B    0B     0
       26  256G    0B    0B     0
       27  512G    0B    0B     0
       28    1T    0B    0B     0
       29    2T    0B    0B     0
       30    4T    0B    0B     0
       31    8T    0B    0B     0

    INFO  2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
    Semaphore user with 0/100 count and 0/84850769 memory resources: user request, dumping permit diagnostics:

    permits	count	memory	table/operation/state

    0	0	0B	total

    Stats:
    permit_based_evictions: 0
    time_based_evictions: 0
    inactive_reads: 0
    total_successful_reads: 0
    total_failed_reads: 0
    total_reads_shed_due_to_overload: 0
    total_reads_killed_due_to_kill_limit: 0
    reads_admitted: 0
    reads_enqueued_for_admission: 0
    reads_enqueued_for_memory: 0
    reads_admitted_immediately: 0
    reads_queued_because_ready_list: 0
    reads_queued_because_need_cpu_permits: 0
    reads_queued_because_memory_resources: 0
    reads_queued_because_count_resources: 0
    reads_queued_with_eviction: 0
    total_permits: 0
    current_permits: 0
    need_cpu_permits: 0
    awaits_permits: 0
    disk_reads: 0
    sstables_read: 0
    INFO  2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
    Semaphore streaming with 0/10 count and 0/84850769 memory resources: user request, dumping permit diagnostics:

    permits	count	memory	table/operation/state

    0	0	0B	total

    Stats:
    permit_based_evictions: 0
    time_based_evictions: 0
    inactive_reads: 0
    total_successful_reads: 6
    total_failed_reads: 0
    total_reads_shed_due_to_overload: 0
    total_reads_killed_due_to_kill_limit: 0
    reads_admitted: 6
    reads_enqueued_for_admission: 0
    reads_enqueued_for_memory: 0
    reads_admitted_immediately: 6
    reads_queued_because_ready_list: 0
    reads_queued_because_need_cpu_permits: 0
    reads_queued_because_memory_resources: 0
    reads_queued_because_count_resources: 0
    reads_queued_with_eviction: 0
    total_permits: 6
    current_permits: 0
    need_cpu_permits: 0
    awaits_permits: 0
    disk_reads: 0
    sstables_read: 0
    INFO  2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
    Semaphore compaction with 0/2147483647 count and 0/9223372036854775807 memory resources: user request, dumping permit diagnostics:

    permits	count	memory	table/operation/state

    0	0	0B	total

    Stats:
    permit_based_evictions: 0
    time_based_evictions: 0
    inactive_reads: 0
    total_successful_reads: 0
    total_failed_reads: 0
    total_reads_shed_due_to_overload: 0
    total_reads_killed_due_to_kill_limit: 0
    reads_admitted: 0
    reads_enqueued_for_admission: 0
    reads_enqueued_for_memory: 0
    reads_admitted_immediately: 0
    reads_queued_because_ready_list: 0
    reads_queued_because_need_cpu_permits: 0
    reads_queued_because_memory_resources: 0
    reads_queued_because_count_resources: 0
    reads_queued_with_eviction: 0
    total_permits: 27
    current_permits: 0
    need_cpu_permits: 0
    awaits_permits: 0
    disk_reads: 0
    sstables_read: 0
    INFO  2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
    Semaphore system with 0/10 count and 0/84850769 memory resources: user request, dumping permit diagnostics:

    permits	count	memory	table/operation/state
    1	0	0B	*.*/view_builder/active

    1	0	0B	total

    Stats:
    permit_based_evictions: 0
    time_based_evictions: 0
    inactive_reads: 0
    total_successful_reads: 234
    total_failed_reads: 0
    total_reads_shed_due_to_overload: 0
    total_reads_killed_due_to_kill_limit: 0
    reads_admitted: 234
    reads_enqueued_for_admission: 154
    reads_enqueued_for_memory: 0
    reads_admitted_immediately: 80
    reads_queued_because_ready_list: 154
    reads_queued_because_need_cpu_permits: 0
    reads_queued_because_memory_resources: 0
    reads_queued_because_count_resources: 0
    reads_queued_with_eviction: 0
    total_permits: 235
    current_permits: 1
    need_cpu_permits: 0
    awaits_permits: 0
    disk_reads: 0
    sstables_read: 0
    INFO  2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
    Semaphore view_update with 0/50 count and 0/42425384 memory resources: user request, dumping permit diagnostics:

    permits	count	memory	table/operation/state

    0	0	0B	total

    Stats:
    permit_based_evictions: 0
    time_based_evictions: 0
    inactive_reads: 0
    total_successful_reads: 0
    total_failed_reads: 0
    total_reads_shed_due_to_overload: 0
    total_reads_killed_due_to_kill_limit: 0
    reads_admitted: 0
    reads_enqueued_for_admission: 0
    reads_enqueued_for_memory: 0
    reads_admitted_immediately: 0
    reads_queued_because_ready_list: 0
    reads_queued_because_need_cpu_permits: 0
    reads_queued_because_memory_resources: 0
    reads_queued_because_count_resources: 0
    reads_queued_with_eviction: 0
    total_permits: 0
    current_permits: 0
    need_cpu_permits: 0
    awaits_permits: 0
    disk_reads: 0
    sstables_read: 0

Fixes: scylladb/scylladb#7400

Closes scylladb/scylladb#21692
2024-11-28 18:52:29 +02:00
Botond Dénes
ff90a77f5b scylla-sstable: revamp schema sources
Demote --scylla-data-dir and --scylla-yaml-file to schema source
helpers, rather than schema sources in themselves. This practically
means that when these options are used, they won't define where the tool
will attempt to load the schema from; they will just help locate the
schema, for whichever schema source the tool was instructed to use
(or left to choose).
--scylla-data-dir and --scylla-yaml-file being schema sources were
problematic with encryption at rest and for S3 support (not yet
implemented). With encryption, the tool needs access to the
configuration, so --scylla-yaml-file is often used to provide the path
to the configuration file, which contains the encryption configuration
needed for the tool to decrypt the sstable. Currently, using this option
implies forcing the tool to read the schema from the schema tables,
which is a problematic option for tests -- Scylla might be compacting a
schema sstable and this will make the tool fail to load the schema.
Demoting these options to schema helpers allows providing them, while
at the same time having the option to use a different schema source.

To allow the user to force the tool to load the schema from the schema
tables, a new --schema-tables option is added. Similarly, a
--sstable-schema option is introduced to force the tool to load the
schema from the sstable itself.

With this, each of the 4 schema sources now has an option to force the
use of said schema source. There are various helper options to be used
along with these.

The documentation as well as the tests are updated with the changes.
The schema-related documentation gets a rather extensive facelift
because it was a bit out-of-date and incomplete.

Fixes: scylladb/scylladb#20534

Closes scylladb/scylladb#21678
2024-11-28 18:36:09 +02:00
Kefu Chai
2c9c654798 build: cmake: Enforce explicit library linkage visibility
This change improves dependency management by explicitly specifying
library linkage visibility in CMake targets.

Previously, some ScyllaDB targets used `target_link_libraries()`
without `PUBLIC` or `PRIVATE` keywords, which resulted in transitive
library dependencies by default. This unintentionally exposed
non-public dependencies to downstream targets.

Changes:
- Always use explicit `PRIVATE` or `PUBLIC` keywords with
  `target_link_libraries()`
- Tighten build dependency tree
- Enforce a more modular linkage model

See: [CMake documentation on library dependencies](https://cmake.org/cmake/help/latest/command/target_link_libraries.html)

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21686
2024-11-28 18:15:23 +02:00
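
As an illustration (hypothetical targets, not ScyllaDB's actual CMake files), the difference the keywords make:

```cmake
# Without a keyword, target_link_libraries() defaults to transitive
# (PUBLIC-like) linkage, leaking util's dependencies downstream.
add_library(util util.cc)

# PRIVATE: fmt is an implementation detail; targets linking util
# do not inherit it.
target_link_libraries(util PRIVATE fmt::fmt)

# PUBLIC: Seastar types appear in util's public headers, so the
# dependency must propagate to consumers of util.
target_link_libraries(util PUBLIC Seastar::seastar)
```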
Piotr Smaron
a49ed7074d Update in-memory ks.metadata.init_tablets after ALTER KS
Once e.g. `ALTER KEYSPACE` is performed, all in-memory objects should be updated accordingly, but this was not entirely true for the keyspace metadata object. The reason is that keyspace metadata is stored in 2 system tables: `system_schema.keyspaces` and `system_schema.scylla_keyspaces`. Up until now the in-memory keyspace metadata object has been updated only with entries from the first table, and missed updates when entries in the 2nd table changed. These entries were e.g. initial tablets or storage options.
This change fixes this oversight by considering both tables when checking whether keyspace metadata needs to be updated. From the implementation point of view, the change is simple: we also consider `system_schema.scylla_keyspaces` in `merge_keyspaces()`, and if the old and new schemas have any differences, we include that when altering the keyspace.

Fixes #20768

Backport: not needed. I don't think the issue is severe; at the moment it seems it can only influence the tablets number, which should not bring the cluster down nor result in returning bad data, and mostly influences the speed of the db.

Closes scylladb/scylladb#20852
2024-11-28 13:46:32 +01:00
Nikos Dragazis
6091d5d789 sstables: Fix range of input stream in checksummed file data source
The checksummed file data source uses the chunk size to enforce that the
reads from the underlying file input stream will be aligned at the chunk
boundary. This is necessary so that we can validate the checksum of each
chunk.

However, a mismatch in the numeric types caused a bug where the
underlying file input stream would read a smaller portion of the data
file than expected.

The bug is located in the following lines:

```
auto start = _beg_pos & ~(chunk_size - 1);
auto end = (_end_pos & ~(chunk_size - 1)) + chunk_size;
```

`_beg_pos` and `_end_pos` are `uint64_t`, whereas `chunk_size` is
`uint32_t`. When executing the AND operation, the compiler converts the
right operand from `uint32_t` to `uint64_t`. Since the integer is
unsigned, the four most-significant bytes are filled with zeros, thus
erroneously truncating the corresponding bytes of the position.

Fix the bug by explicitly converting the chunk size to `uint64_t` before
any arithmetic operations. Also, replace the handwritten alignment
implementations with the `align_up()` and `align_down()` helpers.

Finally, restrict the file end position to not exceed the file length.
Since the last chunk can be smaller than the chunk size, it could happen
that the end position exceeds the file length after the round-up. This
is not a bug on its own since `make_file_input_stream()` can accept
lengths that go beyond end-of-file, but still it makes the code more
error prone and should be avoided.
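
The truncation can be reproduced in isolation; this is a minimal standalone illustration of the type issue, not the actual ScyllaDB code:

```cpp
#include <cassert>
#include <cstdint>

// ~(chunk_size - 1) is computed in 32 bits and only then zero-extended,
// so the mask clears the upper 32 bits of the 64-bit position.
uint64_t buggy_align_down(uint64_t pos, uint32_t chunk_size) {
    return pos & ~(chunk_size - 1);   // mask is only 32 bits wide
}

uint64_t fixed_align_down(uint64_t pos, uint32_t chunk_size) {
    uint64_t cs = chunk_size;         // widen before any arithmetic
    return pos & ~(cs - 1);
}
```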

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb/scylladb#21665
2024-11-28 12:53:05 +02:00
Dawid Mędrek
7cce9a8f64 db/hints: Prevent dereferencing a null pointer
Before these changes, we dereferenced `app_state` in
`manager::endpoint_downtime_not_bigger_than()` before checking that it's
not a null pointer. We fix that.

Fixes scylladb/scylladb#21699

Closes scylladb/scylladb#21676
2024-11-28 11:31:57 +01:00
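
The fix pattern can be sketched as follows, with hypothetical types standing in for the real gossiper application state:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for the gossiper's per-endpoint application state.
struct app_state_t { std::chrono::seconds downtime{0}; };

// Check the pointer before dereferencing it; with no state there is
// nothing to compare, so bail out safely.
bool endpoint_downtime_not_bigger_than(const app_state_t* app_state,
                                       std::chrono::seconds limit) {
    if (!app_state) {
        return false;
    }
    return app_state->downtime <= limit;
}
```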
Ernest Zaslavsky
4035e0877d s3_tests: Add s3 test to check object re-uploading
Add s3 test to check existing object re-uploading succeeds

Closes scylladb/scylladb#21544
2024-11-28 12:46:59 +03:00