Botond Dénes
055a36ae55
main: dump diagnostics on SIGQUIT
Dump a diagnostics report on each shard when receiving a SIGQUIT. The
report is logged with a dedicated logger, called diagnostics.
The report has multiple parts:
* seastar memory diagnostics, similar to that printed by the scylla
memory command (from scylla-gdb.py).
* reader concurrency semaphore diagnostics for each semaphore.
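The mechanism described above (a signal handler that writes a multi-part report through a dedicated logger) can be sketched in a few lines. This is a minimal illustration in Python, not the ScyllaDB implementation, which is C++ and runs on every shard. The function names `build_report`, `on_sigquit` and `install_handler`, as well as the placeholder report sections, are assumptions made for this sketch. Only the logger name `diagnostics` and the first report line come from the commit message.

```python
import logging
import os
import signal

# Dedicated logger; the name "diagnostics" is taken from the commit message.
diag_log = logging.getLogger("diagnostics")


def build_report() -> str:
    """Assemble a multi-part report. The section bodies are placeholders."""
    parts = [
        "Diagnostics dump requested via SIGQUIT:",
        "Dumping memory diagnostics (placeholder section)",
        "Dumping semaphore diagnostics (placeholder section)",
    ]
    return "\n".join(parts)


def on_sigquit(signum, frame) -> None:
    # CPython runs Python-level handlers in the main thread, so a
    # logging call here is acceptable for a sketch.
    diag_log.info(build_report())


def install_handler() -> None:
    # Replaces the default SIGQUIT action (terminate and dump core).
    signal.signal(signal.SIGQUIT, on_sigquit)


if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(levelname)s %(name)s - %(message)s",
    )
    install_handler()
    # Equivalent to running `kill -QUIT <pid>` from a shell.
    os.kill(os.getpid(), signal.SIGQUIT)
```

Against a running process, the report would be requested with `kill -QUIT <pid>`; the process keeps running because the handler replaces the default action.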
Example report:
INFO 2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
Dumping seastar memory diagnostics
Used memory: 3988M
Free memory: 58M
Total memory: 4G
Hard failures: 0
LSA
allocated: 4M
used: 16
free: 4G
Cache:
total: 1M
used: 642K
free: 398K
Memtables:
total: 3M
Regular:
real dirty: 0B
virt dirty: 0B
System:
real dirty: 3M
virt dirty: 3M
Replica:
Read Concurrency Semaphores:
user: 0/100, 0B/81M, queued: 0
streaming: 0/10, 0B/81M, queued: 0
system: 0/10, 0B/81M, queued: 0
compaction: 0/unlimited, 0B/unlimited
view update: 0/50, 0B/40M, queued: 0
Execution Stages:
apply stage:
Total: 0
Tables - Ongoing Operations:
Pending writes (top 10):
0 Total (all)
Pending reads (top 10):
0 Total (all)
Pending streams (top 10):
0 Total (all)
Small pools:
objsz spansz usedobj memory unused wst%
8 4K 858 16K 9K 58
10 4K 5 8K 8K 99
12 4K 5 8K 8K 99
14 4K 0 0B 0B 0
16 4K 2k 44K 15K 35
32 4K 4k 136K 16K 11
32 4K 8k 280K 24K 8
32 4K 3k 92K 6K 6
32 4K 4k 140K 21K 14
48 4K 3k 180K 25K 14
48 4K 2k 120K 27K 22
64 4K 2k 156K 18K 11
64 4K 19k 1M 11K 0
80 4K 3k 236K 16K 6
96 4K 6k 572K 49K 8
112 4K 2k 276K 72K 25
128 4K 477 80K 20K 25
160 4K 194 60K 30K 49
192 4K 1k 232K 39K 16
224 4K 2k 468K 15K 3
256 4K 182 100K 55K 54
320 8K 349 152K 43K 28
384 8K 332 288K 164K 56
448 4K 243 180K 74K 40
512 4K 256 244K 116K 47
640 16K 185 192K 76K 39
768 16K 394 432K 137K 31
896 8K 54 192K 144K 75
1024 4K 288 432K 144K 33
1280 32K 92 256K 140K 54
1536 32K 11 128K 111K 86
1792 16K 10 144K 126K 87
2048 8K 487 1M 90K 8
2560 64K 113 384K 100K 26
3072 64K 9 256K 228K 89
3584 32K 3 288K 277K 96
4096 16K 129 912K 396K 43
5120 128K 21 384K 275K 71
6144 128K 4 512K 486K 94
7168 64K 3 576K 553K 96
8192 32K 373 3M 56K 1
10240 64K 6 832K 770K 92
12288 64K 17 960K 756K 78
14336 128K 2 1M 1M 97
16384 64K 14 1M 992K 81
Page spans:
index size free used spans
0 4K 4K 5M 1k
1 8K 8K 2M 213
2 16K 16K 2M 106
3 32K 64K 6M 200
4 64K 64K 4M 71
5 128K 384K 3934M 31k
6 256K 1M 256K 5
7 512K 512K 512K 2
8 1M 2M 0B 2
9 2M 2M 2M 2
10 4M 4M 0B 1
11 8M 16M 0B 2
12 16M 32M 0B 2
13 32M 0B 32M 1
14 64M 0B 0B 0
15 128M 0B 0B 0
16 256M 0B 0B 0
17 512M 0B 0B 0
18 1G 0B 0B 0
19 2G 0B 0B 0
20 4G 0B 0B 0
21 8G 0B 0B 0
22 16G 0B 0B 0
23 32G 0B 0B 0
24 64G 0B 0B 0
25 128G 0B 0B 0
26 256G 0B 0B 0
27 512G 0B 0B 0
28 1T 0B 0B 0
29 2T 0B 0B 0
30 4T 0B 0B 0
31 8T 0B 0B 0
INFO 2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
Semaphore user with 0/100 count and 0/84850769 memory resources: user request, dumping permit diagnostics:
permits count memory table/operation/state
0 0 0B total
Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 0
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 0
reads_enqueued_for_admission: 0
reads_enqueued_for_memory: 0
reads_admitted_immediately: 0
reads_queued_because_ready_list: 0
reads_queued_because_need_cpu_permits: 0
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 0
current_permits: 0
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
INFO 2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
Semaphore streaming with 0/10 count and 0/84850769 memory resources: user request, dumping permit diagnostics:
permits count memory table/operation/state
0 0 0B total
Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 6
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 6
reads_enqueued_for_admission: 0
reads_enqueued_for_memory: 0
reads_admitted_immediately: 6
reads_queued_because_ready_list: 0
reads_queued_because_need_cpu_permits: 0
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 6
current_permits: 0
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
INFO 2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
Semaphore compaction with 0/2147483647 count and 0/9223372036854775807 memory resources: user request, dumping permit diagnostics:
permits count memory table/operation/state
0 0 0B total
Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 0
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 0
reads_enqueued_for_admission: 0
reads_enqueued_for_memory: 0
reads_admitted_immediately: 0
reads_queued_because_ready_list: 0
reads_queued_because_need_cpu_permits: 0
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 27
current_permits: 0
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
INFO 2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
Semaphore system with 0/10 count and 0/84850769 memory resources: user request, dumping permit diagnostics:
permits count memory table/operation/state
1 0 0B *.*/view_builder/active
1 0 0B total
Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 234
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 234
reads_enqueued_for_admission: 154
reads_enqueued_for_memory: 0
reads_admitted_immediately: 80
reads_queued_because_ready_list: 154
reads_queued_because_need_cpu_permits: 0
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 235
current_permits: 1
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
INFO 2024-11-27 01:31:55,882 [shard 0:main] diagnostics - Diagnostics dump requested via SIGQUIT:
Semaphore view_update with 0/50 count and 0/42425384 memory resources: user request, dumping permit diagnostics:
permits count memory table/operation/state
0 0 0B total
Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 0
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 0
reads_admitted: 0
reads_enqueued_for_admission: 0
reads_enqueued_for_memory: 0
reads_admitted_immediately: 0
reads_queued_because_ready_list: 0
reads_queued_because_need_cpu_permits: 0
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 0
reads_queued_with_eviction: 0
total_permits: 0
current_permits: 0
need_cpu_permits: 0
awaits_permits: 0
disk_reads: 0
sstables_read: 0
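Because each semaphore report follows a regular layout (a header line, a permit table, then `name: value` stats), it is easy to process with a script. The following is a rough sketch of such a parser in Python. The function name `parse_semaphore_dump` and the shape of the returned dictionary are assumptions for illustration only, and the regular expressions are derived solely from the example report above, so other versions of the output may need adjustments.

```python
import re

# Patterns derived from the example report; not an official format spec.
HEADER_RE = re.compile(
    r"Semaphore (?P<name>\S+) with (?P<used_count>\d+)/(?P<max_count>\d+) count "
    r"and (?P<used_memory>\d+)/(?P<max_memory>\d+) memory resources"
)
STAT_RE = re.compile(r"^(?P<key>[a-z_]+): (?P<value>\d+)$")

# Excerpt of the "system" semaphore section from the example report.
SAMPLE = """\
Semaphore system with 0/10 count and 0/84850769 memory resources: user request, dumping permit diagnostics:
permits count memory table/operation/state
1 0 0B *.*/view_builder/active
1 0 0B total
Stats:
reads_admitted: 234
reads_enqueued_for_admission: 154
reads_admitted_immediately: 80
total_permits: 235
current_permits: 1
"""


def parse_semaphore_dump(text: str) -> dict:
    """Parse one semaphore section into a dictionary of integers."""
    result = {"stats": {}}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            result["name"] = header.group("name")
            for field in ("used_count", "max_count", "used_memory", "max_memory"):
                result[field] = int(header.group(field))
            continue
        stat = STAT_RE.match(line)
        if stat:
            result["stats"][stat.group("key")] = int(stat.group("value"))
    return result


if __name__ == "__main__":
    dump = parse_semaphore_dump(SAMPLE)
    print(dump["name"], dump["max_count"], dump["stats"]["reads_admitted"])
    # The byte limit in the header, expressed in MiB for comparison
    # with the summary line in the memory report.
    print(round(dump["max_memory"] / 2**20), "MiB")
```

In the sample, the header's memory limit of 84850769 bytes is about 81 MiB, which matches the `0B/81M` shown for the same semaphore in the summary, and `reads_enqueued_for_admission` plus `reads_admitted_immediately` adds up to `reads_admitted` (154 + 80 = 234).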
Fixes: scylladb/scylladb#7400
Closes scylladb/scylladb#21692
2024-11-28 18:52:29 +02:00
Botond Dénes
e29e836aca
docs/operating-scylla: add a document on diagnostic tools
ScyllaDB has a wide variety of tools and sources of information useful
for diagnosing problems. These are scattered all over the place and,
although most of them are documented, there is currently no document
listing all the relevant tools and information sources for diagnosing a
problem.
This patch adds just that: a document listing the different tools and
information sources, with a brief description of how each can help in
diagnosing problems, and a link to the relevant dedicated documentation
pages.
Closes #12503
2023-02-13 16:30:24 +02:00