Commit e4dba78

Standardize metric units and improve documentation sync (#2588)
- Standardized unit capitalization across all architectures (gfx908, gfx90a, gfx940, gfx941, gfx942, gfx950): 'Work-Items' → 'Work-items', and capitalized the word after 'per' (e.g., 'per cycle' → 'per Cycle')
- Pluralized operation counts: 'FLOP' → 'FLOPs', 'IOP' → 'IOPS'
- Fixed metric extraction to handle both 'unit' and 'units' field names
- Fixed section assignment for metrics without units in documentation
- Restored UTCL1 and vL1D sections
- Updated .gitignore to exclude the .backups/ directory
- Regenerated delta files, config hashes, per-arch definitions, and docs
- Reformatted delta files so that future edits by the config management workflow are deterministic
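The 'unit' versus 'units' fix in the commit message can be pictured with a small lookup helper. This is a minimal sketch, not the code from this commit: the function name `get_metric_unit` and the plain-dictionary shape of a metric entry are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def get_metric_unit(metric: Mapping[str, Any]) -> Optional[str]:
    """Return a metric's unit string, accepting either field name.

    Hypothetical helper: checks 'unit' first, then 'units', and returns
    None when neither key carries a value.
    """
    for key in ("unit", "units"):
        value = metric.get(key)
        if value is not None:
            return str(value)
    return None


if __name__ == "__main__":
    print(get_metric_unit({"value": "AVG(x)", "unit": "Percent"}))
    print(get_metric_unit({"value": "AVG(y)", "units": "Cycles"}))
    print(get_metric_unit({"value": "AVG(z)"}))
```

Returning `None` rather than an empty string keeps "no unit" distinguishable from a real unit, which matters for the section-assignment fix that the commit message lists separately.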
1 parent a4e2e40 commit e4dba78

84 files changed: +4420 -10897 lines changed

projects/rocprofiler-compute/.gitignore

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,6 @@
 # mongodb_connector files
 __pycache__
+.cline_storage
 
 # edit files
 *~
@@ -23,3 +24,6 @@ VERSION.sha
 # documentation artifacts
 /_build
 _toc.yml
+
+# Backup directories
+.backups/

projects/rocprofiler-compute/CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,9 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.
 * Roofline binaries compiled from [rocm-amdgpu-bench](https://github.com/ROCm/rocm-amdgpu-bench) repository have been removed from the project, as Roofline runtime compilation performs the same work as the Roofline binaries.
 * You can collect standalone Roofline empirical peaks without running the entire ROCm Compute Profiler's profile mode, through an entry point in [benchmark.py](https://github.com/ROCm/rocm-systems/blob/HEAD/projects/rocprofiler-compute/src/utils/benchmark.py). Running the `benchmark.py` Python file replaces calling standalone Roofline binary.
 
+* Synced latest metric descriptions to public facing documentation
+* Updated metric units to be more human readable in public facing documentation
+
 ### Changed
 
 * Default output format for the underlying ROCprofiler-SDK tool has been changed from ``csv`` to ``rocpd``.

projects/rocprofiler-compute/docs/data/metrics_description.yaml

Lines changed: 663 additions & 1363 deletions
Large diffs are not rendered by default.

projects/rocprofiler-compute/src/rocprof_compute_soc/analysis_configs/gfx908/0200_system_speed_of_light.yaml

Lines changed: 13 additions & 13 deletions
@@ -55,39 +55,39 @@ Panel Config:
 pop: ((100 * $numActiveCUs) / $cu_per_gpu)
 SALU Utilization:
 value: AVG(((100 * SQ_ACTIVE_INST_SCA) / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu)))
-unit: pct
+unit: Percent
 peak: 100
 pop: AVG(((100 * SQ_ACTIVE_INST_SCA) / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu)))
 VALU Utilization:
 value: AVG(((100 * SQ_ACTIVE_INST_VALU) / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu)))
-unit: pct
+unit: Percent
 peak: 100
 pop: AVG(((100 * SQ_ACTIVE_INST_VALU) / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu)))
 MFMA Utilization:
 value: None
-unit: pct
+unit: Percent
 peak: 100
 pop: None
 VMEM Utilization:
 value: None
-unit: pct
+unit: Percent
 peak: 100
 pop: None
 Branch Utilization:
 value: None
-unit: pct
+unit: Percent
 peak: 100
 pop: None
 VALU Active Threads:
 value: AVG(((SQ_THREAD_CYCLES_VALU / SQ_ACTIVE_INST_VALU) if (SQ_ACTIVE_INST_VALU
 != 0) else None))
-unit: Threads
+unit: Work-items
 peak: $wave_size
 pop: (100 * AVG((SQ_THREAD_CYCLES_VALU / SQ_ACTIVE_INST_VALU / $wave_size)
 if (SQ_ACTIVE_INST_VALU != 0) else None))
 IPC:
 value: AVG((SQ_INSTS / SQ_BUSY_CU_CYCLES))
-unit: Instr/cycle
+unit: Instructions per Cycle
 peak: 5
 pop: ((100 * AVG((SQ_INSTS / SQ_BUSY_CU_CYCLES))) / 5)
 Wavefront Occupancy:
@@ -107,7 +107,7 @@ Panel Config:
 LDS Bank Conflicts/Access:
 value: AVG(((SQ_LDS_BANK_CONFLICT / (SQ_LDS_IDX_ACTIVE - SQ_LDS_BANK_CONFLICT))
 if ((SQ_LDS_IDX_ACTIVE - SQ_LDS_BANK_CONFLICT) != 0) else None))
-unit: Conflicts/access
+unit: Conflicts per Access
 peak: 32
 pop: ((100 * AVG(((SQ_LDS_BANK_CONFLICT / (SQ_LDS_IDX_ACTIVE - SQ_LDS_BANK_CONFLICT))
 if ((SQ_LDS_IDX_ACTIVE - SQ_LDS_BANK_CONFLICT) != 0) else None))) / 32)
@@ -116,7 +116,7 @@ Panel Config:
 + TCP_TCC_ATOMIC_WITH_RET_REQ_sum) + TCP_TCC_ATOMIC_WITHOUT_RET_REQ_sum))
 / TCP_TOTAL_CACHE_ACCESSES_sum)) if (TCP_TOTAL_CACHE_ACCESSES_sum != 0)
 else None))
-unit: pct
+unit: Percent
 peak: 100
 pop: AVG(((100 - ((100 * (((TCP_TCC_READ_REQ_sum + TCP_TCC_WRITE_REQ_sum)
 + TCP_TCC_ATOMIC_WITH_RET_REQ_sum) + TCP_TCC_ATOMIC_WITHOUT_RET_REQ_sum))
@@ -131,7 +131,7 @@ Panel Config:
 L2 Cache Hit Rate:
 value: AVG((((100 * TCC_HIT_sum) / (TCC_HIT_sum + TCC_MISS_sum)) if ((TCC_HIT_sum
 + TCC_MISS_sum) != 0) else None))
-unit: pct
+unit: Percent
 peak: 100
 pop: AVG((((100 * TCC_HIT_sum) / (TCC_HIT_sum + TCC_MISS_sum)) if ((TCC_HIT_sum
 + TCC_MISS_sum) != 0) else None))
@@ -172,7 +172,7 @@ Panel Config:
 sL1D Cache Hit Rate:
 value: AVG((((100 * SQC_DCACHE_HITS) / (SQC_DCACHE_HITS + SQC_DCACHE_MISSES))
 if ((SQC_DCACHE_HITS + SQC_DCACHE_MISSES) != 0) else None))
-unit: pct
+unit: Percent
 peak: 100
 pop: AVG((((100 * SQC_DCACHE_HITS) / (SQC_DCACHE_HITS + SQC_DCACHE_MISSES))
 if ((SQC_DCACHE_HITS + SQC_DCACHE_MISSES) != 0) else None))
@@ -184,7 +184,7 @@ Panel Config:
 64))) / ((($max_sclk / 1000) * 64) * $sqc_per_gpu))
 L1I Hit Rate:
 value: AVG(((100 * SQC_ICACHE_HITS) / (SQC_ICACHE_HITS + SQC_ICACHE_MISSES)))
-unit: pct
+unit: Percent
 peak: 100
 pop: AVG(((100 * SQC_ICACHE_HITS) / (SQC_ICACHE_HITS + SQC_ICACHE_MISSES)))
 L1I BW:
@@ -201,7 +201,7 @@ Panel Config:
 coll_level: SQ_IFETCH_LEVEL
 CU Utilization:
 value: AVG(100 * SQ_BUSY_CU_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 peak: 100
 pop: AVG(100 * SQ_BUSY_CU_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 metrics_description:
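The unit renames in the file above follow a fixed old-to-new mapping, which can be sketched as a lookup table. This is an illustrative helper only: the names `UNIT_RENAMES` and `normalize_unit` are hypothetical, and the table holds only the renames visible in this gfx908 file, not the full set the commit applies.

```python
# Renames observed in 0200_system_speed_of_light.yaml (gfx908) in this diff.
# Hypothetical table for illustration; the real workflow may be organized differently.
UNIT_RENAMES = {
    "pct": "Percent",
    "Pct": "Percent",
    "Threads": "Work-items",
    "Instr/cycle": "Instructions per Cycle",
    "Conflicts/access": "Conflicts per Access",
}


def normalize_unit(unit: str) -> str:
    """Map a legacy unit label to its standardized form.

    Labels that are not in the table are returned unchanged (after
    trimming whitespace), so the function is safe to run repeatedly.
    """
    cleaned = unit.strip()
    return UNIT_RENAMES.get(cleaned, cleaned)


if __name__ == "__main__":
    for legacy in ("pct", "Pct", "Instr/cycle", "Percent"):
        print(f"{legacy!r} -> {normalize_unit(legacy)!r}")
```

Because already-standardized labels pass through untouched, applying the mapping twice gives the same result as applying it once, which fits the commit's stated goal of deterministic edits.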

projects/rocprofiler-compute/src/rocprof_compute_soc/analysis_configs/gfx908/0500_command_processor_cpc_cpf.yaml

Lines changed: 12 additions & 12 deletions
@@ -20,39 +20,39 @@ Panel Config:
 if ((CPF_CPF_STAT_BUSY + CPF_CPF_STAT_IDLE) != 0) else None))
 max: MAX((((100 * CPF_CPF_STAT_BUSY) / (CPF_CPF_STAT_BUSY + CPF_CPF_STAT_IDLE))
 if ((CPF_CPF_STAT_BUSY + CPF_CPF_STAT_IDLE) != 0) else None))
-unit: pct
+unit: Percent
 CPF Stall:
 avg: AVG((((100 * CPF_CPF_STAT_STALL) / CPF_CPF_STAT_BUSY) if (CPF_CPF_STAT_BUSY
 != 0) else None))
 min: MIN((((100 * CPF_CPF_STAT_STALL) / CPF_CPF_STAT_BUSY) if (CPF_CPF_STAT_BUSY
 != 0) else None))
 max: MAX((((100 * CPF_CPF_STAT_STALL) / CPF_CPF_STAT_BUSY) if (CPF_CPF_STAT_BUSY
 != 0) else None))
-unit: pct
+unit: Percent
 CPF-L2 Utilization:
 avg: AVG((((100 * CPF_CPF_TCIU_BUSY) / (CPF_CPF_TCIU_BUSY + CPF_CPF_TCIU_IDLE))
 if ((CPF_CPF_TCIU_BUSY + CPF_CPF_TCIU_IDLE) != 0) else None))
 min: MIN((((100 * CPF_CPF_TCIU_BUSY) / (CPF_CPF_TCIU_BUSY + CPF_CPF_TCIU_IDLE))
 if ((CPF_CPF_TCIU_BUSY + CPF_CPF_TCIU_IDLE) != 0) else None))
 max: MAX((((100 * CPF_CPF_TCIU_BUSY) / (CPF_CPF_TCIU_BUSY + CPF_CPF_TCIU_IDLE))
 if ((CPF_CPF_TCIU_BUSY + CPF_CPF_TCIU_IDLE) != 0) else None))
-unit: pct
+unit: Percent
 CPF-L2 Stall:
 avg: AVG((((100 * CPF_CPF_TCIU_STALL) / CPF_CPF_TCIU_BUSY) if (CPF_CPF_TCIU_BUSY
 != 0) else None))
 min: MIN((((100 * CPF_CPF_TCIU_STALL) / CPF_CPF_TCIU_BUSY) if (CPF_CPF_TCIU_BUSY
 != 0) else None))
 max: MAX((((100 * CPF_CPF_TCIU_STALL) / CPF_CPF_TCIU_BUSY) if (CPF_CPF_TCIU_BUSY
 != 0) else None))
-unit: pct
+unit: Percent
 CPF-UTCL1 Stall:
 avg: AVG(((100 * CPF_CMP_UTCL1_STALL_ON_TRANSLATION) / CPF_CPF_STAT_BUSY)
 if (CPF_CPF_STAT_BUSY != 0) else None)
 min: MIN(((100 * CPF_CMP_UTCL1_STALL_ON_TRANSLATION) / CPF_CPF_STAT_BUSY)
 if (CPF_CPF_STAT_BUSY != 0) else None)
 max: MAX(((100 * CPF_CMP_UTCL1_STALL_ON_TRANSLATION) / CPF_CPF_STAT_BUSY)
 if (CPF_CPF_STAT_BUSY != 0) else None)
-unit: pct
+unit: Percent
 - metric_table:
 id: 502
 title: Command processor packet processor (CPC)
@@ -70,55 +70,55 @@ Panel Config:
 if ((CPC_CPC_STAT_BUSY + CPC_CPC_STAT_IDLE) != 0) else None))
 max: MAX((((100 * CPC_CPC_STAT_BUSY) / (CPC_CPC_STAT_BUSY + CPC_CPC_STAT_IDLE))
 if ((CPC_CPC_STAT_BUSY + CPC_CPC_STAT_IDLE) != 0) else None))
-unit: pct
+unit: Percent
 CPC Stall Rate:
 avg: AVG((((100 * CPC_CPC_STAT_STALL) / CPC_CPC_STAT_BUSY) if (CPC_CPC_STAT_BUSY
 != 0) else None))
 min: MIN((((100 * CPC_CPC_STAT_STALL) / CPC_CPC_STAT_BUSY) if (CPC_CPC_STAT_BUSY
 != 0) else None))
 max: MAX((((100 * CPC_CPC_STAT_STALL) / CPC_CPC_STAT_BUSY) if (CPC_CPC_STAT_BUSY
 != 0) else None))
-unit: pct
+unit: Percent
 CPC Packet Decoding Utilization:
 avg: AVG((100 * CPC_ME1_BUSY_FOR_PACKET_DECODE) / CPC_CPC_STAT_BUSY if (CPC_CPC_STAT_BUSY
 != 0) else None)
 min: MIN((100 * CPC_ME1_BUSY_FOR_PACKET_DECODE) / CPC_CPC_STAT_BUSY if (CPC_CPC_STAT_BUSY
 != 0) else None)
 max: MAX((100 * CPC_ME1_BUSY_FOR_PACKET_DECODE) / CPC_CPC_STAT_BUSY if (CPC_CPC_STAT_BUSY
 != 0) else None)
-unit: pct
+unit: Percent
 CPC-Workgroup Manager Utilization:
 avg: AVG((100 * CPC_ME1_DC0_SPI_BUSY) / CPC_CPC_STAT_BUSY if (CPC_CPC_STAT_BUSY
 != 0) else None)
 min: MIN((100 * CPC_ME1_DC0_SPI_BUSY) / CPC_CPC_STAT_BUSY if (CPC_CPC_STAT_BUSY
 != 0) else None)
 max: MAX((100 * CPC_ME1_DC0_SPI_BUSY) / CPC_CPC_STAT_BUSY if (CPC_CPC_STAT_BUSY
 != 0) else None)
-unit: Pct
+unit: Percent
 CPC-L2 Utilization:
 avg: AVG((((100 * CPC_CPC_TCIU_BUSY) / (CPC_CPC_TCIU_BUSY + CPC_CPC_TCIU_IDLE))
 if ((CPC_CPC_TCIU_BUSY + CPC_CPC_TCIU_IDLE) != 0) else None))
 min: MIN((((100 * CPC_CPC_TCIU_BUSY) / (CPC_CPC_TCIU_BUSY + CPC_CPC_TCIU_IDLE))
 if ((CPC_CPC_TCIU_BUSY + CPC_CPC_TCIU_IDLE) != 0) else None))
 max: MAX((((100 * CPC_CPC_TCIU_BUSY) / (CPC_CPC_TCIU_BUSY + CPC_CPC_TCIU_IDLE))
 if ((CPC_CPC_TCIU_BUSY + CPC_CPC_TCIU_IDLE) != 0) else None))
-unit: pct
+unit: Percent
 CPC-UTCL1 Stall:
 avg: AVG(((100 * CPC_UTCL1_STALL_ON_TRANSLATION) / CPC_CPC_STAT_BUSY) if
 (CPC_CPC_STAT_BUSY != 0) else None)
 min: MIN(((100 * CPC_UTCL1_STALL_ON_TRANSLATION) / CPC_CPC_STAT_BUSY) if
 (CPC_CPC_STAT_BUSY != 0) else None)
 max: MAX(((100 * CPC_UTCL1_STALL_ON_TRANSLATION) / CPC_CPC_STAT_BUSY) if
 (CPC_CPC_STAT_BUSY != 0) else None)
-unit: pct
+unit: Percent
 CPC-UTCL2 Utilization:
 avg: AVG((((100 * CPC_CPC_UTCL2IU_BUSY) / (CPC_CPC_UTCL2IU_BUSY + CPC_CPC_UTCL2IU_IDLE))
 if ((CPC_CPC_UTCL2IU_BUSY + CPC_CPC_UTCL2IU_IDLE) != 0) else None))
 min: MIN((((100 * CPC_CPC_UTCL2IU_BUSY) / (CPC_CPC_UTCL2IU_BUSY + CPC_CPC_UTCL2IU_IDLE))
 if ((CPC_CPC_UTCL2IU_BUSY + CPC_CPC_UTCL2IU_IDLE) != 0) else None))
 max: MAX((((100 * CPC_CPC_UTCL2IU_BUSY) / (CPC_CPC_UTCL2IU_BUSY + CPC_CPC_UTCL2IU_IDLE))
 if ((CPC_CPC_UTCL2IU_BUSY + CPC_CPC_UTCL2IU_IDLE) != 0) else None))
-unit: Percent
 metrics_description:
 CPF Utilization: Percent of total cycles where the CPF was busy actively doing
 any work. The ratio of CPF busy cycles over total cycles counted by the CPF.
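Most utilization expressions in this file share one shape: `100 * busy / (busy + idle)`, guarded so that a zero denominator yields `None` instead of a division error. The sketch below restates that pattern in plain Python for a single sample. The function name `busy_percent` is hypothetical, and the real expressions are evaluated by the profiler's own expression engine, not by this code.

```python
from typing import Optional


def busy_percent(busy: float, idle: float) -> Optional[float]:
    """Percent of counted cycles that were busy, or None if nothing was counted.

    Mirrors the guarded form used in the YAML above:
    (100 * BUSY) / (BUSY + IDLE) if (BUSY + IDLE) != 0 else None
    """
    total = busy + idle
    if total == 0:
        return None
    return 100 * busy / total


if __name__ == "__main__":
    print(busy_percent(30, 70))  # busy for 30 of 100 counted cycles
    print(busy_percent(0, 0))    # no cycles counted, so no meaningful percentage
```

Returning `None` for an empty sample, instead of 0, keeps "the block was never observed" separate from "the block was observed and idle".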

projects/rocprofiler-compute/src/rocprof_compute_soc/analysis_configs/gfx908/0600_workgroup_manager_spi.yaml

Lines changed: 16 additions & 16 deletions
@@ -17,30 +17,30 @@ Panel Config:
 avg: AVG(100 * $GRBM_GUI_ACTIVE_PER_XCD / $GRBM_COUNT_PER_XCD)
 min: MIN(100 * $GRBM_GUI_ACTIVE_PER_XCD / $GRBM_COUNT_PER_XCD)
 max: MAX(100 * $GRBM_GUI_ACTIVE_PER_XCD / $GRBM_COUNT_PER_XCD)
-unit: Pct
+unit: Percent
 Scheduler-Pipe Utilization:
 avg: AVG(100 * SPI_CSN_BUSY / ($GRBM_GUI_ACTIVE_PER_XCD * $pipes_per_gpu
 * $se_per_gpu))
 min: MIN(100 * SPI_CSN_BUSY / ($GRBM_GUI_ACTIVE_PER_XCD * $pipes_per_gpu
 * $se_per_gpu))
 max: MAX(100 * SPI_CSN_BUSY / ($GRBM_GUI_ACTIVE_PER_XCD * $pipes_per_gpu
 * $se_per_gpu))
-unit: Pct
+unit: Percent
 Workgroup Manager Utilization:
 avg: AVG(100 * $GRBM_SPI_BUSY_PER_XCD / $GRBM_GUI_ACTIVE_PER_XCD)
 min: MIN(100 * $GRBM_SPI_BUSY_PER_XCD / $GRBM_GUI_ACTIVE_PER_XCD)
 max: MAX(100 * $GRBM_SPI_BUSY_PER_XCD / $GRBM_GUI_ACTIVE_PER_XCD)
-unit: Pct
+unit: Percent
 Shader Engine Utilization:
 avg: AVG(100 * SQ_BUSY_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $se_per_gpu))
 min: MIN(100 * SQ_BUSY_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $se_per_gpu))
 max: MAX(100 * SQ_BUSY_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $se_per_gpu))
-unit: Pct
+unit: Percent
 SIMD Utilization:
 avg: AVG(100 * SQ_BUSY_CU_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(100 * SQ_BUSY_CU_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(100 * SQ_BUSY_CU_CYCLES / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Dispatched Workgroups:
 avg: AVG(SPI_CSN_NUM_THREADGROUPS)
 min: MIN(SPI_CSN_NUM_THREADGROUPS)
@@ -84,66 +84,66 @@ Panel Config:
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
 max: MAX((100 * SPI_RA_REQ_NO_ALLOC_CSN / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
-unit: Pct
+unit: Percent
 Not-scheduled Rate (Scheduler-Pipe):
 avg: AVG((100 * SPI_RA_REQ_NO_ALLOC / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
 min: MIN((100 * SPI_RA_REQ_NO_ALLOC / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
 max: MAX((100 * SPI_RA_REQ_NO_ALLOC / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
-unit: Pct
+unit: Percent
 Scheduler-Pipe Stall Rate:
 avg: AVG((((100 * SPI_RA_RES_STALL_CSN) / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None))
 min: MIN((((100 * SPI_RA_RES_STALL_CSN) / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None))
 max: MAX((((100 * SPI_RA_RES_STALL_CSN) / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None))
-unit: Pct
+unit: Percent
 Scratch Stall Rate:
 avg: AVG((100 * SPI_RA_TMP_STALL_CSN / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
 min: MIN((100 * SPI_RA_TMP_STALL_CSN / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
 max: MAX((100 * SPI_RA_TMP_STALL_CSN / ($GRBM_SPI_BUSY_PER_XCD * $se_per_gpu))
 if ($GRBM_SPI_BUSY_PER_XCD != 0) else None)
-unit: Pct
+unit: Percent
 Insufficient SIMD Waveslots:
 avg: AVG(100 * SPI_RA_WAVE_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(100 * SPI_RA_WAVE_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(100 * SPI_RA_WAVE_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Insufficient SIMD VGPRs:
 avg: AVG(100 * SPI_RA_VGPR_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(100 * SPI_RA_VGPR_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(100 * SPI_RA_VGPR_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Insufficient SIMD SGPRs:
 avg: AVG(100 * SPI_RA_SGPR_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(100 * SPI_RA_SGPR_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(100 * SPI_RA_SGPR_SIMD_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Insufficient CU LDS:
 avg: AVG(400 * SPI_RA_LDS_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(400 * SPI_RA_LDS_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(400 * SPI_RA_LDS_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Insufficient CU Barriers:
 avg: AVG(400 * SPI_RA_BAR_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(400 * SPI_RA_BAR_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(400 * SPI_RA_BAR_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Reached CU Workgroup Limit:
 avg: AVG(400 * SPI_RA_TGLIM_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(400 * SPI_RA_TGLIM_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(400 * SPI_RA_TGLIM_CU_FULL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 Reached CU Wavefront Limit:
 avg: AVG(400 * SPI_RA_WVLIM_STALL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 min: MIN(400 * SPI_RA_WVLIM_STALL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
 max: MAX(400 * SPI_RA_WVLIM_STALL_CSN / ($GRBM_GUI_ACTIVE_PER_XCD * $cu_per_gpu))
-unit: Pct
+unit: Percent
 metrics_description:
 Accelerator Utilization: The percent of cycles in the kernel where the accelerator
 was actively doing any work.

projects/rocprofiler-compute/src/rocprof_compute_soc/analysis_configs/gfx908/0700_wavefront.yaml

Lines changed: 6 additions & 6 deletions
@@ -17,12 +17,12 @@ Panel Config:
 avg: AVG(Grid_Size)
 min: MIN(Grid_Size)
 max: MAX(Grid_Size)
-unit: Work Items
+unit: Work-items
 Workgroup Size:
 avg: AVG(Workgroup_Size)
 min: MIN(Workgroup_Size)
 max: MAX(Workgroup_Size)
-unit: Work Items
+unit: Work-items
 Total Wavefronts:
 avg: AVG(SPI_CSN_WAVE)
 min: MIN(SPI_CSN_WAVE)
@@ -62,7 +62,7 @@ Panel Config:
 avg: AVG(Scratch_Per_Workitem)
 min: MIN(Scratch_Per_Workitem)
 max: MAX(Scratch_Per_Workitem)
-unit: Bytes/Workitem
+unit: Bytes per Work-item
 - metric_table:
 id: 702
 title: Wavefront Runtime Stats
@@ -77,17 +77,17 @@ Panel Config:
 avg: AVG((End_Timestamp - Start_Timestamp))
 min: MIN((End_Timestamp - Start_Timestamp))
 max: MAX((End_Timestamp - Start_Timestamp))
-unit: ns
+unit: Nanoseconds
 Kernel Time (Cycles):
 avg: AVG($GRBM_GUI_ACTIVE_PER_XCD)
 min: MIN($GRBM_GUI_ACTIVE_PER_XCD)
 max: MAX($GRBM_GUI_ACTIVE_PER_XCD)
-unit: Cycle
+unit: Cycles
 Instructions per wavefront:
 avg: AVG((SQ_INSTS / SQ_WAVES))
 min: MIN((SQ_INSTS / SQ_WAVES))
 max: MAX((SQ_INSTS / SQ_WAVES))
-unit: Instr/wavefront
+unit: Instructions per Wavefront
 Wave Cycles:
 avg: AVG(((4 * SQ_WAVE_CYCLES) / $denom))
 min: MIN(((4 * SQ_WAVE_CYCLES) / $denom))