Xe GT Statistics
Overview
The Xe driver exposes per-GT statistics through the debugfs filesystem at:
/sys/kernel/debug/dri/<device>/gt<id>/stats
This interface requires the kernel to be built with CONFIG_DEBUG_FS=y.
Reading statistics
Reading the file prints all available statistics, one per line, in
name: value format:
$ cat /sys/kernel/debug/dri/0/gt0/stats
svm_pagefault_count: 0
tlb_inval_count: 1234
...
All values are 64-bit unsigned integers aggregated across all CPUs. Counters accumulate since the driver was loaded or since the last explicit reset. Timing counters use microseconds as their unit; data volume counters use KiB.
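The name: value format described above is simple to consume from a userspace tool. The following is a minimal sketch in C, not part of the driver; the helper names parse_stat_line() and read_stat() are hypothetical, and the path passed to read_stat() is whatever stats file applies on the system at hand.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "name: value" line from the stats file (hypothetical helper).
 * Returns 0 on success, -1 if the line does not match the format,
 * for example a blank line.
 */
static int parse_stat_line(const char *line, char *name, size_t name_len,
                           uint64_t *value)
{
    const char *colon = strchr(line, ':');
    size_t len;

    if (!colon || colon == line)
        return -1;

    len = (size_t)(colon - line);
    if (len >= name_len)
        return -1;

    /* Read the counter value as a 64-bit unsigned integer. */
    if (sscanf(colon + 1, " %" SCNu64, value) != 1)
        return -1;

    memcpy(name, line, len);
    name[len] = '\0';
    return 0;
}

/*
 * Look up a single counter by name in the file at path (hypothetical helper).
 * Returns 0 and stores the value on success, -1 if the file cannot be
 * opened or the counter is not present.
 */
static int read_stat(const char *path, const char *wanted, uint64_t *value)
{
    char line[256], name[128];
    uint64_t v;
    FILE *f = fopen(path, "r");
    int ret = -1;

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        if (parse_stat_line(line, name, sizeof(name), &v) == 0 &&
            strcmp(name, wanted) == 0) {
            *value = v;
            ret = 0;
            break;
        }
    }
    fclose(f);
    return ret;
}
```

For instance, read_stat("/sys/kernel/debug/dri/0/gt0/stats", "tlb_inval_count", &v) would store the current TLB invalidation count in v, provided the caller is allowed to read debugfs.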
Resetting statistics
Writing a boolean true value to the file resets all counters to zero:
$ echo 1 > /sys/kernel/debug/dri/0/gt0/stats
Any value that kstrtobool() parses as true (e.g. 1, y, yes,
on) triggers the reset. Resetting while the GPU is active may yield
unpredictable intermediate values, so reset only when the GPU is idle.
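A tool that resets the counters programmatically can do the same thing as the echo command. The sketch below is a userspace illustration only; reset_gt_stats() is a hypothetical helper name, and it assumes the caller has write access to debugfs, which typically requires root.

```c
#include <stdio.h>

/*
 * Reset all counters by writing "1" to the stats file at path
 * (hypothetical helper). Returns 0 on success, -1 on failure, such as a
 * missing file, insufficient permissions, or a write error.
 */
static int reset_gt_stats(const char *path)
{
    FILE *f = fopen(path, "w");
    int ret = 0;

    if (!f)
        return -1;
    if (fputs("1\n", f) == EOF)
        ret = -1;
    /* Buffered data is flushed on close, so check fclose() as well. */
    if (fclose(f) == EOF)
        ret = -1;
    return ret;
}
```

Checking the result of fclose() matters here because stdio buffers the write; an error from the kernel may only surface when the buffer is flushed.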
enum xe_gt_stats_id
GT statistics identifiers
Constants
XE_GT_STATS_ID_SVM_PAGEFAULT_COUNT
    Total SVM page faults handled.
XE_GT_STATS_ID_TLB_INVAL
    Total GPU Translation Lookaside Buffer (TLB) invalidations issued.
XE_GT_STATS_ID_SVM_TLB_INVAL_COUNT
    TLB invalidations issued during SVM page-fault handling.
XE_GT_STATS_ID_SVM_TLB_INVAL_US
    Cumulative time (µs) waiting for TLB invalidations during SVM page-fault handling.
XE_GT_STATS_ID_VMA_PAGEFAULT_COUNT
    Buffer-object (non-SVM) page faults handled.
XE_GT_STATS_ID_VMA_PAGEFAULT_KB
    Size (KiB) of VMAs involved in buffer-object page fault handling.
XE_GT_STATS_ID_INVALID_PREFETCH_PAGEFAULT_COUNT
    GPU prefetch faults for addresses with no valid backing.
XE_GT_STATS_ID_SVM_4K_PAGEFAULT_COUNT
    SVM page faults resolved by mapping 4K pages.
XE_GT_STATS_ID_SVM_64K_PAGEFAULT_COUNT
    SVM page faults resolved by mapping 64K pages.
XE_GT_STATS_ID_SVM_2M_PAGEFAULT_COUNT
    SVM page faults resolved by mapping 2M pages.
XE_GT_STATS_ID_SVM_4K_VALID_PAGEFAULT_COUNT
    Valid SVM page faults at 4K page size: faults where the GPU mapping was already valid, so they were resolved without creating new mappings.
XE_GT_STATS_ID_SVM_64K_VALID_PAGEFAULT_COUNT
    Valid SVM page faults at 64K page size.
XE_GT_STATS_ID_SVM_2M_VALID_PAGEFAULT_COUNT
    Valid SVM page faults at 2M page size.
XE_GT_STATS_ID_SVM_4K_PAGEFAULT_US
    Cumulative time (µs) handling 4K SVM page faults.
XE_GT_STATS_ID_SVM_64K_PAGEFAULT_US
    Cumulative time (µs) handling 64K SVM page faults.
XE_GT_STATS_ID_SVM_2M_PAGEFAULT_US
    Cumulative time (µs) handling 2M SVM page faults.
XE_GT_STATS_ID_SVM_4K_MIGRATE_COUNT
    4K pages moved from CPU to device memory.
XE_GT_STATS_ID_SVM_64K_MIGRATE_COUNT
    64K pages moved from CPU to device memory.
XE_GT_STATS_ID_SVM_2M_MIGRATE_COUNT
    2M pages moved from CPU to device memory.
XE_GT_STATS_ID_SVM_4K_MIGRATE_US
    Cumulative time (µs) moving 4K pages from CPU to device memory.
XE_GT_STATS_ID_SVM_64K_MIGRATE_US
    Cumulative time (µs) moving 64K pages from CPU to device memory.
XE_GT_STATS_ID_SVM_2M_MIGRATE_US
    Cumulative time (µs) moving 2M pages from CPU to device memory.
XE_GT_STATS_ID_SVM_DEVICE_COPY_US
    Cumulative time (µs) for memory copies to device, across all page sizes.
XE_GT_STATS_ID_SVM_4K_DEVICE_COPY_US
    Cumulative time (µs) for memory copies of 4K pages to device.
XE_GT_STATS_ID_SVM_64K_DEVICE_COPY_US
    Cumulative time (µs) for memory copies of 64K pages to device.
XE_GT_STATS_ID_SVM_2M_DEVICE_COPY_US
    Cumulative time (µs) for memory copies of 2M pages to device.
XE_GT_STATS_ID_SVM_CPU_COPY_US
    Cumulative time (µs) for memory copies to CPU, across all page sizes.
XE_GT_STATS_ID_SVM_4K_CPU_COPY_US
    Cumulative time (µs) for memory copies of 4K pages to CPU.
XE_GT_STATS_ID_SVM_64K_CPU_COPY_US
    Cumulative time (µs) for memory copies of 64K pages to CPU.
XE_GT_STATS_ID_SVM_2M_CPU_COPY_US
    Cumulative time (µs) for memory copies of 2M pages to CPU.
XE_GT_STATS_ID_SVM_DEVICE_COPY_KB
    Data (KiB) copied to device across all page sizes.
XE_GT_STATS_ID_SVM_4K_DEVICE_COPY_KB
    Data (KiB) copied to device for 4K pages.
XE_GT_STATS_ID_SVM_64K_DEVICE_COPY_KB
    Data (KiB) copied to device for 64K pages.
XE_GT_STATS_ID_SVM_2M_DEVICE_COPY_KB
    Data (KiB) copied to device for 2M pages.
XE_GT_STATS_ID_SVM_CPU_COPY_KB
    Data (KiB) copied to CPU across all page sizes.
XE_GT_STATS_ID_SVM_4K_CPU_COPY_KB
    Data (KiB) copied to CPU for 4K pages.
XE_GT_STATS_ID_SVM_64K_CPU_COPY_KB
    Data (KiB) copied to CPU for 64K pages.
XE_GT_STATS_ID_SVM_2M_CPU_COPY_KB
    Data (KiB) copied to CPU for 2M pages.
XE_GT_STATS_ID_SVM_4K_GET_PAGES_US
    Cumulative time (µs) getting CPU memory pages for GPU access at 4K page size.
XE_GT_STATS_ID_SVM_64K_GET_PAGES_US
    Cumulative time (µs) getting CPU memory pages for GPU access at 64K page size.
XE_GT_STATS_ID_SVM_2M_GET_PAGES_US
    Cumulative time (µs) getting CPU memory pages for GPU access at 2M page size.
XE_GT_STATS_ID_SVM_4K_BIND_US
    Cumulative time (µs) binding 4K pages into the GPU page table.
XE_GT_STATS_ID_SVM_64K_BIND_US
    Cumulative time (µs) binding 64K pages into the GPU page table.
XE_GT_STATS_ID_SVM_2M_BIND_US
    Cumulative time (µs) binding 2M pages into the GPU page table.
XE_GT_STATS_ID_HW_ENGINE_GROUP_SUSPEND_LR_QUEUE_COUNT
    Times the scheduler preempted a long-running (LR) GPU exec queue.
XE_GT_STATS_ID_HW_ENGINE_GROUP_SKIP_LR_QUEUE_COUNT
    Times the scheduler skipped suspend because the system was idle.
XE_GT_STATS_ID_HW_ENGINE_GROUP_WAIT_DMA_QUEUE_COUNT
    Times the driver stalled waiting for prior GPU work to complete before scheduling more.
XE_GT_STATS_ID_HW_ENGINE_GROUP_SUSPEND_LR_QUEUE_US
    Cumulative time (µs) spent preempting long-running (LR) GPU exec queues.
XE_GT_STATS_ID_HW_ENGINE_GROUP_WAIT_DMA_QUEUE_US
    Cumulative time (µs) stalled waiting for prior GPU work to complete.
XE_GT_STATS_ID_PRL_4K_ENTRY_COUNT
    4K-page entries from the page reclaim list that were processed.
XE_GT_STATS_ID_PRL_64K_ENTRY_COUNT
    64K-page entries from the page reclaim list that were processed.
XE_GT_STATS_ID_PRL_2M_ENTRY_COUNT
    2M-page entries from the page reclaim list that were processed.
XE_GT_STATS_ID_PRL_ISSUED_COUNT
    Times a page reclamation was issued.
XE_GT_STATS_ID_PRL_ABORTED_COUNT
    Times the page reclaim process was aborted.
__XE_GT_STATS_NUM_IDS
    Number of valid IDs; not a real counter.
Description
See Xe GT Statistics.
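Because the time counters are cumulative microseconds and the volume counters are cumulative KiB, they are most useful when combined with a matching count or time counter. The following userspace sketch shows two such derived metrics. The function names are hypothetical, and pairing a given _US counter with a given _COUNT or _KB counter assumes that both cover the same set of events, which should be checked against the driver source before relying on the result.

```c
#include <stdint.h>

/*
 * Average microseconds per event, e.g. a *_PAGEFAULT_US total divided by
 * the matching *_PAGEFAULT_COUNT. Returns 0.0 when no events were counted.
 */
static double avg_us_per_event(uint64_t total_us, uint64_t count)
{
    return count ? (double)total_us / (double)count : 0.0;
}

/*
 * Copy throughput in MiB/s from a KiB volume counter and a µs time
 * counter, e.g. *_DEVICE_COPY_KB and *_DEVICE_COPY_US.
 * Returns 0.0 when no time was recorded.
 */
static double copy_mib_per_sec(uint64_t kib, uint64_t total_us)
{
    if (!total_us)
        return 0.0;
    /* KiB to MiB: divide by 1024. µs to s: divide by 1e6. */
    return ((double)kib / 1024.0) / ((double)total_us / 1e6);
}
```

Both helpers guard against division by zero, since every counter reads zero right after a reset.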
struct xe_gt_stats
Per-CPU GT statistics counters
Definition:
struct xe_gt_stats {
    u64 counters[__XE_GT_STATS_NUM_IDS];
};
Members
counters
    Array of 64-bit counters indexed by enum xe_gt_stats_id
Description
This structure is used for high-frequency, per-CPU statistics collection in the Xe driver. Allocating it per CPU and aligning it to a cache line avoids costly atomic operations and cache-coherency traffic.
Update these counters with the this_cpu_add() macro, which is atomic
with respect to local interrupts and safe against preemption, without
the overhead of explicit locking.
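The per-CPU layout and the aggregation on read can be pictured with a small userspace model. This is only an illustration under stated assumptions, not the driver's code: the names model_gt_stats, model_stats_add() and model_stats_read() are hypothetical, the CPU count and the 64-byte cache line are assumed values, and the model is single-threaded, whereas kernel code relies on this_cpu_add() for safety against interrupts and preemption.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_CPUS  4   /* hypothetical CPU count */
#define MODEL_NUM_IDS   3   /* stands in for __XE_GT_STATS_NUM_IDS */
#define MODEL_CACHELINE 64  /* assumed cache-line size in bytes */

/* One cache-line-aligned copy of the counters per CPU. */
struct model_gt_stats {
    alignas(MODEL_CACHELINE) uint64_t counters[MODEL_NUM_IDS];
};

static struct model_gt_stats model_stats[MODEL_NUM_CPUS];

/* Writer side: each CPU touches only its own copy of the counters. */
static void model_stats_add(unsigned int cpu, unsigned int id, uint64_t incr)
{
    model_stats[cpu].counters[id] += incr;
}

/* Reader side: the reported value is the sum over every CPU's copy. */
static uint64_t model_stats_read(unsigned int id)
{
    uint64_t sum = 0;

    for (size_t cpu = 0; cpu < MODEL_NUM_CPUS; cpu++)
        sum += model_stats[cpu].counters[id];
    return sum;
}
```

Because each per-CPU copy starts on its own cache line, an update on one CPU does not invalidate the line holding another CPU's counters. The cost moves to the read path, which has to walk every copy.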