Posted 2023-07-14 | Updated 2025-01-30 | Tutorials | 5 minutes read (About 705 words)

NUMA perf

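Introduction

The point of NUMA is to let each process use its local memory for high performance. However, if the local memory of the node a process runs on is exhausted early, the process may fail to use memory on other nodes and fall back to swap instead. (Small examples generally do not hit this.)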

https://blog.51cto.com/quantfabric/2594323

https://www.cnblogs.com/machangwei-8/p/10402644.html

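NUMA memory allocation policies

Default (default): always allocate on the local node (the node the process is currently running on);
Bind (bind): force allocation onto the specified nodes;
Interleave (interleave): spread allocations round-robin across all nodes or a specified set of nodes;
Preferred (preferred): allocate on the specified node, and fall back to other nodes if that fails.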
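
Each of the four policies above corresponds to a numactl option (--membind, --interleave, --preferred; the default policy needs no option). As a minimal sketch, the helper below builds the matching command line; the names POLICY_FLAGS and numactl_command are made up for this illustration, and ./exe is just the placeholder program name used elsewhere in this post.

```python
import shlex

# Policy name -> numactl option template. The default policy needs no option.
POLICY_FLAGS = {
    "default": None,
    "bind": "--membind={nodes}",
    "interleave": "--interleave={nodes}",
    "preferred": "--preferred={nodes}",  # expects a single node, e.g. "1"
}


def numactl_command(policy, program, nodes=None):
    """Return the argv list that runs `program` under the given memory policy.

    `nodes` is a numactl node string such as "0", "0,1" or "all".
    This is an illustrative helper, not part of numactl itself.
    """
    if policy not in POLICY_FLAGS:
        raise ValueError(f"unknown policy: {policy}")
    template = POLICY_FLAGS[policy]
    if template is None:
        return list(program)
    if nodes is None:
        raise ValueError(f"policy {policy!r} needs a node specification")
    return ["numactl", template.format(nodes=nodes), *program]


if __name__ == "__main__":
    for policy, nodes in [("default", None), ("bind", "0"),
                          ("interleave", "all"), ("preferred", "1")]:
        print(shlex.join(numactl_command(policy, ["./exe"], nodes)))
```

Passing the resulting list to subprocess.run() would launch the program under that policy, provided numactl is installed.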

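Because the default NUMA policy prefers the local memory of the CPU the process runs on, memory usage can become unbalanced across nodes. When one node runs out of memory, the system may start swapping instead of allocating from a remote node. This is the so-called "swap insanity" phenomenon.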

$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 0 size: 64076 MB
node 0 free: 23497 MB
node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 1 size: 64503 MB
node 1 free: 37897 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

# shaojiemike @ node5 in ~/github/IPCC2022-preliminary/run on git:main o [10:41:54]
$ numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
cpubind: 0 1
nodebind: 0 1
membind: 0 1
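
To check for the kind of per-node imbalance described earlier, the size/free lines of numactl --hardware can be parsed and turned into a usage fraction per node. The sketch below is only an illustration: the function names are invented here, and SAMPLE is the size/free portion of the output shown above (the cpu and distance lines are left out).

```python
import re

# size/free lines copied from the `numactl --hardware` output in this post.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 size: 64076 MB
node 0 free: 23497 MB
node 1 size: 64503 MB
node 1 free: 37897 MB
"""


def node_memory(text):
    """Parse `numactl --hardware` text into {node: {"size": MB, "free": MB}}."""
    nodes = {}
    pattern = r"^node (\d+) (size|free): (\d+) MB"
    for match in re.finditer(pattern, text, re.MULTILINE):
        node, field, megabytes = int(match.group(1)), match.group(2), int(match.group(3))
        nodes.setdefault(node, {})[field] = megabytes
    return nodes


def used_fraction(nodes):
    """Return the fraction of memory in use on each node."""
    return {node: 1 - info["free"] / info["size"] for node, info in nodes.items()}


if __name__ == "__main__":
    for node, fraction in sorted(used_fraction(node_memory(SAMPLE)).items()):
        print(f"node {node}: {fraction:.1%} used")
```

On a live system the text could come from subprocess.run(["numactl", "--hardware"], capture_output=True, text=True).stdout instead of SAMPLE.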

Common commands

# When a single node runs short of memory
numactl --interleave=all ./exe

# Use local memory (the default)
numactl --localalloc ./exe

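Inspecting a program's NUMA memory usage

On Linux, the following common tools can be used to inspect and analyze a program's NUMA (Non-Uniform Memory Access) behavior:

numastat: shows memory allocation and access statistics per process and per NUMA node.
numactl: shows a process's NUMA policy and allocation settings, and lets you set the policy manually.
numa_maps (/proc/<pid>/numa_maps): shows how a process's memory mappings are distributed across NUMA nodes.
mpstat -P ALL: shows statistics for every CPU core.
pidstat -t: shows per-thread CPU usage of a process, along with the CPU each thread is attached to.
perf stat: counts the cycles a program spends on different CPUs, to check whether the load is balanced.
likwid-perfctr: fine-grained measurement of a program's bandwidth and latency to different memory nodes.
VTune: Intel's performance analysis tool, which can detect NUMA effects.
Code instrumentation: count the program's accesses to memory on different nodes.
numactl --hardware: shows the system's NUMA topology.

Used together, these tools give a full picture of a program's NUMA performance, for example uneven memory distribution or imbalance caused by access patterns, so that it can be optimized in a targeted way.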
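
As an example of the numa_maps approach from the list above: each line of /proc/<pid>/numa_maps describes one mapping and carries fields such as N0=200 N1=100, the number of pages of that mapping on each node. A minimal sketch that sums those fields follows; the function names and the two SAMPLE lines are made up for illustration, not taken from a real process.

```python
import re
from collections import Counter

# Hypothetical numa_maps lines, written by hand to show the field layout.
SAMPLE = """\
00400000 default file=/usr/bin/exe mapped=50 N0=50 kernelpagesize_kB=4
7f1c40000000 default anon=300 dirty=300 N0=200 N1=100 kernelpagesize_kB=4
"""

NODE_FIELD = re.compile(r"N(\d+)=(\d+)")


def pages_per_node(numa_maps_text):
    """Sum the N<node>=<pages> fields over all mappings."""
    totals = Counter()
    for line in numa_maps_text.splitlines():
        # Skip the address and policy columns; the rest are key=value fields.
        for field in line.split()[2:]:
            match = NODE_FIELD.fullmatch(field)
            if match:
                totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)


def read_numa_maps(pid="self"):
    """Read the numa_maps file of a process (Linux with NUMA support only)."""
    with open(f"/proc/{pid}/numa_maps") as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        text = read_numa_maps()
    except OSError:
        text = SAMPLE  # fall back to the hand-written sample
    print(pages_per_node(text))
```

The counts are in pages of each mapping's own page size (the kernelpagesize_kB field), so mappings backed by huge pages would need to be weighted separately.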

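Manually choosing where memory is placed when allocating in C++

libnuma: call the APIs provided by libnuma directly, such as numa_alloc_onnode() and numa_free(), to allocate and free memory on a specific node.
mmap: map the memory yourself (typically combined with mbind() to set the policy for the mapped range).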
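
The libnuma functions are plain C calls, so they can be tried out quickly through Python's ctypes before being written into C++ code. The sketch below assumes libnuma is installed as a shared library; alloc_on_node is a name invented for this example, and the function simply reports None when libnuma or kernel NUMA support is missing.

```python
import ctypes
import ctypes.util


def alloc_on_node(size, node):
    """Allocate `size` bytes on NUMA `node`, touch them, then free them.

    Returns True on success, False if the allocation failed, and None when
    libnuma or kernel NUMA support is not available on this machine.
    """
    path = ctypes.util.find_library("numa")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None

    lib.numa_available.restype = ctypes.c_int
    if lib.numa_available() < 0:
        return None  # libnuma says NUMA calls must not be used here

    # void *numa_alloc_onnode(size_t size, int node);
    lib.numa_alloc_onnode.argtypes = [ctypes.c_size_t, ctypes.c_int]
    lib.numa_alloc_onnode.restype = ctypes.c_void_p
    # void numa_free(void *start, size_t size);
    lib.numa_free.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    lib.numa_free.restype = None

    ptr = lib.numa_alloc_onnode(size, node)
    if not ptr:
        return False
    ctypes.memset(ptr, 0, size)  # physical pages are only allocated on first touch
    lib.numa_free(ptr, size)
    return True


if __name__ == "__main__":
    print(alloc_on_node(4096, 0))
```

In C++ the same two calls are used directly after including numa.h and linking with -lnuma.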

Further study needed

None yet

Problems encountered

None yet

Motivation, summary, reflections, rants~~

References

None

NUMA perf

http://icarus.shaojiemike.top/2023/07/14/Work/software/perf/NUMAperf/

Author

Shaojie Tan

