ctrl + alt + F3  # jump into command line login
su - {user-name}
sudo -s
sudo -i
# If invoked without a user name, su defaults to becoming the superuser
ip a | less  # check ip address
After fjw set up a static IP, this problem no longer occurred.
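Limiting memory use for the current shell user
Crashes are usually caused by running out of memory; when assigning processes, people already take care not to exceed the number of physical cores.
Add a memory limit to zshrc. ulimit -v takes its value in KB, so 25*1024*1024 KB = 25 GB: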
ulimit -v 26214400
When a program started from the current shell exceeds the limit, it prints Memory Error and exits.
Test: read a 200 GB file into memory
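The arithmetic behind that limit, and the same cap applied from inside a Python process, can be sketched as follows. This is a minimal sketch assuming a Linux system where `ulimit -v` maps to the `RLIMIT_AS` resource limit and is given in KiB; the helper names `gib_to_ulimit_kib` and `cap_address_space` are illustrative and not part of the original setup.

```python
import resource


def gib_to_ulimit_kib(gib: int) -> int:
    """Convert GiB to the KiB value that `ulimit -v` expects."""
    return gib * 1024 * 1024


def cap_address_space(gib: int) -> None:
    """Apply the same cap to the current process (assumes Linux RLIMIT_AS)."""
    limit_bytes = gib_to_ulimit_kib(gib) * 1024  # setrlimit takes bytes
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    # Only lower the soft limit; the hard limit is left as it was.
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))


print(gib_to_ulimit_kib(25))  # 26214400, the value passed to `ulimit -v`
```

Calling `cap_address_space(25)` at the start of a script should make an oversized allocation raise `MemoryError` in that script, much like the shell-wide limit does.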
with open("/home/shaojiemike/test/DynamoRIO/OpenBLASRawAssembly/openblas_utest.log", 'r') as f:
    data = f.readlines()
print(len(data))
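For comparison, `readlines()` holds every line in memory at once, which is why this test trips the limit. If only the line count were needed, a streaming loop keeps memory use roughly constant. A minimal sketch; the function name `count_lines` and the temporary demo file are hypothetical.

```python
import os
import tempfile


def count_lines(path: str) -> int:
    """Count lines by iterating the file object, holding one line at a time."""
    count = 0
    with open(path, "r") as f:
        for _ in f:
            count += 1
    return count


# Small self-contained demo on a temporary file (not the 200 GB log).
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("a\nb\nc\n")
    demo_path = tmp.name

print(count_lines(demo_path))  # 3
os.remove(demo_path)
```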
import torch
from torch.nn import Sequential as Seq, Linear, ReLU
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import remove_self_loops, add_self_loops

class SAGEConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super(SAGEConv, self).__init__(aggr='max')  # "Max" aggregation.
        self.lin = torch.nn.Linear(in_channels, out_channels)
        self.act = torch.nn.ReLU()
        self.update_lin = torch.nn.Linear(in_channels + out_channels, in_channels, bias=False)
        self.update_act = torch.nn.ReLU()

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Removes every self-loop in the graph given by edge_index, so that (i,i)∉E for every i ∈ V.
        edge_index, _ = remove_self_loops(edge_index)
        # Adds a self-loop (i,i)∈ E to every node i ∈ V in the graph given by edge_index
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        return self.propagate(edge_index, size=(x.size(0), x.size(0)), x=x)

    def message(self, x_j):
        # x_j has shape [E, in_channels]
        x_j = self.lin(x_j)
        x_j = self.act(x_j)
        return x_j

    def update(self, aggr_out, x):
        # aggr_out has shape [N, out_channels]
        new_embedding = torch.cat([aggr_out, x], dim=1)
        new_embedding = self.update_lin(new_embedding)
        new_embedding = self.update_act(new_embedding)
        return new_embedding
CPU(s): 64 = the number of logical cores = “Thread(s) per core” × “Core(s) per socket” × “Socket(s)” = 1 × 32 × 2
One socket is one physical CPU package (which occupies one socket on the motherboard);
each socket hosts a number of physical cores, and each core can run one or more threads.
In this case, you have two sockets, each containing a 32-core AMD EPYC 7452 CPU, and since Thread(s) per core is 1 (SMT, AMD's counterpart to hyper-threading, is not active here), each core runs just one thread.
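The product above can be checked mechanically. The sketch below parses lscpu-style text and multiplies the three fields; the function name `logical_cpus` is hypothetical, and the sample text only mirrors the values discussed here rather than being captured output.

```python
import re


def logical_cpus(lscpu_text: str) -> int:
    """Multiply threads/core, cores/socket and sockets from lscpu-style text."""
    def field(name: str) -> int:
        match = re.search(rf"^{re.escape(name)}:\s*(\d+)", lscpu_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"missing field: {name}")
        return int(match.group(1))

    return field("Thread(s) per core") * field("Core(s) per socket") * field("Socket(s)")


# Sample text mirroring the values discussed above (not captured output).
sample = """\
CPU(s):              64
Thread(s) per core:  1
Core(s) per socket:  32
Socket(s):           2
"""

print(logical_cpus(sample))  # 64
```

On a real machine the input could come from `subprocess.run(["lscpu"], capture_output=True, text=True).stdout`, assuming `lscpu` is installed and prints English field names.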
syscall: SYSCALL (Fast System Call) and SYSRET (Return From Fast System Call)
nx: Execute Disable # The NX (no-execute) bit is a CPU technology that marks memory regions as holding either processor instructions (code) or data
mmxext: AMD MMX extensions
fxsr_opt: FXSAVE/FXRSTOR optimizations
pdpe1gb: One GB pages (allows hugepagesz=1G)
rdtscp: Read Time-Stamp Counter and Processor ID
lm: Long Mode (x86-64: amd64, also known as Intel 64, i.e. 64-bit capable)
constant_tsc: TSC (Time Stamp Counter) ticks at a constant rate
art: Always-Running Timer
rep_good: rep microcode works well
nopl: The NOPL (0F 1F) instructions # NOPL is a multi-byte "do nothing" operation
nonstop_tsc: TSC does not stop in C states
extd_apicid: has extended APICID (8 bits) (Advanced Programmable Interrupt Controller)
aperfmperf: APERFMPERF # On x86 hardware, APERF and MPERF are MSR registers that can provide feedback on current CPU frequency.
eagerfpu: Non lazy FPU restore
Intel-defined CPU features, CPUID level 0x00000001 (ecx)
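pni: SSE-3 ("Prescott New Instructions", introduced with the Prescott core in 2004)
pclmulqdq: Perform a Carry-Less Multiplication of Quadword instruction (accelerator for GCM)
monitor: Monitor/Mwait support (Intel SSE3 supplements)
ssse3: Supplemental SSE-3
fma: Fused multiply-add
cx16: CMPXCHG16B # double-width compare-and-swap (DWCAS) implemented by instructions such as x86 CMPXCHG16B
sse4_1: SSE-4.1
sse4_2: SSE-4.2
x2apic: x2APIC
movbe: Move Data After Swapping Bytes instruction
popcnt: Return the Count of Number of Bits Set to 1 instruction (Hamming weight, i.e. bit count)
aes/aes-ni: Advanced Encryption Standard (New Instructions)
xsave: Save Processor Extended States; also provides XGETBV, XRSTOR, XSETBV
avx: Advanced Vector Extensions
f16c: 16-bit fp conversions (CVT16)
rdrand: Read Random Number from hardware random number generator instruction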
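To check which of these flags a machine actually advertises, the flags line of /proc/cpuinfo can be parsed. The following is a minimal sketch assuming x86 Linux, where that file contains a "flags" line; the function names `parse_cpu_flags` and `missing_flags` and the sample input are illustrative only.

```python
def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the flags of the first CPU entry in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, wanted: list) -> list:
    """List the wanted flags that the CPU does not advertise."""
    present = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


# Illustrative input; on a real Linux machine read open("/proc/cpuinfo").read().
sample = "processor\t: 0\nflags\t\t: fpu sse4_1 sse4_2 popcnt aes avx fma rdrand\n"

print(missing_flags(sample, ["avx", "fma", "avx512f"]))  # ['avx512f']
```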
More extended AMD flags: CPUID level 0x80000001, ecx