内存管理¶
- 内存是怎样一步步被分配出来的?
- [务必收藏] Linux用户空间和内核空间所有15种内存分配方法总结
- linux内存管理笔记(三十)----进程虚拟地址
- [内核内存] 用户态进程虚拟内存管理
- task_struct 结构,mm_struct 结构, vm_area_struct 结构
- linux内核编程之二:vm_area_struct结构体
- 内核页表共享
- 随笔分类 - Linux内存管理
- 聊聊Linux内核
- 从 Linux 内核角度探秘 JDK NIO 文件读写本质
- 聊聊Netty那些事儿之从内核角度看IO模型
- 一步一图带你深入理解 Linux 虚拟内存管理
-
4.4 NUMA 节点中的内存规整与回收
内存可以说是计算机系统中最为宝贵的资源了,再怎么多也不够用,当系统运行时间长了之后,难免会遇到内存紧张的时候,这时候就需要内核将那些不经常使用的内存页面回收起来,或者将那些可以迁移的页面进行内存规整,从而可以腾出连续的物理内存页面供内核分配。
内核会为每个 NUMA 节点分配一个 kswapd 进程用于回收不经常使用的页面,还会为每个 NUMA 节点分配一个 kcompactd 进程用于内存的规整避免内存碎片。
typedef struct pglist_data { ......... // 页面回收进程 struct task_struct *kswapd; wait_queue_head_t kswapd_wait; // 内存规整进程 struct task_struct *kcompactd; wait_queue_head_t kcompactd_wait; .......... } pg_data_t;NUMA 节点描述符 struct pglist_data 结构中的 struct task_struct *kswapd 属性用于指向内核为 NUMA 节点分配的 kswapd 进程。
kswapd_wait 用于 kswapd 进程周期性回收页面时使用到的等待队列。
同理 struct task_struct *kcompactd 用于指向内核为 NUMA 节点分配的 kcompactd 进程。
kcompactd_wait 用于 kcompactd 进程周期性规整内存时使用到的等待队列。
5.3 水位线的计算
通常情况下 WMARK_LOW 的值是 WMARK_MIN 的 1.25 倍,WMARK_HIGH 的值是 WMARK_LOW 的 1.5 倍。而 WMARK_MIN 的数值就是由这个内核参数 min_free_kbytes 来决定的。
5.5 setup_per_zone_wmarks 计算水位线
为了避免内核的直接内存回收 direct reclaim 阻塞进程影响系统的性能,所以我们需要尽量保持内存区域中的剩余内存容量尽量在 WMARK_MIN 水位线之上,但是有一些极端情况,比如突然遇到网络流量增大,需要短时间内申请大量的内存来存放网络请求数据,此时 kswapd 回收内存的速度可能赶不上内存分配的速度,从而造成直接内存回收 direct reclaim,影响系统性能。
在内存分配过程中,剩余内存容量处于 WMARK_MIN 与 WMARK_LOW 水位线之间会唤醒 kswapd 进程来回收内存,直到内存容量恢复到 WMARK_HIGH 水位线之上。
剩余内存容量低于 WMARK_MIN 水位线时就会触发直接内存回收 direct reclaim。
而剩余内存容量高于 WMARK_LOW 水位线又不会唤醒 kswapd 进程,因此 kswapd 进程活动的关键范围在 WMARK_MIN 与 WMARK_LOW 之间,而为了应对这种突发的网络流量暴增,我们需要保证 kswapd 进程活动的范围大一些,这样内核就能够时刻进行内存回收使得剩余内存容量较长时间的保持在 WMARK_HIGH 水位线之上。
这样一来就要求 WMARK_MIN 与 WMARK_LOW 水位线之间的间距不能太小,因为 WMARK_LOW 水位线之上就不会唤醒 kswapd 进程了。
因此内核引入了 /proc/sys/vm/watermark_scale_factor 参数来调节水位线之间的间距。该内核参数默认值为 10,最大值为 3000。
-
4.3 retry
内核也不会直接开始 OOM,而是进入到重试流程,在重试流程开始之前内核需要调用 should_reclaim_retry 判断是否应该进行重试,重试标准:
如果内核已经重试了 MAX_RECLAIM_RETRIES (16) 次仍然失败,则放弃重试执行后续 OOM。
如果内核将所有可选内存区域中的所有可回收页面全部回收之后,仍然无法满足内存的分配,那么放弃重试执行后续 OOM。
如果 should_reclaim_retry = false,后面会进一步判断是否应该进行 direct_compact 的重试。
- (图片清晰版)细节拉满,80 张图带你一步一步推演 slab 内存池的设计与实现
- 从内核源码看 slab 内存池的创建初始化流程
- 深入理解 slab cache 内存分配全链路实现
- 深度解析 slab 内存池回收内存以及销毁全流程
- 深度解读 Linux 内核级通用内存池 —— kmalloc 体系
- 一步一图带你构建 Linux 页表体系 —— 详解虚拟内存如何与物理内存进行映射
- 从内核世界透视 mmap 内存映射的本质(原理篇)
- 从内核世界透视 mmap 内存映射的本质(源码实现篇)
- 一文聊透 Linux 缺页异常的处理 —— 图解 Page Faults
- linux内核-内存统计
- linux内存浅析3-内存的监控
- Linux内存管理中锁使用分析及典型优化案例总结
- 浅析NUMA架构:从硬件拓扑到Linux node管理
分段与分页¶
- Linux进程地址空间和进程的内存分布
- 写给新手的MMU工作原理
- 详细讲解MMU——为什么嵌入式linux没他不行?
- 存储管理 -地址空间与重定位
- 逻辑地址、线性地址、物理地址区别
- Linux内存管理中,代码段和数据段映射成的线性地址重叠,不会冲突吗?
- x86段寄存器和分段机制
- 第九回 | Intel 内存管理两板斧:分段与分页
- 从根儿上理解linux虚拟内存
- Linux内存管理
缺页异常¶
内核空间¶
- Linux用户空间与内核空间(理解高端内存)
- 内核空间:kmalloc vmalloc 用户空间:malloc ptmalloc
- 详解Linux内存管理之vmalloc原理及源码实现
- 万字整理,肝翻Linux内存管理所有知识点
- 五万字|深入理解Linux内存管理
- Linux 实现原理 — 内存分配算法
- CMA技术原理分析
- 面试题:Linux是如何避免内存碎片的
- Linux内存管理:NUMA技术详解(非一致内存访问架构)
- Linux中内存管理详解
PERCPU变量分配器¶
slub¶
overcommit_memory¶
输出的值有以下几种可能:
- 0 – Heuristic overcommit handling. 这是缺省值,它允许overcommit,但过于明目张胆的overcommit会被拒绝,比如malloc一次性申请的内存大小就超过了系统总内存。Heuristic的意思是“试 探式的”,内核利用某种算法猜测你的内存申请是否合理,它认为不合理就会拒绝overcommit。
- 1 – Always overcommit. 允许overcommit,对内存申请来者不拒。内核执行无内存过量使用处理。使用这个设置会增大内存超载的可能性,但也可以增强大量使用内存任务的性能。
- 2 – Don’t overcommit. 禁止overcommit。 内存拒绝等于或者大于总可用 swap 大小以及overcommit_ratio 指定的物理 RAM 比例的内存请求。如果希望减小内存过度使用的风险,这个设置就是最好的。
overcommit_ratio¶
overcommit_ratio、overcommit_kbytes在overcommit_memory=2时才有用,overcommit_kbytes非0时,overcommit_ratio无效。
overcommit_kbytes¶
当前机器overcommit_memory配置¶
cat /proc/sys/vm/overcommit_memory
0
cat /proc/sys/vm/overcommit_ratio
50
cat /proc/sys/vm/overcommit_kbytes
0
单次申请的内存大小不能超过以下值,否则本次申请就会失败。
free memory + free swap + pagecache的大小 + SLAB
Linux对大部分申请内存的请求都回复"yes",以便能跑更多更大的程序。因为申请内存后,并不会马上使用内存。这种技术叫做Overcommit。当linux发现内存不足时,会发生OOM killer(OOM=out-of-memory)。它会选择杀死一些进程(用户态进程,不是内核线程),以便释放内存。
当oom-killer发生时,linux会选择杀死哪些进程?选择进程的函数是oom_badness函数(在mm/oom_kill.c中),该函数会计算每个进程的点数(0~1000)。点数越高,这个进程越有可能被杀死。每个进程的点数跟oom_score_adj有关,而且oom_score_adj可以被设置(-1000最低,1000最高)。
OOM¶
min_free_kbytes¶
当剩余内存容量低于 WMARK_MIN 会触发下方操作:
- 直接内存回收
- 直接内存规整
- 产生 OOM
内核也不会直接开始 OOM,而是进入到重试流程,在重试流程开始之前内核需要调用 should_reclaim_retry 判断是否应该进行重试,重试标准:
如果内核已经重试了 MAX_RECLAIM_RETRIES (16) 次仍然失败,则放弃重试执行后续 OOM。
如果内核将所有可选内存区域中的所有可回收页面全部回收之后,仍然无法满足内存的分配,那么放弃重试执行后续 OOM。
当前free memory等于112.5MB,在 WMARK_MIN 之上,故不会触发 OOM。
- 深入了解Linux OOM Killer:一次可怕的内核事件
- 细说|Linux Out Of Memory机制
- Linux进程被杀掉(OOM killer),查看系统日志
- Linux OOM是怎么回事儿
- 记一次解决OTA死机重启bug,如何分析与解决措施?!
- 99%的人都不知道内存充足情况下也会触发OOM!
用户空间¶
mmap¶
malloc¶
- 「linux」如何实现一个malloc
- glibc malloc源码分析
- glibc内存管理那些事儿
- 腾讯一面:malloc是如何分配内存的,free怎么知道该释放多少内存?
- 比 tcmalloc 快 43%!mimalloc 高性能背后的 5 大核心技术揭秘
Buffer、Cache¶
内存池¶
内存检测¶
内核虚拟空间¶
- 如何诊断SLUB问题【转】
- 内存泄漏定位思路和方法
- [内存泄漏]kmalloc-128 slab内存泄漏定位过程
- Linux内存管理 (22)内存检测技术(slub_debug/kmemleak/kasan)
Linux常见的内存访问错误有:
- 越界访问(out of bounds)
- 访问已经释放的内存(use after free)
- 重复释放
- 内存泄露(memory leak)
- 栈溢出(stack overflow)
不同的工具有不同的侧重点,本章主要从slub_debug、kmemleak、kasan三个工具介绍。
kmemleak侧重于内存泄露问题发现。
slub_debug和kasan有一定的重复,部分slub_debug问题需要借助slabinfo去发现;kasan更快,所有问题独立上报,缺点是需要高版本GCC支持(gcc 4.9.2 or gcc 5.0)。
测试环境准备¶
更新内核版本到Kernel v4.4,然后编译:
git clone https://github.com/arnoldlu/linux.git -b running_kernel_4.4
export ARCH=arm64
export CROSS_COMPILE=aarch64-linux-gnu-
make defconfig
make bzImage -j4 ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
slub_debug¶
关键词:Red Zone、Padding、Object Layout。
Linux内核中,小块内存大量使用slab/slub分配器,slub_debug提供了内存检测小功能。
内存中比较容易出错的地方有:
- 访问已经释放的内存
- 越界访问
- 重复释放内存
关于slub_debug的两篇文章:《图解slub》《SLUB DEBUG原理》
编译支持slub_debug内核¶
首先需要打开General setup -> Enable SLUB debugging support,然后再选择Kernel hacking -> Memory Debugging -> SLUB debugging on by default。
测试环境:slabinfo、slub.ko¶
通过slub.ko模拟内存异常访问,有些可以直接显示,有些需要通过slabinfo -v来查看。
在tools/vm目录下,执行如下命令,生成可执行文件slabinfo。放入_install目录,打包到zImage中。
将编译好的slabinfo放入sbin。
下面三个测试代码:https://github.com/arnoldlu/linux/tree/running_kernel_4.4/test_code/slub_debug
在test_code/slub_debug目录下执行make.sh,将slub.ko/slub2.ko/slub3.ko放入data。
进行测试¶
启动QEMU:
qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -smp 2 -m 2048 -kernel arch/arm64/boot/Image --append "rdinit=/linuxrc console=ttyAMA0 loglevel=8 slub_debug=UFPZ" -nographic
F:在free的时候会执行检查。
Z:表示Red Zone的意思。
P:是Poison的意思。
U:会记录slab的使用者信息,如果打开,会会显示分配释放对象的栈回溯。
在slub_debug打开SLAB_STORE_USER选项后,可以清晰地看到问题点的backtrace。
测试结果¶
内存越界访问包括Redzone overwritten和Object padding overwritten。
重复释放对应Object already free。访问已释放内存为Posion overwritten。
Redzone overwritten¶
执行insmod data/slub.ko,使用slabinfo -v查看结果。
static void create_slub_error(void)
{
buf = kmalloc(32, GFP_KERNEL);
if(buf) {
memset(buf, 0x55, 80);-----------------------------------虽然分配32字节,但是对应分配了64字节。所以设置为80字节访问触发异常。从buf开始的80个字节仍然被初始化成功。
}
}
虽然kmalloc申请了32字节的slab缓冲区,但是内核分配的是kmalloc-64。所以memset 36字节不会报错,将36改成大于64即可。
一个slub Debug输出包括四大部分:
=============================================================================
BUG kmalloc-64 (Tainted: G O ): Redzone overwritten-------------------------------------------------------------1. 问题描述:slab名称-kmalloc-64,什么错误-Redzone overwritten。
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: 0xeddb3640-0xeddb3643. First byte 0x55 instead of 0xcc------------------------------------------------1.1 问题起始和结束地址,这里一共4字节。
INFO: Allocated in 0x55555555 age=1766 cpu=0 pid=771---------------------------------------------------------1.2 slab的分配栈回溯
0x55555555
0xbf002014
do_one_initcall+0x90/0x1d8
do_init_module+0x60/0x38c
load_module+0x1bac/0x1e94
SyS_init_module+0x14c/0x15c
ret_fast_syscall+0x0/0x3c
INFO: Freed in do_one_initcall+0x78/0x1d8 age=1766 cpu=0 pid=771-----------------------------------------1.3 slab的释放栈回溯
do_one_initcall+0x78/0x1d8
do_init_module+0x60/0x38c
load_module+0x1bac/0x1e94
SyS_init_module+0x14c/0x15c
ret_fast_syscall+0x0/0x3c
INFO: Slab 0xefdb5660 objects=16 used=14 fp=0xeddb3700 flags=0x0081-----------------------------------1.4 slab的地址,以及其它信息。
INFO: Object 0xeddb3600 @offset=1536 fp=0x55555555-----------------------------------------------------------1.5 当前Object起始,及相关信息
Bytes b4 eddb35f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ------------2. 问题slab对象内容。2.1 打印问题slab对象内容之前一些字节。
Object eddb3600: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU---------2.2 slab对象内容,全部为0x55。
Object eddb3610: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object eddb3620: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object eddb3630: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Redzone eddb3640: 55 55 55 55 UUUU----------------------------------------------------------------------------------2.3 Redzone内容,问题出在这里。
Padding eddb36e8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ------------2.4 Padding内容,为了对象对齐而补充。
Padding eddb36f8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
CPU: 2 PID: 773 Comm: slabinfo Tainted: G B O 4.4.0+ #93--------------------------------------------------------3. 检查问题点的栈打印,这里是由于slabinfo找出来的。
Hardware name: ARM-Versatile Express
[<c0016588>] (unwind_backtrace) from [<c0013070>] (show_stack+0x10/0x14)
[<c0013070>] (show_stack) from [<c0244130>] (dump_stack+0x78/0x88)
[<c0244130>] (dump_stack) from [<c00e1874>] (check_bytes_and_report+0xd0/0x10c)
[<c00e1874>] (check_bytes_and_report) from [<c00e1a14>] (check_object+0x164/0x234)
[<c00e1a14>] (check_object) from [<c00e29bc>] (validate_slab_slab+0x198/0x1bc)
[<c00e29bc>] (validate_slab_slab) from [<c00e578c>] (validate_store+0xac/0x190)
[<c00e578c>] (validate_store) from [<c0146780>] (kernfs_fop_write+0xb8/0x1b4)
[<c0146780>] (kernfs_fop_write) from [<c00ebfc4>] (__vfs_write+0x1c/0xd8)
[<c00ebfc4>] (__vfs_write) from [<c00ec808>] (vfs_write+0x90/0x170)
[<c00ec808>] (vfs_write) from [<c00ed008>] (SyS_write+0x3c/0x90)
[<c00ed008>] (SyS_write) from [<c000f3c0>] (ret_fast_syscall+0x0/0x3c)
FIX kmalloc-64: Restoring 0xeddb3640-0xeddb3643=0xcc----------------------------------------------------------4. 问题点是如何被解决的,此处恢复4个字节为0xcc。
Object padding overwritten¶
void create_slub_error(void)
{
int i;
buf = kmalloc(32, GFP_KERNEL);
if(buf) {
buf[-1] = 0x55;------------------------------------------------------------------------向左越界访问
kfree(buf);
}
}
执行insmod data/slub4.ko,结果如下。
这里的越界访问和之前有点不一样的是,这里向左越界。覆盖到了Padding区域。
al: slub error test init
=============================================================================
BUG kmalloc-128 (Tainted: G O ): Object padding overwritten------------------------------------------------------覆盖到Padding区域
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: 0xffff80007767e9ff-0xffff80007767e9ff. First byte 0x55 instead of 0x5a
INFO: Allocated in call_usermodehelper_setup+0x44/0xb8 age=1 cpu=1 pid=789
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x1ec/0x260
call_usermodehelper_setup+0x44/0xb8
/ # kobject_uevent_env+0x494/0x500
kobject_uevent+0x10/0x18
load_module+0x18cc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dd9f80 objects=16 used=9 fp=0xffff80007767ea00 flags=0x4081
INFO: Object 0xffff80007767e800 @offset=2048 fp=0xffff80007767ea00
Bytes b4 ffff80007767e7f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff80007767e800: 00 01 00 00 00 00 00 00 08 e8 67 77 00 80 ff ff ..........gw....
Object ffff80007767e810: 08 e8 67 77 00 80 ff ff f8 83 0c 00 00 80 ff ff ..gw............
Object ffff80007767e820: 00 00 00 00 00 00 00 00 00 6e aa 00 00 80 ff ff .........n......
Object ffff80007767e830: 00 23 67 78 00 80 ff ff 18 23 67 78 00 80 ff ff .#gx.....#gx....
Object ffff80007767e840: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff80007767e850: b8 8e 32 00 00 80 ff ff 00 23 67 78 00 80 ff ff ..2......#gx....
Object ffff80007767e860: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff80007767e870: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Redzone ffff80007767e880: cc cc cc cc cc cc cc cc ........
Padding ffff80007767e9c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff80007767e9f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 55 ZZZZZZZZZZZZZZZU
CPU: 0 PID: 790 Comm: mdev Tainted: G B O 4.4.0+ #116
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000196848>] check_bytes_and_report+0xd8/0x118
[<ffff800000196928>] check_object+0xa0/0x240
[<ffff8000001987e0>] free_debug_processing+0x128/0x380
[<ffff80000019a1cc>] __slab_free+0x344/0x4a0
[<ffff80000019ab94>] kfree+0x1ec/0x220
[<ffff8000000c8278>] umh_complete+0x58/0x68
[<ffff8000000c83d8>] call_usermodehelper_exec_async+0x150/0x170
[<ffff800000085c50>] ret_from_fork+0x10/0x40
FIX kmalloc-128: Restoring 0xffff80007767e9ff-0xffff80007767e9ff=0x5a---------------------------------------------------------问题处理是将对应字节恢复为0x5a。
Object already free¶
void create_slub_error(void)
{
buf = kmalloc(32, GFP_KERNEL);
if(buf) {
memset(buf, 0x55, 32);
kfree(buf);
printk("al: Object already freed");
kfree(buf);
}
}
内核中free执行流程如下:
执行insmod data/slub2.ko,结果如下。
重复释放,是对同一个对象连续释放了多次。
al: slub error test init
al: Object already freed
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): Object already free------------------------------------------------------------------在64位系统,32字节的kmalloc变成了kmalloc-128,问题类型是:Object already free,也即重复释放。
-----------------------------------------------------------------------------
INFO: Allocated in create_slub_error+0x20/0x80 [slub2] age=0 cpu=1 pid=791------------------------------------内存分配点栈回溯
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x1ec/0x260
create_slub_error+0x20/0x80 [slub2]
my_test_init+0x14/0x28 [slub2]
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Freed in create_slub_error+0x50/0x80 [slub2] age=0 cpu=1 pid=791------------------------------------------内存释放点栈回溯
free_debug_processing+0x17c/0x380
__slab_free+0x344/0x4a0
kfree+0x1ec/0x220
create_slub_error+0x50/0x80 [slub2]
my_test_init+0x14/0x28 [slub2]
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dda800 objects=16 used=7 fp=0xffff8000776a0800 flags=0x4081
INFO: Object 0xffff8000776a0800 @offset=2048 fp=0xffff8000776a0a00
Bytes b4 ffff8000776a07f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff8000776a0800: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk-----------------内存内容打印,供128字节。
Object ffff8000776a0810: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff8000776a0870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone ffff8000776a0880: bb bb bb bb bb bb bb bb ........
Padding ffff8000776a09c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776a09f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
CPU: 1 PID: 791 Comm: insmod Tainted: G B O 4.4.0+ #116--------------------------------------------------------------此处问题在insmod就发现了,所以检查出问题的进程就是insmod。
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000198954>] free_debug_processing+0x29c/0x380
[<ffff80000019a1cc>] __slab_free+0x344/0x4a0
[<ffff80000019ab94>] kfree+0x1ec/0x220
[<ffff7ffffc008060>] create_slub_error+0x60/0x80 [slub2]
[<ffff7ffffc00a014>] my_test_init+0x14/0x28 [slub2]
[<ffff800000082930>] do_one_initcall+0x90/0x1a0
[<ffff80000014647c>] do_init_module+0x60/0x1cc
[<ffff800000120704>] load_module+0x18dc/0x1d78
[<ffff800000120cf0>] SyS_init_module+0x150/0x178
[<ffff800000085cb0>] el0_svc_naked+0x24/0x28
FIX kmalloc-128: Object at 0xffff8000776a0800 not freed------------------------------------------------------------------处理的结果是,此处slab 对象是没有被释放。
Poison overwritten¶
static void create_slub_error(void)
{
buf = kmalloc(32, GFP_KERNEL);-----------------------此时的buf内容都是0x6B
if(buf) {
kfree(buf);
printk("al: Access after free");
memset(buf, 0x55, 32);-----------------------------虽然被释放,但是memset仍然生效了变成了0x55。
}
}
执行insmod data/slub3.ko ,使用slabinfo -v查看结果。
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): Poison overwritten----------------------------------------------slab名称为kmalloc-64,问题类型是:Poison overwritten,即访问已释放内存。
-----------------------------------------------------------------------------
INFO: 0xffff800077692800-0xffff80007769281f. First byte 0x55 instead of 0x6b
INFO: Allocated in create_slub_error+0x28/0xf0 [slub3] age=1089 cpu=1 pid=793----------分配点的栈回溯
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x1ec/0x260
create_slub_error+0x28/0xf0 [slub3]
0xffff7ffffc00e014
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Freed in create_slub_error+0x80/0xf0 [slub3] age=1089 cpu=1 pid=793--------------释放点的栈回溯
free_debug_processing+0x17c/0x380
__slab_free+0x344/0x4a0
kfree+0x1ec/0x220
create_slub_error+0x80/0xf0 [slub3]
0xffff7ffffc00e014
do_one_initcall+0x90/0x1a0
do_init_module+0x60/0x1cc
load_module+0x18dc/0x1d78
SyS_init_module+0x150/0x178
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2dda480 objects=16 used=16 fp=0x (null) flags=0x4080
INFO: Object 0xffff800077692800 @offset=2048 fp=0xffff800077692400
Bytes b4 ffff8000776927f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Object ffff800077692800: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU--------前32字节仍然被修改成功。
Object ffff800077692810: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 UUUUUUUUUUUUUUUU
Object ffff800077692820: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692830: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692840: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692850: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692860: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object ffff800077692870: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone ffff800077692880: bb bb bb bb bb bb bb bb ........
Padding ffff8000776929c0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929d0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929e0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
Padding ffff8000776929f0: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
CPU: 0 PID: 795 Comm: slabinfo Tainted: G B O 4.4.0+ #116
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff800000089738>] dump_backtrace+0x0/0x108
[<ffff800000089854>] show_stack+0x14/0x20
[<ffff8000003253c4>] dump_stack+0x94/0xd0
[<ffff800000196460>] print_trailer+0x128/0x1b8
[<ffff800000196848>] check_bytes_and_report+0xd8/0x118
[<ffff800000196a54>] check_object+0x1cc/0x240
[<ffff800000197920>] alloc_debug_processing+0x108/0x188
[<ffff800000199670>] ___slab_alloc.constprop.30+0x3f8/0x440
[<ffff8000001996dc>] __slab_alloc.isra.27.constprop.29+0x24/0x38
[<ffff8000001998dc>] kmem_cache_alloc+0x1ec/0x260
[<ffff8000001d42fc>] seq_open+0x34/0x90
[<ffff80000022059c>] kernfs_fop_open+0x194/0x370
[<ffff8000001afb04>] do_dentry_open+0x214/0x318
[<ffff8000001b0dc8>] vfs_open+0x58/0x68
[<ffff8000001bf338>] path_openat+0x460/0xdf0
[<ffff8000001c0ff0>] do_filp_open+0x60/0xe0
[<ffff8000001b117c>] do_sys_open+0x12c/0x218
[<ffff8000001fd53c>] compat_SyS_open+0x1c/0x28
[<ffff800000085cb0>] el0_svc_naked+0x24/0x28
FIX kmalloc-128: Restoring 0xffff800077692800-0xffff80007769281f=0x6b
FIX kmalloc-128: Marking all objects used
SLUB: kmalloc-128 210 slabs counted but counter=211
slabinfo (795) used greatest stack depth: 12976 bytes left
slub-memkeak¶
- slab申请释放和追踪
- 内存泄露调试分析(一)
- 【内存管理】【slab】/sys/kernel/slab/<slab name>/trace解析
- SLAB内存泄露分析实践
- 发生 SLAB 内存泄漏该怎么办
cat free_calls
117 <not-available> age=4294919470 pid=0 cpus=0
56 amdgpu_fence_free+0x30/0x38 [amdgpu] age=169/10203/25074 pid=0-1419 cpus=0-2
vim /var/log/kern.log
2024-01-10 13:39:15 wujing-PC kernel: [ 166.228864] Hardware name: BXC BM6J80/TBD, BIOS KL4.23.TF.N.021.210510.R 05/10/21 15:33:37
2024-01-10 13:39:15 wujing-PC kernel: [ 166.237115] Call trace:
2024-01-10 13:39:15 wujing-PC kernel: [ 166.239550] dump_backtrace+0x0/0x190
2024-01-10 13:39:15 wujing-PC kernel: [ 166.243200] show_stack+0x14/0x20
2024-01-10 13:39:15 wujing-PC kernel: [ 166.246503] dump_stack+0xa8/0xcc
2024-01-10 13:39:15 wujing-PC kernel: [ 166.249806] free_debug_processing+0x19c/0x3a0
2024-01-10 13:39:15 wujing-PC kernel: [ 166.254237] __slab_free+0x230/0x3f8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.257801] kmem_cache_free+0x200/0x220
2024-01-10 13:39:15 wujing-PC kernel: [ 166.261801] amdgpu_fence_free+0x30/0x38 [amdgpu]
2024-01-10 13:39:15 wujing-PC kernel: [ 166.266493] rcu_process_callbacks+0x2d8/0x500
2024-01-10 13:39:15 wujing-PC kernel: [ 166.270925] __do_softirq+0x110/0x2e8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.274576] irq_exit+0x9c/0xb8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.277706] __handle_domain_irq+0x64/0xb8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.281790] gic_handle_irq+0x7c/0x178
2024-01-10 13:39:15 wujing-PC kernel: [ 166.285526] el1_irq+0xb0/0x140
2024-01-10 13:39:15 wujing-PC kernel: [ 166.288656] arch_cpu_idle+0x2c/0x1b0
2024-01-10 13:39:15 wujing-PC kernel: [ 166.292305] do_idle+0x1d4/0x238
2024-01-10 13:39:15 wujing-PC kernel: [ 166.295521] cpu_startup_entry+0x24/0x28
2024-01-10 13:39:15 wujing-PC kernel: [ 166.299432] secondary_start_kernel+0x158/0x168
2024-01-10 13:39:15 wujing-PC kernel: [ 166.304153] TRACE amdgpu_fence free 0x000000004f170f7a inuse=6 fp=0x00000000ef60c3a4
2024-01-10 13:39:15 wujing-PC kernel: [ 166.311886] Object 000000004f170f7a: 00 00 00 00 6b 6b 6b 6b 38 b9 e2 00 00 00 ff ff ....kkkk8.......
2024-01-10 13:39:15 wujing-PC kernel: [ 166.321180] Object 00000000de513b7f: b0 7c d3 7e 20 80 ff ff f8 41 cc 00 00 00 ff ff .|.~ ....A......
2024-01-10 13:39:15 wujing-PC kernel: [ 166.330473] Object 00000000dca91573: 60 aa db 87 20 80 ff ff 60 aa db 87 20 80 ff ff `... ...`... ...
2024-01-10 13:39:15 wujing-PC kernel: [ 166.339767] Object 00000000cc2e4487: 04 4d 68 dc 20 80 ff ff 00 00 00 00 00 00 00 00 .Mh. ...........
2024-01-10 13:39:15 wujing-PC kernel: [ 166.349060] Object 00000000befbbd4b: 89 02 00 00 6b 6b 6b 6b 07 00 00 00 00 00 00 00 ....kkkk........
2024-01-10 13:39:15 wujing-PC kernel: [ 166.358353] Object 00000000a4cce3a9: d2 8a c4 b0 24 00 00 00 00 00 00 00 6b 6b 6b 6b ....$.......kkkk
2024-01-10 13:39:15 wujing-PC kernel: [ 166.367646] Object 00000000e8ae45cf: 98 4c 68 dc 20 80 ff ff .Lh. ...
2024-01-10 13:39:15 wujing-PC kernel: [ 166.376247] CPU: 2 PID: 21 Comm: ksoftirqd/2 Kdump: loaded Tainted: G O 4.19.0-arm64-desktop-tyy-5819-ext4-slub-debug-kmemleak #5819
2024-01-10 13:39:15 wujing-PC kernel: [ 166.389448] Hardware name: BXC BM6J80/TBD, BIOS KL4.23.TF.N.021.210510.R 05/10/21 15:33:37
2024-01-10 13:39:15 wujing-PC kernel: [ 166.397699] Call trace:
2024-01-10 13:39:15 wujing-PC kernel: [ 166.400136] dump_backtrace+0x0/0x190
2024-01-10 13:39:15 wujing-PC kernel: [ 166.403786] show_stack+0x14/0x20
2024-01-10 13:39:15 wujing-PC kernel: [ 166.407089] dump_stack+0xa8/0xcc
2024-01-10 13:39:15 wujing-PC kernel: [ 166.410393] free_debug_processing+0x19c/0x3a0
2024-01-10 13:39:15 wujing-PC kernel: [ 166.414824] __slab_free+0x230/0x3f8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.418387] kmem_cache_free+0x200/0x220
2024-01-10 13:39:15 wujing-PC kernel: [ 166.422410] amdgpu_fence_free+0x30/0x38 [amdgpu]
2024-01-10 13:39:15 wujing-PC kernel: [ 166.427103] rcu_process_callbacks+0x2d8/0x500
2024-01-10 13:39:15 wujing-PC kernel: [ 166.431535] __do_softirq+0x110/0x2e8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.435186] run_ksoftirqd+0x30/0x40
2024-01-10 13:39:15 wujing-PC kernel: [ 166.438751] smpboot_thread_fn+0x160/0x1a8
2024-01-10 13:39:15 wujing-PC kernel: [ 166.442835] kthread+0x128/0x130
2024-01-10 13:39:15 wujing-PC kernel: [ 166.446052] ret_from_fork+0x10/0x18
kmemleak¶
kmemleak是内核提供的一种检测内存泄露工具,启动一个内核线程扫描内存,并打印发现新的未引用对象数量。
支持kmemleak内核选项¶
要使用kmemlieak,需要打开如下内核选项。
Kernel hacking->Memory Debugging->Kernel memory leak detector:
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y---------或者关闭此选项,则不需要在命令行添加kmemleak=on。
构造测试环境¶
同时还需要在内核启动命令行中添加kmemleak=on。
qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -smp 2 -m 2048 -kernel arch/arm64/boot/Image --append "rdinit=/linuxrc console=ttyAMA0 loglevel=8 kmemleak=on" -nographic
测试代码如下:
static char *buf;
void create_kmemleak(void)
{
buf = kmalloc(120, GFP_KERNEL);
buf = vmalloc(4096);
}
进行测试¶
进行kmemleak测试之前,需要写入scan触发扫描操作。
然后通过读kmemlean节点读取相关信息。
- 打开kmemlean扫描功能:
echo scan > /sys/kernel/debug/kmemleak - 加载问题module:
insmod data/kmemleak.ko - 等待问题发现:kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
- 查看kmemleak结果:
cat /sys/kernel/debug/kmemleak
分析测试结果¶
每处泄露,都标出泄露地址和大小;相关进程信息;内存内容dump;栈回溯。
kmemleak会提示内存泄露可疑对象的具体栈调用信息、可疑对象的大小、使用哪个函数分配、二进制打印。
unreferenced object 0xede22dc0 (size 128):-------------------------------------第一处可疑泄露128字节
comm "insmod", pid 765, jiffies 4294941257 (age 104.920s)--------------------相关进程信息
hex dump (first 32 bytes):---------------------------------------------------二进制打印
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
backtrace:-------------------------------------------------------------------栈回溯
[<bf002014>] 0xbf002014
[<c000973c>] do_one_initcall+0x90/0x1d8
[<c00a71f4>] do_init_module+0x60/0x38c
[<c0086898>] load_module+0x1bac/0x1e94
[<c0086ccc>] SyS_init_module+0x14c/0x15c
[<c000f3c0>] ret_fast_syscall+0x0/0x3c
[<ffffffff>] 0xffffffff
unreferenced object 0xf12ba000 (size 4096):
comm "insmod", pid 765, jiffies 4294941257 (age 104.920s)
hex dump (first 32 bytes):
d8 21 00 00 02 18 00 00 e4 21 00 00 02 18 00 00 .!.......!......
46 22 00 00 02 18 00 00 52 22 00 00 02 18 00 00 F"......R"......
backtrace:
[<c00d77c8>] vmalloc+0x2c/0x34
[<bf002014>] 0xbf002014
[<c000973c>] do_one_initcall+0x90/0x1d8
[<c00a71f4>] do_init_module+0x60/0x38c
[<c0086898>] load_module+0x1bac/0x1e94
[<c0086ccc>] SyS_init_module+0x14c/0x15c
[<c000f3c0>] ret_fast_syscall+0x0/0x3c
[<ffffffff>] 0xffffffff
kasan¶
相关文档阅读:《Kasan - Linux 内核的内存检测工具》《KASAN实现原理》。
kasan暂不支持32位ARM,支持ARM64和X86。
kasan是一个动态检查内存错误的工具,可以检查内存越界访问、使用已释放内存、重复释放以及栈溢出。
使能kasan¶
使用kasan,必须打开CONFIG_KASAN。
Kernel hacking->Memory debugging->KASan: runtime memory debugger
CONFIG_SLUB_DEBUG=y # 有些版本要求先开启 SLUB 分配器的调试配置宏 CONFIG_SLUB_DEBUG,才能看到 KASAN 的配置菜单项。最新版本已经不需要了,但是建议开启,因为可以打印更多有用的信息。
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_KASAN=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_TEST_KASAN=m
CONFIG_STACKTRACE=y # 为了更好地缺陷检查和报告
代码分析¶
kasan_report
->kasan_report_error
->print_error_description
->print_address_description
->print_shadow_for_address
测试用及分析¶
kasan提供了一个测试程序test_kacan.c,将其编译成模块,加载到内核。可以模拟很多内存错误场景。
kasan可以检测到越界访问、访问已释放内存、重复释放等类型错误,其中重复释放借助于slub_debug。
越界访问包括slab越界、栈越界、全局变量越界;访问已释放内存use-after-free;重复释放可以被slub_debug识别。
slab-out-of-bounds¶
static noinline void __init kmalloc_oob_right(void)
{
char *ptr;
size_t size = 123;
pr_info("out-of-bounds to right\n");
ptr = kmalloc(size, GFP_KERNEL);
if (!ptr) {
pr_err("Allocation failed\n");
return;
}
ptr[size] = 'x';
kfree(ptr);
}
此种错误类型是对slab的越界访问,包括左侧、右侧、扩大、缩小后越界访问。除了数组赋值,还包括memset、指针访问等等。
al: kasan error test init
kasan test: kmalloc_oob_right out-of-bounds to right
==================================================================
BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa4/0xe0 [kasan] at addr ffff800066539c7b----------------错误类型是slab-out-of-bounds,在kmalloc_oob_right中产生。
Write of size 1 by task insmod/788
=============================================================================
BUG kmalloc-128 (Tainted: G O ): kasan: bad access detected-------------------------------------------------------------------slab非法非法访问
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in kmalloc_oob_right+0x54/0xe0 [kasan] age=0 cpu=1 pid=788--------------------------------------------问题点kmalloc_oob_right的栈回溯
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x220/0x280
kmalloc_oob_right+0x54/0xe0 [kasan]
kmalloc_tests_init+0x18/0x70 [kasan]
do_one_initcall+0x11c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Freed in do_one_initcall+0x10c/0x310 age=0 cpu=1 pid=788
free_debug_processing+0x17c/0x368
__slab_free+0x344/0x4a0
kfree+0x21c/0x250
do_one_initcall+0x10c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2994e00 objects=16 used=2 fp=0xffff800066539e00 flags=0x4080
INFO: Object 0xffff800066539c00 @offset=7168 fp=0xffff800066538200
Bytes b4 ffff800066539bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................------------------------------内存dump
Object ffff800066539c00: 00 82 53 66 00 80 ff ff 74 65 73 74 73 5f 69 6e ..Sf....tests_in
Object ffff800066539c10: 69 74 20 5b 6b 61 73 61 6e 5d 00 00 00 00 00 00 it [kasan]......
Object ffff800066539c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539dd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539de0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108------------------------------------------------------------------打印此log消息的栈回溯
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000318f60>] print_trailer+0xf8/0x160
[<ffff80000031ea8c>] object_err+0x3c/0x50
[<ffff8000003209a0>] kasan_report_error+0x240/0x558
[<ffff800000320e90>] __asan_report_store1_noabort+0x48/0x50
[<ffff7ffffc008324>] kmalloc_oob_right+0xa4/0xe0 [kasan]
[<ffff7ffffc009070>] kmalloc_tests_init+0x18/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff800066539b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff800066539c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03
^
ffff800066539c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
user-after-free¶
user-after-free是释放后使用的意思。
static noinline void __init kmalloc_uaf(void)
{
char *ptr;
size_t size = 10;
pr_info("use-after-free\n");
ptr = kmalloc(size, GFP_KERNEL);
if (!ptr) {
pr_err("Allocation failed\n");
return;
}
kfree(ptr);
*(ptr + 8) = 'x';
}
测试结果如下:
kasan test: kmalloc_uaf use-after-free
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xac/0xe0 [kasan] at addr ffff800066539e08
Write of size 1 by task insmod/788
=============================================================================
BUG kmalloc-128 (Tainted: G B O ): kasan: bad access detected
-----------------------------------------------------------------------------
INFO: Allocated in kmalloc_uaf+0x54/0xe0 [kasan] age=0 cpu=1 pid=788
alloc_debug_processing+0x17c/0x188
___slab_alloc.constprop.30+0x3f8/0x440
__slab_alloc.isra.27.constprop.29+0x24/0x38
kmem_cache_alloc+0x220/0x280
kmalloc_uaf+0x54/0xe0 [kasan]
kmalloc_tests_init+0x48/0x70 [kasan]
do_one_initcall+0x11c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Freed in kmalloc_uaf+0x84/0xe0 [kasan] age=0 cpu=1 pid=788
free_debug_processing+0x17c/0x368
__slab_free+0x344/0x4a0
kfree+0x21c/0x250
kmalloc_uaf+0x84/0xe0 [kasan]
kmalloc_tests_init+0x48/0x70 [kasan]
do_one_initcall+0x11c/0x310
do_init_module+0x1cc/0x588
load_module+0x48cc/0x5dc0
SyS_init_module+0x1a8/0x1e0
el0_svc_naked+0x24/0x28
INFO: Slab 0xffff7bffc2994e00 objects=16 used=1 fp=0xffff800066539e00 flags=0x4080
INFO: Object 0xffff800066539e00 @offset=7680 fp=0xffff800066539800
Bytes b4 ffff800066539df0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e00: 00 98 53 66 00 80 ff ff 00 00 00 00 00 00 00 00 ..Sf............
Object ffff800066539e10: 00 9e 53 66 00 80 ff ff d0 51 12 00 00 80 ff ff ..Sf.....Q......
Object ffff800066539e20: 00 00 00 00 00 00 00 00 e0 14 6d 01 00 80 ff ff ..........m.....
Object ffff800066539e30: 00 69 a3 66 00 80 ff ff 18 69 a3 66 00 80 ff ff .i.f.....i.f....
Object ffff800066539e40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e50: 30 da 73 00 00 80 ff ff 00 69 a3 66 00 80 ff ff 0.s......i.f....
Object ffff800066539e60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff800066539e70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Padding ffff800066539ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000318f60>] print_trailer+0xf8/0x160
[<ffff80000031ea8c>] object_err+0x3c/0x50
[<ffff8000003209a0>] kasan_report_error+0x240/0x558
[<ffff800000320e90>] __asan_report_store1_noabort+0x48/0x50
[<ffff7ffffc00874c>] kmalloc_uaf+0xac/0xe0 [kasan]
[<ffff7ffffc0090a0>] kmalloc_tests_init+0x48/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff800066539d00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff800066539e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff800066539e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff800066539f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
stack-out-of-bounds¶
栈越界访问是函数中数组越界,在实际工程中经常出现,问题难以发现。
static noinline void __init kasan_stack_oob(void)
{
char stack_array[10];
volatile int i = 0;
char *p = &stack_array[ARRAY_SIZE(stack_array) + i];
pr_info("out-of-bounds on stack\n");
*(volatile char *)p;
}
kasan test: kasan_stack_oob out-of-bounds on stack
==================================================================
BUG: KASAN: stack-out-of-bounds in kasan_stack_oob+0xa8/0xf0 [kasan] at addr ffff800066acb95a
Read of size 1 by task insmod/788
page:ffff7bffc29ab2c0 count:0 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: kasan: bad access detected
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000320c90>] kasan_report_error+0x530/0x558
[<ffff800000320d00>] __asan_report_load1_noabort+0x48/0x50
[<ffff7ffffc0080a8>] kasan_stack_oob+0xa8/0xf0 [kasan]
[<ffff7ffffc0090b0>] kmalloc_tests_init+0x58/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff800066acb800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff800066acb880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
>ffff800066acb900: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 02 f4 f4 f3 f3
^
ffff800066acb980: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
ffff800066acba00: f1 f1 00 00 00 00 00 00 00 00 f3 f3 f3 f3 00 00
==================================================================
global-out-of-bounds¶
static char global_array[10];
static noinline void __init kasan_global_oob(void)
{
volatile int i = 3;
char *p = &global_array[ARRAY_SIZE(global_array) + i];
pr_info("out-of-bounds global variable\n");
*(volatile char *)p;
}
测试结果如下:
kasan test: kasan_global_oob out-of-bounds global variable
==================================================================
BUG: KASAN: global-out-of-bounds in kasan_global_oob+0x9c/0xe8 [kasan] at addr ffff7ffffc001c8d
Read of size 1 by task insmod/788
Address belongs to variable global_array+0xd/0xffffffffffffe3f8 [kasan]
CPU: 1 PID: 788 Comm: insmod Tainted: G B O 4.4.0+ #108
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffff80000008e938>] dump_backtrace+0x0/0x270
[<ffff80000008ebbc>] show_stack+0x14/0x20
[<ffff800000735bb0>] dump_stack+0x100/0x188
[<ffff800000320c90>] kasan_report_error+0x530/0x558
[<ffff800000320d00>] __asan_report_load1_noabort+0x48/0x50
[<ffff7ffffc00818c>] kasan_global_oob+0x9c/0xe8 [kasan]
[<ffff7ffffc0090b4>] kmalloc_tests_init+0x5c/0x70 [kasan]
[<ffff80000008309c>] do_one_initcall+0x11c/0x310
[<ffff8000002648c4>] do_init_module+0x1cc/0x588
[<ffff800000206724>] load_module+0x48cc/0x5dc0
[<ffff800000207dc0>] SyS_init_module+0x1a8/0x1e0
[<ffff800000086cb0>] el0_svc_naked+0x24/0x28
Memory state around the buggy address:
ffff7ffffc001b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff7ffffc001c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff7ffffc001c80: 00 02 fa fa fa fa fa fa 00 00 00 00 00 00 00 00
^
ffff7ffffc001d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff7ffffc001d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
小结¶
kmemleak检查内存泄露的独门绝技,让其有一定市场空间。但功能比较单一,专注于内存泄露问题。
对于非ARM64/x86平台,只能使用slub_debug进行内存问题分析;kasan更高效,但也需要更高的内核和GCC版本支持。