very high memory usage due to kernfs_node_cache slabs
https://pms.uniontech.com/bug-view-238303.html
Bug environment
uname -a
Linux 0000000g-A8cUta6pu5 4.19.0-arm64-desktop-tyy-5819-ext4-slub-debug-kmemleak
apt policy systemd
systemd:
  Installed: 241.61-deepin1
  Candidate: 241.61-deepin1
  Version table:
 *** 241.61-deepin1 500
        500 http://pools.uniontech.com/ppa/dde-eagle eagle/1070/main arm64 Packages
        100 /var/lib/dpkg/status
Look up the current user's password (recorded by cloud-init) to make switching to root easier:
cat /var/log/cloud-init.log | grep ctyun
Initial analysis
Slab
/proc/meminfo
cat /proc/meminfo | grep Slab: -A2
Slab:             913952 kB
SReclaimable:     175568 kB
SUnreclaim:       738384 kB
slabtop
slabtop -s c -o | head -n20
 Active / Total Objects (% used)    : 924058 / 1039323 (88.9%)
 Active / Total Slabs (% used)      : 43496 / 43496 (100.0%)
 Active / Total Caches (% used)     : 121 / 187 (64.7%)
 Active / Total Size (% used)       : 815879.95K / 908598.60K (89.8%)
 Minimum / Average / Maximum Object : 0.36K / 0.87K / 16.81K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 55251  52846  95%    4.44K   7893        7    252576K names_cache
229119 223856  97%    0.48K   6943       33    111088K kernfs_node_cache
141850 137390  96%    0.62K   5674       25     90784K kmalloc-128
 13776  12659  91%    4.50K   1968        7     62976K kmalloc-4096
104670  87963  84%    0.53K   3489       30     55824K dentry
 63020  52097  82%    0.69K   2740       23     43840K filp
 72090  59483  82%    0.53K   2403       30     38448K vm_area_struct
 21142  19921  94%    1.41K    961       22     30752K ext4_inode_cache
 58555  55044  94%    0.45K   1673       35     26768K buffer_head
 38775  34343  88%    0.62K   1551       25     24816K skbuff_head_cache
 23256  21205  91%    0.94K    684       34     21888K inode_cache
 21248  17845  83%    1.00K    664       32     21248K kmalloc-512
 47502  40866  86%    0.41K   1218       39     19488K anon_vma_chain
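kernfs_node_cache sits at the top and keeps growing. To confirm the growth is monotonic rather than a transient spike, the object count can be sampled periodically; a minimal sketch, assuming the SLUB sysfs interface under /sys/kernel/slab:

while true; do
    # /sys/kernel/slab/<cache>/objects reports the live object count
    printf '%s  %s\n' "$(date +%T)" "$(cat /sys/kernel/slab/kernfs_node_cache/objects)"
    sleep 60
done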
slub_debug
Enable the following slub_debug-related kernel build options:
CONFIG_SLUB=y
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
CONFIG_SLUB_STATS=y
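A quick way to confirm the running kernel was actually built with these options (assuming the distro ships the config under /boot):

grep -E 'CONFIG_SLUB(_DEBUG(_ON)?|_STATS)?=' /boot/config-$(uname -r)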
Add the matching parameters to the kernel command line in GRUB.
Trace kernfs_node_cache allocations and frees:
echo 1 > /sys/kernel/slab/kernfs_node_cache/trace && sleep 60 && echo 0 > /sys/kernel/slab/kernfs_node_cache/trace
/sys/kernel/slab/kernfs_node_cache/group
Check the kernfs_node_cache alloc records in /var/log/kern.log:
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.452633] TRACE kernfs_node_cache alloc 0x00000000a05f4917 inuse=35 fp=0x          (null)
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.454989] CPU: 7 PID: 1 Comm: systemd Tainted: G           O      4.19.0-arm64-desktop-tyy-5819-ext4-slub-debug-kmemleak
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.457735] Hardware name: RDO OpenStack Compute, BIOS 0.0.0 02/06/2015
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.459811] Call trace:
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.460466]  dump_backtrace+0x0/0x190
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.461368]  show_stack+0x14/0x20
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.462291]  dump_stack+0xa8/0xcc
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.463168]  alloc_debug_processing+0x58/0x188
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.464272]  ___slab_alloc.constprop.34+0x31c/0x388
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.465491]  kmem_cache_alloc+0x210/0x278
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.466538]  __kernfs_new_node+0x60/0x1f8
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.467553]  kernfs_new_node+0x24/0x48
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.468508]  kernfs_create_dir_ns+0x30/0x88
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.469584]  cgroup_mkdir+0x2f0/0x4e8
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.470516]  kernfs_iop_mkdir+0x58/0x88
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.471497]  vfs_mkdir+0xfc/0x1c0
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.472328]  do_mkdirat+0xec/0x100
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.473172]  __arm64_sys_mkdirat+0x1c/0x28
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.474210]  el0_svc_common+0x90/0x178
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.475125]  el0_svc_handler+0x9c/0xa8
2024-01-24 12:40:55 0000000g-A8cUta6pu5 kernel: [  201.476049]  el0_svc+0x8/0xc
grep "kernfs_node_cache alloc" kern.log | wc -l
7239
./scripts/faddr2line vmlinux vfs_mkdir+0xfc/0x1c0
vfs_mkdir+0xfc/0x1c0:
vfs_mkdir at fs/namei.c:3820
./scripts/faddr2line vmlinux kernfs_iop_mkdir+0x58/0x88
kernfs_iop_mkdir+0x58/0x88:
kernfs_iop_mkdir at fs/kernfs/dir.c:1120
./scripts/faddr2line vmlinux cgroup_mkdir+0x2f0/0x4e8
cgroup_mkdir+0x2f0/0x4e8:
kernfs_create_dir at include/linux/kernfs.h:507
(inlined by) cgroup_mkdir at kernel/cgroup/cgroup.c:5032
Check the kernfs_node_cache free records in /var/log/kern.log:
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.212900] TRACE kernfs_node_cache free 0x00000000e2ea365c inuse=34 fp=0x00000000a8805aea
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.215538] Object 00000000e2ea365c: 00 00 00 00 01 00 00 80 78 a2 37 a6 03 80 ff ff  ........x.7.....
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.218656] Object 000000003a5b5659: 00 52 45 de 03 80 ff ff 40 92 37 a6 03 80 ff ff  .RE.....@.7.....
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.221710] Object 000000007229340d: 80 b6 37 a6 03 80 ff ff 00 00 00 00 00 00 00 00  ..7.............
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.224824] Object 00000000cc138d0e: 00 00 00 00 00 00 00 00 7e 16 88 71 00 00 00 00  ........~..q....
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.227947] Object 000000001cecdc04: b0 e4 a2 09 00 00 ff ff 00 00 00 00 00 00 00 00  ................
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.231088] Object 00000000c3ac2227: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.234279] Object 00000000737356c8: 98 dc a2 09 00 00 ff ff 14 09 00 00 01 00 00 00  ................
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.237384] Object 00000000ba78c59c: 52 20 a4 81 00 00 00 00 00 00 00 00 00 00 00 00  R ..............
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.240508] CPU: 3 PID: 1 Comm: systemd Tainted: G           O      4.19.0-arm64-desktop-tyy-5819-ext4-slub-debug-kmemleak
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.244280] Hardware name: RDO OpenStack Compute, BIOS 0.0.0 02/06/2015
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.246454] Call trace:
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.247264]  dump_backtrace+0x0/0x190
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.248512]  show_stack+0x14/0x20
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.249584]  dump_stack+0xa8/0xcc
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.250672]  free_debug_processing+0x19c/0x3a0
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.252143]  __slab_free+0x230/0x3f8
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.253321]  kmem_cache_free+0x200/0x220
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.254657]  kernfs_put+0x100/0x238
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.255806]  kernfs_evict_inode+0x2c/0x38
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.257119]  evict+0xc0/0x1c0
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.258067]  iput+0x1c8/0x288
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.259371]  dentry_unlink_inode+0xb0/0xe8
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.260600]  __dentry_kill+0xc4/0x1d0
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.261434]  shrink_dentry_list+0x1ac/0x2c0
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.262413]  shrink_dcache_parent+0x78/0x80
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.263336]  vfs_rmdir+0xf0/0x190
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.264066]  do_rmdir+0x1c0/0x1f8
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.264791]  __arm64_sys_unlinkat+0x4c/0x60
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.265710]  el0_svc_common+0x90/0x178
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.266968]  el0_svc_handler+0x9c/0xa8
2024-01-24 12:40:56 0000000g-A8cUta6pu5 kernel: [  203.267828]  el0_svc+0x8/0xc
grep "kernfs_node_cache free" kern.log | wc -l
5034

7239 allocations against 5034 frees over the same window: the live object count of the cache is steadily climbing.
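Note that slub_debug also exposes per-call-site summaries of live allocations, which avoids scraping kern.log; a sketch, assuming alloc_calls/free_calls are populated (they only are while slub_debug tracking is active):

# Each line starts with an object count, so a reverse numeric sort ranks call sites.
sort -rn /sys/kernel/slab/kernfs_node_cache/alloc_calls | head
sort -rn /sys/kernel/slab/kernfs_node_cache/free_calls | head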
trace-bpfcc
sudo apt install systemd-dbgsym
apt source systemd=241.61-deepin1
trace-bpfcc -tKU 'r::kernfs_new_node "%llx", retval' | tee kernfs_new_node.log
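As a cross-check, the callers can be counted per process with a bpftrace one-liner (bpftrace is also used later in this writeup; Ctrl-C prints the counts):

sudo bpftrace -e 'kprobe:kernfs_new_node { @[comm] = count(); }'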
Judging from the contents of kernfs_new_node.log, kernfs_new_node is called mainly by kworker kernel threads and by systemd.
kworker
3.796820 47     47     kworker/3:1     kernfs_new_node
        kernfs_new_node+0x0 [kernel]
        sysfs_add_file_mode_ns+0x9c [kernel]
        internal_create_group+0x104 [kernel]
        sysfs_create_group+0x14 [kernel]
        sysfs_slab_add+0xb8 [kernel]
        __kmem_cache_create+0x128 [kernel]
        create_cache+0xcc [kernel]
        memcg_create_kmem_cache+0xf8 [kernel]
        memcg_kmem_cache_create_func+0x1c [kernel]
        process_one_work+0x1e8 [kernel]
        worker_thread+0x48 [kernel]
        kthread+0x128 [kernel]
        ret_from_fork+0x10 [kernel]
        -14
In the Linux kernel, error code -14 corresponds to EFAULT ("Bad address"). EFAULT usually means a pointer passed into a call referenced an invalid address that could not be read or written.
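For quick errno lookups, the errno(1) utility from moreutils can be used, assuming it is installed:

errno 14
# EFAULT 14 Bad address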
sysfs_slab_add
Tracing the sysfs_slab_add argument on tyy-1053:
trace-bpfcc -t -Ilinux/slab.h -Ilinux/slub_def.h 'sysfs_slab_add(struct kmem_cache *s) "%llx", s->memcg_params.root_cache'
TIME     PID    TID    COMM            FUNC
96.61296 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f10d0780
96.61350 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f10d0400
96.61372 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f10d3880
96.61390 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f10d2e00
96.61434 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f131b500
96.61457 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f0a3a700
96.61485 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f131b880
96.61504 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff684780
96.61535 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff7d4b00
96.61571 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff7d5580
96.61589 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff7d6700
96.61608 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff7d7500
96.61624 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff7d4780
96.61962 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001f10d2a80
96.61981 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff684400
96.62001 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff687180
96.62299 23371  23371  kworker/2:1     sysfs_slab_add  ffff8001ff687500
We can see that root_cache is not NULL in any of these calls, i.e. the caches being registered are all per-memcg child caches.
ret_from_fork+0x10
./scripts/faddr2line vmlinux ret_from_fork+0x10
ret_from_fork+0x10/0x18:
ret_from_fork at arch/arm64/kernel/entry.S:1063
vim arch/arm64/kernel/entry.S +1063

1058 ENTRY(ret_from_fork)
1059         bl      schedule_tail
1060         cbz     x19, 1f
1061         mov     x0, x20
1062         blr     x19
1063 1:      get_thread_info tsk
1064         b       ret_to_user
1065 ENDPROC(ret_from_fork)
1066 NOKPROBE(ret_from_fork)
kthread+0x128
./scripts/faddr2line vmlinux kthread+0x128
kthread+0x128/0x130:
kthread at kernel/kthread.c:246
// vim kernel/kthread.c +246

205 static int kthread(void *_create)
206 {
207         /* Copy data: it's on kthread's stack */
208         struct kthread_create_info *create = _create;
209         int (*threadfn)(void *data) = create->threadfn;
210         void *data = create->data;
211         struct completion *done;
212         struct kthread *self;
213         int ret;
214
215         self = kzalloc(sizeof(*self), GFP_KERNEL);
216         set_kthread_struct(self);
217
218         /* If user was SIGKILLed, I release the structure. */
219         done = xchg(&create->done, NULL);
220         if (!done) {
221                 kfree(create);
222                 do_exit(-EINTR);
223         }
224
225         if (!self) {
226                 create->result = ERR_PTR(-ENOMEM);
227                 complete(done);
228                 do_exit(-ENOMEM);
229         }
230
231         self->data = data;
232         init_completion(&self->exited);
233         init_completion(&self->parked);
234         current->vfork_done = &self->exited;
235
236         /* OK, tell user we're spawned, wait for stop or wakeup */
237         __set_current_state(TASK_UNINTERRUPTIBLE);
238         create->result = current;
239         complete(done);
240         schedule();
241
242         ret = -EINTR;
243         if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
244                 cgroup_kthread_ready();
245                 __kthread_parkme(self);
246                 ret = threadfn(data);
247         }
248         do_exit(ret);
249 }
worker_thread+0x48
./scripts/faddr2line vmlinux worker_thread+0x48
worker_thread+0x48/0x4c8:
__read_once_size at include/linux/compiler.h:193
(inlined by) list_empty at include/linux/list.h:203
(inlined by) worker_thread at kernel/workqueue.c:2297
// vim kernel/workqueue.c +2297

2236 static int worker_thread(void *__worker)
2237 {
2238         struct worker *worker = __worker;
2239         struct worker_pool *pool = worker->pool;
2240
2241         /* tell the scheduler that this is a workqueue worker */
2242         set_pf_worker(true);
2243 woke_up:
2244         spin_lock_irq(&pool->lock);
2245
2246         /* am I supposed to die? */
2247         if (unlikely(worker->flags & WORKER_DIE)) {
2248                 spin_unlock_irq(&pool->lock);
2249                 WARN_ON_ONCE(!list_empty(&worker->entry));
2250                 set_pf_worker(false);
2251
2252                 set_task_comm(worker->task, "kworker/dying");
2253                 ida_simple_remove(&pool->worker_ida, worker->id);
2254                 worker_detach_from_pool(worker);
2255                 kfree(worker);
2256                 return 0;
2257         }
2258
2259         worker_leave_idle(worker);
2260 recheck:
2261         /* no more worker necessary? */
2262         if (!need_more_worker(pool))
2263                 goto sleep;
2264
2265         /* do we need to manage? */
2266         if (unlikely(!may_start_working(pool)) && manage_workers(worker))
2267                 goto recheck;
2268
2273         WARN_ON_ONCE(!list_empty(&worker->scheduled));
2274
2279         worker_clr_flags(worker, WORKER_PREP | WORKER_REBOUND);
2280
2281         do {
2282                 struct work_struct *work =
2283                         list_first_entry(&pool->worklist,
2284                                          struct work_struct, entry);
2285
2286                 pool->watchdog_ts = jiffies;
2287
2288                 if (likely(!(*work_data_bits(work) & WORK_STRUCT_LINKED))) {
2289                         /* optimization path, not strictly necessary */
2290                         process_one_work(worker, work);
2291                         if (unlikely(!list_empty(&worker->scheduled)))
2292                                 process_scheduled_works(worker);
2293                 } else {
2294                         move_linked_works(work, &worker->scheduled, NULL);
2295                         process_scheduled_works(worker);
2296                 }
2297         } while (keep_working(pool));
2298
2299         worker_set_flags(worker, WORKER_PREP);
2300 sleep:
2305         worker_enter_idle(worker);
2306         __set_current_state(TASK_IDLE);
2307         spin_unlock_irq(&pool->lock);
2308         schedule();
2309         goto woke_up;
2310 }
create_worker
// vim kernel/workqueue.c +1813

1779  * create_worker - create a new workqueue worker
1780  * @pool: pool the new worker will belong to
1781  *
1782  * Create and start a new worker which is attached to @pool.
1783  *
1784  * CONTEXT:
1785  * Might sleep.  Does GFP_KERNEL allocations.
1786  *
1787  * Return:
1788  * Pointer to the newly created worker.
1789  */
1790 static struct worker *create_worker(struct worker_pool *pool)
1791 {
1792         struct worker *worker = NULL;
1793         int id = -1;
1794         char id_buf[16];
1795
1796         /* ID is needed to determine kthread name */
1797         id = ida_simple_get(&pool->worker_ida, 0, 0, GFP_KERNEL);
1798         if (id < 0)
1799                 goto fail;
1800
1801         worker = alloc_worker(pool->node);
1802         if (!worker)
1803                 goto fail;
1804
1805         worker->id = id;
1806
1807         if (pool->cpu >= 0)
1808                 snprintf(id_buf, sizeof(id_buf), "%d:%d%s", pool->cpu, id,
1809                          pool->attrs->nice < 0 ? "H" : "");
1810         else
1811                 snprintf(id_buf, sizeof(id_buf), "u%d:%d", pool->id, id);
1812
1813         worker->task = kthread_create_on_node(worker_thread, worker, pool->node,
1814                                               "kworker/%s", id_buf);
1815         if (IS_ERR(worker->task))
1816                 goto fail;
1817
1818         set_user_nice(worker->task, pool->attrs->nice);
1819         kthread_bind_mask(worker->task, pool->attrs->cpumask);
1820
1821         /* successful, attach the worker to the pool */
1822         worker_attach_to_pool(worker, pool);
1823
1824         /* start the newly created worker */
1825         spin_lock_irq(&pool->lock);
1826         worker->pool->nr_workers++;
1827         worker_enter_idle(worker);
1828         wake_up_process(worker->task);
1829         spin_unlock_irq(&pool->lock);
1830
1831         return worker;
1832
1833 fail:
1834         if (id >= 0)
1835                 ida_simple_remove(&pool->worker_ida, id);
1836         kfree(worker);
1837         return NULL;
1838 }
kthread_create_on_node
369 struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
370                                            void *data, int node,
371                                            const char namefmt[],
372                                            ...)
373 {
374         struct task_struct *task;
375         va_list args;
376
377         va_start(args, namefmt);
378         task = __kthread_create_on_node(threadfn, data, node, namefmt, args);
379         va_end(args);
380
381         return task;
382 }
383 EXPORT_SYMBOL(kthread_create_on_node);
maybe_create_worker
1958 static void maybe_create_worker(struct worker_pool *pool)
1959 __releases(&pool->lock)
1960 __acquires(&pool->lock)
1961 {
1962 restart:
1963         spin_unlock_irq(&pool->lock);
1964
1965         /* if we don't make progress in MAYDAY_INITIAL_TIMEOUT, call for help */
1966         mod_timer(&pool->mayday_timer, jiffies + MAYDAY_INITIAL_TIMEOUT);
1967
1968         while (true) {
1969                 if (create_worker(pool) || !need_to_create_worker(pool))
1970                         break;
1971
1972                 schedule_timeout_interruptible(CREATE_COOLDOWN);
1973
1974                 if (!need_to_create_worker(pool))
1975                         break;
1976         }
1977
1978         del_timer_sync(&pool->mayday_timer);
1979         spin_lock_irq(&pool->lock);
1980
1984         if (need_to_create_worker(pool))
1985                 goto restart;
1986 }
process_one_work+0x1e8
./scripts/faddr2line vmlinux process_one_work+0x1e8
process_one_work+0x1e8/0x438:
arch_static_branch at arch/arm64/include/asm/jump_label.h:20
(inlined by) static_key_false at include/linux/jump_label.h:138
(inlined by) trace_workqueue_execute_end at include/trace/events/workqueue.h:114
(inlined by) process_one_work at kernel/workqueue.c:2158
2042 static void process_one_work(struct worker *worker, struct work_struct *work)
2043 __releases(&pool->lock)
2044 __acquires(&pool->lock)
2045 {
2046         struct pool_workqueue *pwq = get_work_pwq(work);
2047         struct worker_pool *pool = worker->pool;
2048         bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
2049         int work_color;
2050         struct worker *collision;
2051 #ifdef CONFIG_LOCKDEP
2056         struct lockdep_map lockdep_map;
2057
2058         lockdep_copy_map(&lockdep_map, &work->lockdep_map);
2059 #endif
2060
2061         WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
2062                      raw_smp_processor_id() != pool->cpu);
2063
2068         collision = find_worker_executing_work(pool, work);
2069         if (unlikely(collision)) {
2070                 move_linked_works(work, &collision->scheduled, NULL);
2071                 return;
2072         }
2073
2074         /* claim and dequeue */
2075         debug_work_deactivate(work);
2076         hash_add(pool->busy_hash, &worker->hentry, (unsigned long)work);
2077         worker->current_work = work;
2078         worker->current_func = work->func;
2079         worker->current_pwq = pwq;
2080         work_color = get_work_color(work);
2081
2085         strscpy(worker->desc, pwq->wq->name, WORKER_DESC_LEN);
2086
2087         list_del_init(&work->entry);
2088
2093         if (unlikely(cpu_intensive))
2094                 worker_set_flags(worker, WORKER_CPU_INTENSIVE);
2095
2100         if (need_more_worker(pool))
2101                 wake_up_worker(pool);
2102
2107         set_work_pool_and_clear_pending(work, pool->id);
2108
2109         spin_unlock_irq(&pool->lock);
2110
2111         lock_map_acquire(&pwq->wq->lockdep_map);
2112         lock_map_acquire(&lockdep_map);
2127         lockdep_invariant_state(true);
2128         trace_workqueue_execute_start(work);
2129         worker->current_func(work);
2130
2133         trace_workqueue_execute_end(work);
2134         lock_map_release(&lockdep_map);
2135         lock_map_release(&pwq->wq->lockdep_map);
2136
2137         if (unlikely(in_atomic() || lockdep_depth(current) > 0)) {
2138                 pr_err("BUG: workqueue leaked lock or atomic: %s/0x%08x/%d\n"
2139                        "     last function: %pf\n",
2140                        current->comm, preempt_count(), task_pid_nr(current),
2141                        worker->current_func);
2142                 debug_show_held_locks(current);
2143                 dump_stack();
2144         }
2145
2149         cond_resched();
2150
2151         spin_lock_irq(&pool->lock);
2152
2153         /* clear cpu intensive status */
2154         if (unlikely(cpu_intensive))
2155                 worker_clr_flags(worker, WORKER_CPU_INTENSIVE);
2156
2157         /* we're done with it, release */
2158         hash_del(&worker->hentry);
2159         worker->current_work = NULL;
2160         worker->current_func = NULL;
2161         worker->current_pwq = NULL;
2162         pwq_dec_nr_in_flight(pwq, work_color);
2163 }
memcg_kmem_cache_create_func+0x1c
./scripts/faddr2line vmlinux memcg_kmem_cache_create_func+0x1c
memcg_kmem_cache_create_func+0x1c/0x90:
css_put at include/linux/cgroup.h:391
(inlined by) memcg_kmem_cache_create_func at mm/memcontrol.c:2509
2500 static void memcg_kmem_cache_create_func(struct work_struct *w)
2501 {
2502         struct memcg_kmem_cache_create_work *cw =
2503                 container_of(w, struct memcg_kmem_cache_create_work, work);
2504         struct mem_cgroup *memcg = cw->memcg;
2505         struct kmem_cache *cachep = cw->cachep;
2506
2507         memcg_create_kmem_cache(memcg, cachep);
2508
2509         css_put(&memcg->css);
2510         kfree(cw);
2511 }
__memcg_schedule_kmem_cache_create
2517 static void __memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
2518                                                struct kmem_cache *cachep)
2519 {
2520         struct memcg_kmem_cache_create_work *cw;
2521
2522         cw = kmalloc(sizeof(*cw), GFP_NOWAIT | __GFP_NOWARN);
2523         if (!cw)
2524                 return;
2525
2526         css_get(&memcg->css);
2527
2528         cw->memcg = memcg;
2529         cw->cachep = cachep;
2530         INIT_WORK(&cw->work, memcg_kmem_cache_create_func);
2531
2532         queue_work(memcg_kmem_cache_wq, &cw->work);
2533 }
memcg_schedule_kmem_cache_create
2534 static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
2535                                              struct kmem_cache *cachep)
2536 {
2553         current->memcg_kmem_skip_account = 1;
2554         __memcg_schedule_kmem_cache_create(memcg, cachep);
2555         current->memcg_kmem_skip_account = 0;
2556 }
memcg_kmem_get_cache
2571 struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
2572 {
2573         struct mem_cgroup *memcg;
2574         struct kmem_cache *memcg_cachep;
2575         int kmemcg_id;
2576
2577         VM_BUG_ON(!is_root_cache(cachep));
2578
2579         if (memcg_kmem_bypass())
2580                 return cachep;
2581
2582         if (current->memcg_kmem_skip_account)
2583                 return cachep;
2584
2585         memcg = get_mem_cgroup_from_current();
2586         kmemcg_id = READ_ONCE(memcg->kmemcg_id);
2587         if (kmemcg_id < 0)
2588                 goto out;
2589
2590         memcg_cachep = cache_from_memcg_idx(cachep, kmemcg_id);
2591         if (likely(memcg_cachep))
2592                 return memcg_cachep;
2593
2602         memcg_schedule_kmem_cache_create(memcg, cachep);
2603 out:
2604         css_put(&memcg->css);
2605         return cachep;
2606 }
Use BPF to trace which processes call memcg_kmem_get_cache:
trace-bpfcc -t memcg_kmem_get_cache | tee memcg_kmem_get_cache1.log
The output is as follows:
TIME     PID    TID    COMM            FUNC
1.620114 935    935    redis-server    memcg_kmem_get_cache
1.620177 935    935    redis-server    memcg_kmem_get_cache
1.620193 935    935    redis-server    memcg_kmem_get_cache
1.620216 935    935    redis-server    memcg_kmem_get_cache
1.720512 935    935    redis-server    memcg_kmem_get_cache
1.720576 935    935    redis-server    memcg_kmem_get_cache
1.720591 935    935    redis-server    memcg_kmem_get_cache
1.720620 935    935    redis-server    memcg_kmem_get_cache
1.787920 895    954    QDBusConnection memcg_kmem_get_cache
1.787975 895    954    QDBusConnection memcg_kmem_get_cache
1.788464 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788496 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788512 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788527 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788538 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788678 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788694 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788705 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788715 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788737 2975   2975   deepin-system-m memcg_kmem_get_cache
1.788440 895    963    QThread         memcg_kmem_get_cache
1.789300 4405   4405   deepin-terminal memcg_kmem_get_cache
1.789335 4405   4405   deepin-terminal memcg_kmem_get_cache
1.789422 895    965    QThread         memcg_kmem_get_cache
1.789467 895    965    QThread         memcg_kmem_get_cache
1.789586 895    965    QThread         memcg_kmem_get_cache
1.789426 1205   1205   Xorg            memcg_kmem_get_cache
1.789465 1205   1205   Xorg            memcg_kmem_get_cache
1.789734 895    965    QThread         memcg_kmem_get_cache
1.797259 4405   4405   deepin-terminal memcg_kmem_get_cache
1.797301 4405   4405   deepin-terminal memcg_kmem_get_cache
1.797425 1205   1205   Xorg            memcg_kmem_get_cache
1.797450 1205   1205   Xorg            memcg_kmem_get_cache
1.797629 4405   4405   deepin-terminal memcg_kmem_get_cache
1.797657 4405   4405   deepin-terminal memcg_kmem_get_cache
1.797735 1205   1205   Xorg            memcg_kmem_get_cache
1.797751 1205   1205   Xorg            memcg_kmem_get_cache
1.798109 4405   4405   deepin-terminal memcg_kmem_get_cache
1.798136 4405   4405   deepin-terminal memcg_kmem_get_cache
1.803705 895    965    QThread         memcg_kmem_get_cache
1.803754 895    965    QThread         memcg_kmem_get_cache
1.803770 895    965    QThread         memcg_kmem_get_cache
It looks as if almost every process calls memcg_kmem_get_cache: every accounted allocation goes through the per-memcg cache lookup, and on a miss it schedules creation of a new cache for that memory control group.
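To put a number on that impression, funccount-bpfcc from the same bcc toolset can count calls over a fixed window; a sketch, assuming the tool is available:

# Count memcg_kmem_get_cache invocations for 10 seconds, then print the total.
sudo funccount-bpfcc -d 10 memcg_kmem_get_cache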
slab_pre_alloc_hook
414 static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
415                                                      gfp_t flags)
416 {
417         flags &= gfp_allowed_mask;
418
419         fs_reclaim_acquire(flags);
420         fs_reclaim_release(flags);
421
422         might_sleep_if(gfpflags_allow_blocking(flags));
423
424         if (should_failslab(s, flags))
425                 return NULL;
426
427         if (memcg_kmem_enabled() &&
428             ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))
429                 return memcg_kmem_get_cache(s);
430
431         return s;
432 }
There appears to be a recursion here:
slab_pre_alloc_hook
  -> memcg_kmem_get_cache
    -> kmem_cache_alloc
      -> slab_alloc_node
        -> slab_pre_alloc_hook
This recursion is exactly what the comment on memcg_schedule_kmem_cache_create warns about, and it is why that function wraps the scheduling step with current->memcg_kmem_skip_account = 1.
The fix
From dfd4043beff95e15bad4207cc9db9b29940e20d4 Mon Sep 17 00:00:00 2001
From: yuanqiliang <yuanqiliang@uniontech.com>
Date: Tue, 16 Apr 2024 11:12:58 +0800
Subject: [PATCH] mm/memcg: very high memory usage due to kernfs_node_cache slabs

Bug: https://pms.uniontech.com/bug-view-238303.html
Log: use command "slabtop -s c -o", kernfs_node_cache is ranked first
and continues to grow. SUnreclaim is also continuously increasing.
The current memory management code has significant differences from
upstream. This issue does not exist in upstream linux-4.19.y. Although
this commit fixes the problem, it is still recommended to sync with
the upstream code.

Check kernfs_node_cache allocation in /var/log/kern.log: all trace:
 dump_backtrace+0x0/0x190
 show_stack+0x14/0x20
 dump_stack+0xa8/0xcc
 alloc_debug_processing+0x58/0x188
 ___slab_alloc.constprop.34+0x31c/0x388
 kmem_cache_alloc+0x210/0x278
 __kernfs_new_node+0x60/0x1f8
 kernfs_new_node+0x24/0x48
 kernfs_create_dir_ns+0x30/0x88
 cgroup_mkdir+0x2f0/0x4e8
 kernfs_iop_mkdir+0x58/0x88
 vfs_mkdir+0xfc/0x1c0
 do_mkdirat+0xec/0x100
 __arm64_sys_mkdirat+0x1c/0x28
 el0_svc_common+0x90/0x178
 el0_svc_handler+0x9c/0xa8
 el0_svc+0x8/0xc

To trace kernfs_new_node using BPF: trace-bpfcc -tKU kernfs_new_node
3.796820 47     47     kworker/3:1     kernfs_new_node
 kernfs_new_node+0x0 [kernel]
 sysfs_add_file_mode_ns+0x9c [kernel]
 internal_create_group+0x104 [kernel]
 sysfs_create_group+0x14 [kernel]     # upstream 4.19.306 does not get here
 sysfs_slab_add+0xb8 [kernel]         # upstream 4.19.306 does get here
 __kmem_cache_create+0x128 [kernel]
 create_cache+0xcc [kernel]
 memcg_create_kmem_cache+0xf8 [kernel]
 memcg_kmem_cache_create_func+0x1c [kernel]
 process_one_work+0x1e8 [kernel]
 worker_thread+0x48 [kernel]
 kthread+0x128 [kernel]
 ret_from_fork+0x10 [kernel]
 -14

Signed-off-by: yuanqiliang <yuanqiliang@uniontech.com>
Change-Id: Ia13f72f6a0b8e9944fd8dafb6f68c7170081a591
---

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 1519956..93343e5 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -147,6 +147,7 @@
 	if (root_cache) {
 		s->memcg_params.root_cache = root_cache;
+		s->memcg_params.root_cache->memcg_kset = NULL;
 		s->memcg_params.memcg = memcg;
 		INIT_LIST_HEAD(&s->memcg_params.children_node);
 		INIT_LIST_HEAD(&s->memcg_params.kmem_caches_node);
On linux-4.19.y, *(s->memcg_params.root_cache->memcg_kset) is 0, whereas on uos-arm-kernel-1053-tyy, *(s->memcg_params.root_cache->memcg_kset) is a random value.
kmem_caches are managed in a doubly linked list, and a parent kmem_cache's memcg_kset propagates to its children, so the kset of the first non-root node must be initialized to NULL.
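For context on where memcg_kset gets consumed: in upstream 4.19 SLUB, sysfs_slab_add() fetches cache_kset(s), which for a child cache is s->memcg_params.root_cache->memcg_kset. A NULL kset makes the function bail out before creating any sysfs entries, while a garbage pointer sends every child-cache registration into kernfs with a bogus kset. On a kernel built with CONFIG_SLUB_MEMCG_SYSFS_ON, legitimately registered children are visible under the root cache; a hedged check (the directory only exists when the memcg kset was created):

ls /sys/kernel/slab/dentry/cgroup 2>/dev/null | head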
Although this particular problem is now fixed, it feels like uos-arm-kernel-1053-tyy may be carrying a CVE-worthy bug here (an uninitialized pointer being used).
Comparison with upstream after the fix
systemd
TIME     PID    TID    COMM            FUNC
3.781989 1      1      systemd         kernfs_new_node
        kernfs_new_node+0x0 [kernel]
        cgroup_mkdir+0x2f0 [kernel]
        kernfs_iop_mkdir+0x58 [kernel]
        vfs_mkdir+0xfc [kernel]
        do_mkdirat+0xec [kernel]
        __arm64_sys_mkdirat+0x1c [kernel]
        el0_svc_common+0x90 [kernel]
        el0_svc_handler+0x9c [kernel]
        el0_svc+0x8 [kernel]
        mkdir+0x14 [libc-2.28.so]
        [unknown] [libsystemd-shared-241.so (deleted)]
        [unknown] [libsystemd-shared-241.so (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [libsystemd-shared-241.so (deleted)]
        [unknown] [libsystemd-shared-241.so (deleted)]
        [unknown] [libsystemd-shared-241.so (deleted)]
        [unknown] [systemd (deleted)]
        [unknown] [systemd (deleted)]
        __libc_start_main+0xe4 [libc-2.28.so]
        [unknown] [systemd (deleted)]
bpftrace -l | grep cgroup_mkdir
tracepoint:cgroup:cgroup_mkdir
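Since the tracepoint exists, a per-process count is available without collecting full stacks; an alternative sketch:

sudo bpftrace -e 'tracepoint:cgroup:cgroup_mkdir { @[comm] = count(); }'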
trace-bpfcc -tKU cgroup_mkdir | tee cgroup_mkdir.log
vim cgroup_mkdir.log:
TIME     PID    TID    COMM            FUNC
11.08041 1      1      systemd         cgroup_mkdir
        cgroup_mkdir+0x0 [kernel]
        vfs_mkdir+0xfc [kernel]
        do_mkdirat+0xec [kernel]
        __arm64_sys_mkdirat+0x1c [kernel]
        el0_svc_common+0x90 [kernel]
        el0_svc_handler+0x9c [kernel]
        el0_svc+0x8 [kernel]
        [unknown] [libc-2.28.so (deleted)]
        cg_create.localalias.13+0x64 [libsystemd-shared-241.so]
        cg_create_everywhere+0x30 [libsystemd-shared-241.so]
        unit_create_cgroup+0xb4 [systemd]
        unit_realize_cgroup_now.lto_priv.665+0xa8 [systemd]
        unit_realize_cgroup+0x16c [systemd]
        unit_prepare_exec+0x18 [systemd]
        service_spawn.lto_priv.382+0x80 [systemd]
        service_start.lto_priv.56+0x144 [systemd]
        job_perform_on_unit.lto_priv.422+0x6b4 [systemd]
        manager_dispatch_run_queue.lto_priv.589+0x358 [systemd]
        source_dispatch+0x118 [libsystemd-shared-241.so]
        sd_event_dispatch+0x150 [libsystemd-shared-241.so]
        sd_event_run+0x90 [libsystemd-shared-241.so]
        invoke_main_loop+0xff4 [systemd]
        main+0x1660 [systemd]
        [unknown] [libc-2.28.so (deleted)]
        [unknown] [systemd]
11.08056 1      1      systemd         cgroup_mkdir
        cgroup_mkdir+0x0 [kernel]
        vfs_mkdir+0xfc [kernel]
        do_mkdirat+0xec [kernel]
        __arm64_sys_mkdirat+0x1c [kernel]
        el0_svc_common+0x90 [kernel]
        el0_svc_handler+0x9c [kernel]
        el0_svc+0x8 [kernel]
        [unknown] [libc-2.28.so (deleted)]
        cg_create.localalias.13+0x64 [libsystemd-shared-241.so]
        cg_create.localalias.13+0xe8 [libsystemd-shared-241.so]
        cg_create_everywhere+0x30 [libsystemd-shared-241.so]
        unit_create_cgroup+0xb4 [systemd]
        unit_realize_cgroup_now.lto_priv.665+0xa8 [systemd]
        unit_realize_cgroup+0x16c [systemd]
        unit_prepare_exec+0x18 [systemd]
        service_spawn.lto_priv.382+0x80 [systemd]
        service_start.lto_priv.56+0x144 [systemd]
        job_perform_on_unit.lto_priv.422+0x6b4 [systemd]
        manager_dispatch_run_queue.lto_priv.589+0x358 [systemd]
        source_dispatch+0x118 [libsystemd-shared-241.so]
        sd_event_dispatch+0x150 [libsystemd-shared-241.so]
        sd_event_run+0x90 [libsystemd-shared-241.so]
        invoke_main_loop+0xff4 [systemd]
        main+0x1660 [systemd]
        [unknown] [libc-2.28.so (deleted)]
        [unknown] [systemd]
1626 static int unit_create_cgroup(
1627                 Unit *u,
1628                 CGroupMask target_mask,
1629                 CGroupMask enable_mask,
1630                 ManagerState state) {
1631
1632         bool created;
1633         int r;
1634
1635         assert(u);
1636
1637         if (!UNIT_HAS_CGROUP_CONTEXT(u))
1638                 return 0;
1639
1640         /* Figure out our cgroup path */
1641         r = unit_pick_cgroup_path(u);
1642         if (r < 0)
1643                 return r;
1644
1645         /* First, create our own group */
1646         r = cg_create_everywhere(u->manager->cgroup_supported, target_mask, u->cgroup_path);
1647         if (r < 0)
1648                 return log_unit_error_errno(u, r, "Failed to create cgroup %s: %m", u->cgroup_path);
1649         created = r;
systemd log
grep -i failed syslog | grep systemd | wc -l
146225
grep -i failed syslog | grep systemd | grep .service | grep -v "Failed to start" > systemd.log
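A hypothetical helper to rank the noisiest failing units in the extracted systemd.log:

grep -oE '[[:alnum:]@._-]+\.service' systemd.log | sort | uniq -c | sort -rn | head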
deepin-anything-monitor.service
grep deepin-anything-monitor.service systemd.log | wc -l
835
systemctl status deepin-anything-monitor.service
● deepin-anything-monitor.service - Deepin anything service
   Loaded: loaded (/lib/systemd/system/deepin-anything-monitor.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: exit-code) since Thu 2024-01-25 10:24:27 CST; 2s ago
  Process: 19672 ExecStartPre=/usr/sbin/modprobe vfs_monitor (code=exited, status=1/FAILURE)
  Process: 19674 ExecStopPost=/usr/sbin/rmmod vfs_monitor (code=exited, status=1/FAILURE)
logrotate.service
grep logrotate.service systemd.log | wc -l
568
2024-01-01 00:00:12 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Failed with result 'exit-code'.
2024-01-01 01:00:01 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Failed with result 'exit-code'.
2024-01-01 02:00:01 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Failed with result 'exit-code'.
2024-01-01 03:00:21 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Failed with result 'exit-code'.
2024-01-01 04:00:31 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Failed with result 'exit-code'.
systemctl status logrotate.service
● logrotate.service - Rotate log files
   Loaded: loaded (/lib/systemd/system/logrotate.service; static; vendor preset: enabled)
   Active: failed (Result: exit-code) since Thu 2024-01-25 09:00:05 CST; 53min ago
     Docs: man:logrotate(8)
           man:logrotate.conf(5)
  Process: 7067 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE)
 Main PID: 7067 (code=exited, status=1/FAILURE)

1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: clink-agent:8, unexpected text after }
1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: skipping "/var/log/apt/term.log" because parent directory has insecure permissions (It's world writable or writable by group which is not "root") Set "su
1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: skipping "/var/log/apt/history.log" because parent directory has insecure permissions (It's world writable or writable by group which is not "root") Set
1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: skipping "/var/log/mirror/printer/printer.log" because parent directory has insecure permissions (It's world writable or writable by group which is not "
1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: skipping "/var/log/cups/access_log" because parent directory has insecure permissions (It's world writable or writable by group which is not "root") Set
1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: skipping "/var/log/cups/error_log" because parent directory has insecure permissions (It's world writable or writable by group which is not "root") Set "
1月 25 09:00:04 0000000g-A8cUta6pu5 logrotate[7067]: error: skipping "/var/log/mirror/ctyunInstall/ctyunInstall.log" because parent directory has insecure permissions (It's world writable or writable by group whic
1月 25 09:00:05 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE
1月 25 09:00:05 0000000g-A8cUta6pu5 systemd[1]: logrotate.service: Failed with result 'exit-code'.
1月 25 09:00:05 0000000g-A8cUta6pu5 systemd[1]: Failed to start Rotate log files.
avahi-daemon.service
uptime
10:36:02 up 18:29, 1 user, load average: 0.18, 0.27, 0.25
grep dbus-org.freedesktop.Avahi.service systemd.log | wc -l
143146
2024-01-16 05:45:03 0000000g-A8cUta6pu5 dbus-daemon[539]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
2024-01-16 05:45:08 0000000g-A8cUta6pu5 dbus-daemon[539]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
2024-01-16 05:45:13 0000000g-A8cUta6pu5 dbus-daemon[539]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
2024-01-16 05:45:18 0000000g-A8cUta6pu5 dbus-daemon[539]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.Avahi.service': Unit dbus-org.freedesktop.Avahi.service not found.
It retries every 5 seconds; with the system up for only about 18.5 hours this boot, the failed activation has already fired 143146 times.
systemctl cat avahi-daemon.service
[Unit]
Description=Avahi mDNS/DNS-SD Stack
Requires=avahi-daemon.socket

[Service]
Type=dbus
BusName=org.freedesktop.Avahi
ExecStart=/usr/sbin/avahi-daemon -s
ExecReload=/usr/sbin/avahi-daemon -r
NotifyAccess=main

[Install]
WantedBy=multi-user.target
Also=avahi-daemon.socket
Alias=dbus-org.freedesktop.Avahi.service
apt show avahi-daemon
Package: avahi-daemon
Version: 0.7.6-1+dde
Priority: optional
Section: net
Source: avahi
Maintainer: Utopia Maintenance Team <pkg-utopia-maintainers@lists.alioth.debian.org>
Installed-Size: 263 kB
Depends: libavahi-common3 (>= 0.6.16), libavahi-core7 (>= 0.6.24), libc6 (>= 2.27), libcap2 (>= 1:2.10), libdaemon0 (>= 0.14), libdbus-1-3 (>= 1.9.14), libexpat1 (>= 2.0.1), adduser, dbus (>= 0.60), lsb-base (>= 3.0-6), bind9-host | host
Recommends: libnss-mdns (>= 0.11)
Suggests: avahi-autoipd
Homepage: http://avahi.org/
Download-Size: 89.4 kB
APT-Manual-Installed: no
APT-Sources: https://professional-packages.chinauos.com/desktop-professional eagle/main arm64 Packages
Description: Avahi mDNS/DNS-SD daemon
 Avahi is a fully LGPL framework for Multicast DNS Service Discovery. It
 allows programs to publish and discover services and hosts running on a
 local network with no specific configuration. For example you can plug
 into a network and instantly find printers to print to, files to look at
 and people to talk to.
 .
 This package contains the Avahi Daemon which represents your machine on
 the network and allows other applications to publish and resolve
 mDNS/DNS-SD records.
Solution

The Alias=dbus-org.freedesktop.Avahi.service in [Install] only exists on disk as a symlink created when the unit is enabled; once that symlink is missing, D-Bus activation of the alias name fails. Reinstalling the package re-runs the enable step and restores it:
sudo apt reinstall avahi-daemon
sudo systemctl daemon-reload
sudo systemctl restart avahi-daemon
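A sanity check after the reinstall; the alias unit name should resolve again:

systemctl status dbus-org.freedesktop.Avahi.service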
More
patch