pt620k 1031 freezes on desktop after password entry


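After booting into the desktop, the desktop is displayed but the mouse and keyboard do not respond, and the keyboard LEDs stay lit.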

https://pms.uniontech.com/bug-view-227273.html

Preliminary analysis

cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-4.19.0-arm64-desktop-kdump-lockdep root=UUID=bd4b592f-4763-45a0-91cb-a055c8addaa8 ro splash initcall_debug ignore_loglevel console=ttyAMA0,115200 no_console_suspend "dyndbg=file *usb* +p" plymouth.ignore-serial-consoles DEEPIN_GFXMODE= crashkernel=1024M-:512M
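Several of these parameters exist only for debugging: the kernel is a lockdep build, `initcall_debug` and `ignore_loglevel` make the log more verbose, and the quoted `dyndbg` entry enables debug messages for source files matching `*usb*`. As a minimal sketch, the command line can be split into key/value pairs in Python. `shlex` only approximates the kernel's own quoting rules, and `parse_cmdline` is a hypothetical helper name.

```python
import shlex

# Command line copied from the output of `cat /proc/cmdline` above.
CMDLINE = (
    'BOOT_IMAGE=/vmlinuz-4.19.0-arm64-desktop-kdump-lockdep '
    'root=UUID=bd4b592f-4763-45a0-91cb-a055c8addaa8 ro splash initcall_debug '
    'ignore_loglevel console=ttyAMA0,115200 no_console_suspend '
    '"dyndbg=file *usb* +p" plymouth.ignore-serial-consoles DEEPIN_GFXMODE= '
    'crashkernel=1024M-:512M'
)


def parse_cmdline(text):
    """Return {name: value}; parameters without '=' map to True."""
    params = {}
    for token in shlex.split(text):  # keeps the quoted dyndbg entry as one token
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


params = parse_cmdline(CMDLINE)
print(params["dyndbg"])  # the dynamic debug query
```

On the affected machine the string could be read from `/proc/cmdline` instead of being hard-coded.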
# 大坪监狱串口日志-20231228-2.log

Unhandled Exception in EL3.
x30 = 0x000000003fc96bdc
x0 = 0x000000000000000a
x1 = 0x0000000094080000
x2 = 0x0000000000000018
x3 = 0x000000003fc96a44
x4 = 0x0000000000000000
x5 = 0x000000003ff3b5f0
x6 = 0x000000003fc991d9
x7 = 0x000000003ff3b5f0
x8 = 0x0000000000000000
x9 = 0x00008023e5c49000
x10 = 0x0000ffff8150a150
x11 = 0x0000000000000000
x12 = 0x0000000000000000
x13 = 0x0000000000000000
x14 = 0x0000000000000000
x15 = 0x0000000000000000
x16 = 0x0000000000000058
x17 = 0x0000000000000000
x18 = 0xffff8023b6bd4a40
x19 = 0x0000000000000001
x20 = 0x0000000000000025
x21 = 0xffff800000000100
x22 = 0xffff00000a795ff0
x23 = 0x00000000a04003c9
x24 = 0x0000000000000025
x25 = 0xffff8023ef0b82f8
x26 = 0xffff00000908ebb8
x27 = 0xffff80000181ee50
x28 = 0xffff8023b6bd4a40
x29 = 0x000000003fcfd210
scr_el3 = 0x000000000000073d
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080853511
daif = 0x00000000000003c0
mair_el3 = 0x00000000004404ff
spsr_el3 = 0x00000000604003cd
elr_el3 = 0x000000003fc96bec
ttbr0_el3 = 0x000000003fedf800
esr_el3 = 0x0000000002000000
far_el3 = 0xffff7dfdffffff10
spsr_el1 = 0x0000000000000000
elr_el1 = 0x0000000000000000
spsr_abt = 0x0000000000000000
spsr_und = 0x0000000000000000
spsr_irq = 0x0000000000000000
spsr_fiq = 0x0000000000000000
sctlr_el1 = 0x0000000030500980
actlr_el1 = 0x0000000000000080
cpacr_el1 = 0x0000000000300000
csselr_el1 = 0x0000000000000000
sp_el1 = 0x0000000000000000
esr_el1 = 0x0000000000000000
ttbr0_el1 = 0x0000000000000000
ttbr1_el1 = 0x0000000000000000
mair_el1 = 0x0000000000000000
amair_el1 = 0x0000000000000000

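EL3 stands for Exception Level 3, the highest privilege level in the ARM architecture and the level used by ARM TrustZone. Secure monitor firmware typically runs at this level.

The exception may have been caused by a hardware fault, a firmware problem, or a bug in kernel code.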

esr_el3 = 0x0000000002000000

The hexadecimal value 0x0000000002000000 is 0000 0010 0000 0000 0000 0000 0000 0000 in binary (lower 32 bits); only bit 25 is set.

The EC (Exception Class) field occupies bits 31 down to 26 of this value, which gives:

EC = 0b000000

In decimal this is 0. So for esr_el3 = 0x0000000002000000 the EC field is 0. The single set bit is bit 25, the IL (instruction length) field, and the ISS field (bits 24 down to 0) is 0.

EC: 0b000000
Meaning: Unknown reason. Reported for exceptions that no other EC value covers, for example an attempt to execute an instruction encoding that is not allocated.
ISS: ISS encoding for exceptions with an unknown reason
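The field layout of ESR_ELx is fixed by the architecture: EC is bits [31:26], IL is bit 25, and ISS is bits [24:0]. The following minimal Python sketch splits the value from the register dump into these fields. The function name `decode_esr` is only illustrative.

```python
def decode_esr(esr):
    """Split an AArch64 ESR_ELx value into (EC, IL, ISS)."""
    ec = (esr >> 26) & 0x3F   # Exception Class, bits [31:26]
    il = (esr >> 25) & 0x1    # Instruction Length, bit 25
    iss = esr & 0x1FFFFFF     # Instruction Specific Syndrome, bits [24:0]
    return ec, il, iss


ec, il, iss = decode_esr(0x0000000002000000)
print(f"EC={ec:#08b} IL={il} ISS={iss:#x}")
# prints: EC=0b000000 IL=1 ISS=0x0
```

For this dump the only set bit is bit 25 (IL), so EC decodes to 0b000000 and ISS to 0.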
# 大坪监狱串口日志-20231228-4.log

[ 18.324953] Call trace:
[ 18.327387] dump_backtrace+0x0/0x190
[ 18.331034] show_stack+0x14/0x20
[ 18.334336] dump_stack+0xc8/0x104
[ 18.337724] validate_chain.isra.20+0xc20/0xcb0
[ 18.342241] __lock_acquire+0x2e0/0x700
[ 18.346063] lock_acquire+0xe0/0x230
[ 18.349624] __mutex_lock+0x7c/0x840
[ 18.353185] mutex_lock_interruptible_nested+0x1c/0x28
[ 18.358314] FUNC206LXDEV116+0x5c/0xb8 [mwv206]
[ 18.362838] FUNC206HAL282+0x14/0x28 [mwv206]
[ 18.367188] FUNC206HAL398+0x3c/0x470 [mwv206]
[ 18.371623] FUNC206HAL274+0x5c/0x84 [mwv206]
[ 18.375972] FUNC206HAL002.isra.1.constprop.2+0x504/0xc48 [mwv206]
[ 18.382144] FUNC206HAL148+0x348/0x5e0 [mwv206]
[ 18.386666] FUNC206LXDEV064+0x130/0x408 [mwv206]
[ 18.391356] pci_device_probe+0xdc/0x188
[ 18.395265] really_probe+0x23c/0x3c8
[ 18.398913] driver_probe_device+0x64/0x130
[ 18.403081] __driver_attach+0x134/0x160
[ 18.406990] bus_for_each_dev+0x60/0x98
[ 18.410811] driver_attach+0x20/0x28
[ 18.414372] bus_add_driver+0x1a0/0x280
[ 18.418193] driver_register+0x60/0x110
[ 18.422015] __pci_register_driver+0x58/0x68
[ 18.426278] FUNC206LXDEV054+0xb4/0x1000 [mwv206]
[ 18.430967] do_one_initcall+0x70/0x330
[ 18.434789] do_init_module+0x58/0x1d0
[ 18.438523] load_module+0x1308/0x15b8
[ 18.442258] __se_sys_finit_module+0x84/0xc8
[ 18.446513] __arm64_sys_finit_module+0x18/0x20
[ 18.451030] el0_svc_common+0x9c/0x188
[ 18.454764] el0_svc_handler+0x9c/0xa8
[ 18.458499] el0_svc+0x8/0xc
modinfo mwv206

filename: /lib/modules/4.19.0-arm64-desktop-kdump-lockdep/kernel/drivers/gpu/drm/mwv206/mwv206.ko
license: GPL v2
description: JM7200 GPU driver
author: rfshen <jjwgpu@jingjiamicro.com>
alias: pci:v00000731d00007200sv*sd*bc*sc*i*
depends: snd-pcm,snd
intree: Y
name: mwv206
vermagic: 4.19.0-arm64-desktop-kdump-lockdep SMP mod_unload modversions aarch64
parm: disable_vblank_sync:Disable vblank sync, state change occurs when no 3D is running (int)

The lockdep report above points to a deadlock problem in the Jingjia Micro (景嘉微) GPU driver, mwv206.
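To see at a glance which frames belong to the driver, the call trace can be filtered by its module tag. The sketch below assumes the usual `symbol+0xoffset/0xsize [module]` frame format. `frames` and `first_module_frame` are hypothetical helper names, and the sample lines are copied from the trace above.

```python
import re

# A few frames copied from the call trace in the serial log.
TRACE = """\
[ 18.353185] mutex_lock_interruptible_nested+0x1c/0x28
[ 18.358314] FUNC206LXDEV116+0x5c/0xb8 [mwv206]
[ 18.362838] FUNC206HAL282+0x14/0x28 [mwv206]
[ 18.391356] pci_device_probe+0xdc/0x188
"""

FRAME = re.compile(
    r"\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def frames(trace):
    """Yield (symbol, module) per frame; built-in symbols get 'vmlinux'."""
    for line in trace.splitlines():
        m = FRAME.search(line)
        if m:
            yield m.group("sym"), m.group("mod") or "vmlinux"


def first_module_frame(trace):
    """Return the innermost frame that lives in a loadable module, if any."""
    for sym, mod in frames(trace):
        if mod != "vmlinux":
            return sym, mod
    return None


print(first_module_frame(TRACE))
```

In this trace the innermost module frame is `FUNC206LXDEV116` in mwv206, the caller of `mutex_lock_interruptible_nested`, which is where lockdep raises its report.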

# 大坪监狱串口日志-20231228-4.log

[ 40.131272] usb 1-2.3: ep 04: release iso @ 7+8 (0.7+1) [6/0 us] mask 0000
[ 40.197002] usb 1-2.3: ep 04: reserve iso @ 7+8 (0.7+1) [6/0 us] mask 0000
[ 40.239818] usb 1-2.3: ep 81: reserve iso @ 7+8 (0.7+1) [3/0 us] mask 0000
[ 45.283141] usb 1-2.3: ep 81: release iso @ 7+8 (0.7+1) [3/0 us] mask 0000
[ 45.543271] usb 1-2.3: ep 04: release iso @ 7+8 (0.7+1) [6/0 us] mask 0000

The isochronous (iso) resources of USB endpoints 04 and 81 were both released, which leaves the mouse and keyboard unresponsive.
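The reserve/release sequence can be checked mechanically. The sketch below replays the five log lines and keeps the last event seen for each endpoint. The regular expression is an assumption based only on the lines shown here, and `final_state` is a hypothetical helper name.

```python
import re

# Log lines copied from the serial log excerpt above.
LOG = """\
[ 40.131272] usb 1-2.3: ep 04: release iso @ 7+8 (0.7+1) [6/0 us] mask 0000
[ 40.197002] usb 1-2.3: ep 04: reserve iso @ 7+8 (0.7+1) [6/0 us] mask 0000
[ 40.239818] usb 1-2.3: ep 81: reserve iso @ 7+8 (0.7+1) [3/0 us] mask 0000
[ 45.283141] usb 1-2.3: ep 81: release iso @ 7+8 (0.7+1) [3/0 us] mask 0000
[ 45.543271] usb 1-2.3: ep 04: release iso @ 7+8 (0.7+1) [6/0 us] mask 0000
"""

EVENT = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] usb (?P<dev>\S+): ep (?P<ep>[0-9a-f]{2}): "
    r"(?P<op>reserve|release) iso"
)


def final_state(log):
    """Map (device, endpoint) to the last (operation, timestamp) seen."""
    state = {}
    for line in log.splitlines():
        m = EVENT.search(line)
        if m:
            state[(m["dev"], m["ep"])] = (m["op"], float(m["ts"]))
    return state


for (dev, ep), (op, ts) in sorted(final_state(LOG).items()):
    print(f"usb {dev} ep {ep}: last event {op} at {ts:.6f}s")
```

Both endpoints end in the `release` state, at about 45.28 s and 45.54 s, and no later `reserve` appears in the excerpt.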

vim 20240104-2.log +3520

3520 [ 269.934461] CPU: 6 PID: 3940 Comm: lastore-smartmi Kdump: loaded Tainted: G B W 4.19.0-arm64-desktop-kdump-lockdep #3103
3521 [ 269.939758] proc_evict_inode+0x20/0x58
3522 [ 269.939760] evict+0xa0/0x170
3523 [ 269.943406] Hardware name: N/A N/A/Kunpeng Desktop Board D920S10, BIOS 2.0 01/08/2021
3524 [ 269.946706] iput+0x1f0/0x2e0
3525 [ 269.946708] dentry_unlink_inode+0xac/0xe0
3526 [ 269.952004] Call trace:
3527 [ 269.955391] __dentry_kill+0xc0/0x1b0
3528 [ 269.955392] dentry_kill+0x4c/0x1a0
3529 [ 269.958692] dump_backtrace+0x0/0x190
3530 [ 269.958693] show_stack+0x14/0x20
3531 [ 269.963989] dput+0x2b4/0x2f8
3532 [ 269.963991] path_put+0x18/0x30
3533 [ 269.967984] dump_stack+0xc8/0x104
3534 [ 269.967985] bad_page+0x104/0x130
3535 [ 269.972500] vfs_statx+0xa0/0xd8
3536 [ 269.972502] __se_sys_newfstatat+0x24/0x48
3537 [ 269.977797] check_new_page_bad+0x6c/0x90
3538 [ 269.977799] get_page_from_freelist+0x484/0xda8
3539 [ 269.982313] __arm64_sys_newfstatat+0x18/0x20
3540 [ 269.982316] el0_svc_common+0x9c/0x188
3541 [ 269.987090] __alloc_pages_nodemask+0xf4/0x1010
3542 [ 269.987091] do_huge_pmd_anonymous_page+0xf0/0x800
3543 [ 269.992387] el0_svc_handler+0x9c/0xa8
3544 [ 269.992389] el0_svc+0x8/0xc
3545 [ 269.996469] __handle_mm_fault+0x738/0x8a0
3546 [ 269.996471] handle_mm_fault+0x1c4/0x330
3547 [ 270.000379] Code: f9007260 a94153f3 a8c27bfd d65f03c0 (d4210000)
3548 [ 270.009147] do_page_fault+0x194/0x4a8
3549 [ 270.009148] do_translation_fault+0x58/0x60
3550 [ 270.017918] SMP: stopping secondary CPUs
3551 [ 270.020695] do_mem_abort+0x3c/0xd0
3552 [ 270.020696] el0_da+0x20/0x24
3553 [ 270.497131] BUG: Bad page state in process lastore-smartmi pfn:23be207
3554 [ 270.503731] page:ffff7e008ef881c0 count:0 mapcount:0 mapping:0000000000000010 index:0x0
3555 [ 270.511720] flags: 0x4000000000000000()
3556 [ 270.515543] raw: 4000000000000000 0000000000000000 ffff7e008ef881c8 0000000000000010
3557 [ 270.523272] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
3558 [ 270.530999] page dumped because: non-NULL mapping

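A physical memory hardware fault may be leaving pages in an abnormal state (the "Bad page state" report above), which in turn could make the mouse and keyboard unresponsive.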
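The "page dumped because: non-NULL mapping" line can be reproduced from the dumped fields. The sketch below parses the `page:` line and applies the same kind of sanity checks the allocator applies to a page coming off the free list. It is a simplified model, not the kernel's code: the helper names are hypothetical, and the 4 KiB page size used for the physical address is an assumption.

```python
# Lines copied from the "Bad page state" report above.
PAGE_LINE = ("page:ffff7e008ef881c0 count:0 mapcount:0 "
             "mapping:0000000000000010 index:0x0")
PFN = 0x23be207

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def parse_page_line(line):
    """Turn 'key:value' tokens from the page dump into integers."""
    fields = dict(item.split(":", 1) for item in line.split())
    return {
        "count": int(fields["count"]),
        "mapcount": int(fields["mapcount"]),
        "mapping": int(fields["mapping"], 16),
    }


def free_page_problems(page):
    """List the reasons a supposedly free page looks corrupted."""
    problems = []
    if page["mapcount"] != 0:
        problems.append("nonzero mapcount")
    if page["mapping"] != 0:
        problems.append("non-NULL mapping")
    if page["count"] != 0:
        problems.append("nonzero _refcount")
    return problems


page = parse_page_line(PAGE_LINE)
print(free_page_problems(page))
print(hex(PFN << PAGE_SHIFT))  # physical address under the 4 KiB assumption
```

The `mapping` value 0x10 has exactly one bit set. That is compatible with a single-bit memory error, although a stray write from software would produce the same symptom.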

Proposed fixes

Update the Jingjia Micro driver; a newer driver version may fix the deadlock.

Replace the memory modules.

More

Exception Level

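In the ARM architecture, an Exception Level (EL) is a privilege level, numbered from EL0 to EL3. Each level corresponds to a different execution context on the processor. In brief:

EL0 (Exception Level 0): user space

  • EL0 is the lowest privilege level and corresponds to user space.
  • Code running at EL0 is user-mode application code.
  • EL0 does not have full control over system state and can only perform restricted operations.

EL1 (Exception Level 1): kernel space

  • EL1 corresponds to kernel space, also called supervisor mode.
  • Code running at EL1 is operating system kernel code, with privileged control over system state.
  • When a user-space program makes a system call or triggers an exception, the processor switches from EL0 to EL1 to run kernel code.

EL2 (Exception Level 2): hypervisor mode

  • EL2 is part of the virtualization extensions and corresponds to hypervisor (virtual machine monitor) mode.
  • When the system runs in a virtualized environment, EL2 runs the hypervisor code that monitors and manages the virtual machines.

EL3 (Exception Level 3): Secure Monitor mode

  • EL3 corresponds to Secure Monitor mode and is used mainly for the secure execution environment of ARM TrustZone.
  • The Secure Monitor is a special supervisory mode for security tasks, such as handling TrustZone secure events (for example, SMC calls).

In Armv8-A, EL0 and EL1 are mandatory, while EL2 and EL3 are optional, in both the AArch64 and AArch32 execution states. The different levels provide separate privilege domains, giving flexible control over system hardware and software while supporting virtualization and security requirements.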
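These levels can be tied back to the register dump earlier in the post: on AArch64, the low bits of SPSR_EL3 record the exception level and stack pointer that were in use when the exception was taken. A minimal sketch, using the `spsr_el3` value from the dump (`decode_spsr` is a hypothetical helper name):

```python
# SPSR_ELx.M[3:0] encodings for exceptions taken from AArch64 state.
AARCH64_MODES = {
    0b0000: "EL0t",
    0b0100: "EL1t", 0b0101: "EL1h",
    0b1000: "EL2t", 0b1001: "EL2h",
    0b1100: "EL3t", 0b1101: "EL3h",
}


def decode_spsr(spsr):
    """Return (mode, daif_mask) for an AArch64 SPSR_ELx value."""
    if (spsr >> 4) & 1:            # M[4] set: exception came from AArch32
        mode = "AArch32"
    else:
        mode = AARCH64_MODES.get(spsr & 0xF, "reserved")
    daif = (spsr >> 6) & 0xF       # D, A, I, F mask bits [9:6]
    return mode, daif


mode, daif = decode_spsr(0x00000000604003CD)  # spsr_el3 from the dump
print(mode, f"DAIF={daif:#06b}")
# prints: EL3h DAIF=0b1111
```

Read this way, the saved state is EL3h with all of D, A, I and F masked, which suggests the exception was raised while the CPU was already executing at EL3. This is an interpretation of the dump, not something the log states directly.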


https://realwujing.github.io/linux/kernel/bugs/gpu/227273/pt620k 1031 freezes on desktop after password entry/
Author
Wu Jing
Published on
July 26, 2024
License