VM core dump analysis (abnormal system shutdown)
1. Analyzing the log in crash
Running the log command shows the message BUG: unable to handle kernel NULL pointer dereference at 0000000000000008, as seen below.
crash> log
(..omitted..)
cal.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; lo
cal.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927
+ BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa019dc6a>] dsa_slim_input+0x7a/0xc90 [dsa_filter]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/online
CPU 0
Modules linked in: gsch(U) redirfs(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc
ipv6 dsa_filter(P)(U) ppdev parport_pc parport microcode xen_netfront sg i2c_piix4
i2c_core ext4 jbd2 mbcache sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix
dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
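(..omitted..)
The log surrounding the BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 message is shown in more detail below. Note that the first call trace (Pid 64616, httpd) belongs to a page allocation failure (order:5, mode:0x20); the Oops itself is reported for Pid 99088 (httpd) further down.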
+ httpd: page allocation failure. order:5, mode:0x20 [478/2761]
+ Pid: 64616, comm: httpd Tainted: P -------------- 2.6.32-504.el6.x86_64 #1
Call Trace:
<IRQ> [<ffffffff8113438a>] ? __alloc_pages_nodemask+0x74a/0x8d0
[<ffffffffa01a3de0>] ? stateful_tcp_filter+0x870/0x11b0 [dsa_filter]
[<ffffffffa01a4765>] ? stateful_process+0x45/0xc0 [dsa_filter]
[<ffffffff81173332>] ? kmem_getpages+0x62/0x170
[<ffffffff81173f4a>] ? fallback_alloc+0x1ba/0x270
[<ffffffff8117399f>] ? cache_grow+0x2cf/0x320
[<ffffffff81173cc9>] ? ____cache_alloc_node+0x99/0x160
[<ffffffff81451ea2>] ? pskb_expand_head+0x62/0x280
[<ffffffff81174a99>] ? __kmalloc+0x189/0x220
[<ffffffff81451ea2>] ? pskb_expand_head+0x62/0x280
[<ffffffff8145278a>] ? __pskb_pull_tail+0x2aa/0x360
[<ffffffffa01e3b89>] ? lin_nf_packet_wrapper.clone.0+0x49/0x3d0 [dsa_filter]
[<ffffffff8149cdd0>] ? ip_finish_output+0x0/0x310
[<ffffffffa012eb4f>] ? xennet_start_xmit+0x5ef/0x7cc [xen_netfront]
[<ffffffffa01e40bc>] ? lin_nf_packet_wrapper_all.clone.1+0x1ac/0x1e0 [dsa_filter]
[<ffffffff8149cdd0>] ? ip_finish_output+0x0/0x310
[<ffffffffa01e4111>] ? lin_nf_packet_wrapper_inet+0x21/0x30 [dsa_filter]
[<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
[<ffffffff8149cdd0>] ? ip_finish_output+0x0/0x310
[<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
[<ffffffff8149cdd0>] ? ip_finish_output+0x0/0x310
[<ffffffff8149d184>] ? ip_output+0xa4/0xc0
[<ffffffffa01a4765>] ? stateful_process+0x45/0xc0 [dsa_filter]
[<ffffffff8149c475>] ? ip_local_out+0x25/0x30
[<ffffffff8149c970>] ? ip_queue_xmit+0x190/0x420
[<ffffffffa0190047>] ? core_pkt_hook+0x267/0x8b0 [dsa_filter]
[<ffffffff814b2024>] ? tcp_transmit_skb+0x4b4/0x8b0
[<ffffffff814b456a>] ? tcp_write_xmit+0x1da/0xa90
[<ffffffff814b5150>] ? __tcp_push_pending_frames+0x30/0xe0
[<ffffffff814ac733>] ? tcp_data_snd_check+0x33/0x100
[<ffffffff814b03c1>] ? tcp_rcv_established+0x391/0x7e0
[<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
[<ffffffffa01e4ad0>] ? lin_pkt_get_frame_header+0x0/0x5d0 [dsa_filter]
[<ffffffffa01e4730>] ? lin_pkt_get_length+0x0/0x20 [dsa_filter]
[<ffffffffa01e4990>] ? lin_pkt_read_start+0x0/0x140 [dsa_filter]
[<ffffffff814b8893>] ? tcp_v4_do_rcv+0x2e3/0x490
[<ffffffff814ba1a2>] ? tcp_v4_rcv+0x522/0x900
[<ffffffffa01e4111>] ? lin_nf_packet_wrapper_inet+0x21/0x30 [dsa_filter]
[<ffffffff81496ded>] ? ip_local_deliver_finish+0xdd/0x2d0
[<ffffffff81497078>] ? ip_local_deliver+0x98/0xa0
[<ffffffff8149653d>] ? ip_rcv_finish+0x12d/0x440
[<ffffffff81496ac5>] ? ip_rcv+0x275/0x350
[<ffffffff8145c88b>] ? __netif_receive_skb+0x4ab/0x750
[<ffffffff81460588>] ? netif_receive_skb+0x58/0x60
[<ffffffffa012da96>] ? xennet_poll+0xba6/0xd50 [xen_netfront]
[<ffffffff81462083>] ? net_rx_action+0x103/0x2f0
[<ffffffff810eaac2>] ? handle_IRQ_event+0x92/0x170
[<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0
[<ffffffff810b034a>] ? tick_program_event+0x2a/0x30
[<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
[<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
[<ffffffff8107d765>] ? irq_exit+0x85/0x90
[<ffffffff813239b5>] ? xen_evtchn_do_upcall+0x35/0x50
[<ffffffff8100c433>] ? xen_hvm_callback_vector+0x13/0x20
<EOI> net.tcp/1 | SYN invalid retransmit in(remote set=1) ; remote.start_seq 3256175445 != tcph seq 2388075115 | tb_tcpv6_conn.c:867
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195753 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195753 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195753 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195815 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195829 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195836 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195855 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195949 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195949 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937195949 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196033 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196122 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196221 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196221 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196472 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196576 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196673 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196673 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196768 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196779 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 937195752 != tcph seq 937196869 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230081 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230081 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230081 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230143 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230157 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230164 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230183 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230277 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230277 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230361 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230451 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230550 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230550 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230801 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501230905 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501231002 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 1501230080 != tcph seq 1501231002 | tb_tcpv6_conn.c:927
(..omitted..)
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824124 | tb_tcpv6_conn.c:927 [95/1981]
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824210 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824308 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824308 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824444 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824444 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824580 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 2699821124 != tcph seq 2699824580 | tb_tcpv6_conn.c:927
net.tcp/1 | SYN invalid retransmit in(remote set=1) ; remote.start_seq 3897748963 != tcph seq 3498325270 | tb_tcpv6_conn.c:867
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915651 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915651 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915651 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915713 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915731 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915738 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915757 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915905 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915905 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927
+ BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
+ IP: [<ffffffffa019dc6a>] dsa_slim_input+0x7a/0xc90 [dsa_filter]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/online
CPU 0
Modules linked in: gsch(U) redirfs(U) nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 dsa_filter(P)(U) ppdev parport_pc parport microcode xen_netfront sg i2c_piix4 i2c_core
ext4 jbd2 mbcache sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 99088, comm: httpd Tainted: P --------------- 2.6.32-504.el6.x86_64 #1 Xen HVM domU
RIP: 0010:[<ffffffffa019dc6a>] [<ffffffffa019dc6a>] dsa_slim_input+0x7a/0xc90 [dsa_filter]
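The line IP: [<ffffffffa019dc6a>] dsa_slim_input+0x7a/0xc90 [dsa_filter] breaks down as follows.
Item | Meaning |
---|---|
IP | Instruction Pointer (address of the faulting instruction) |
Crash function | dsa_slim_input (in the dsa_filter module) |
Crash offset | 0x7a (122 in decimal) bytes from the start of the function |
Function size | 0xc90 (3216 in decimal) bytes |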
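As a minimal sketch of the breakdown in the table above, the following Python snippet splits the symbol+offset/size notation into its parts. The function name parse_oops_location and the regular expression are illustrative assumptions and are not part of crash or the kernel.

```python
import re

# Pattern for the "symbol+0xOFFSET/0xSIZE [module]" notation used in Oops output.
# The module part is optional, because core kernel symbols carry no module name.
LOCATION_RE = re.compile(
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)

def parse_oops_location(text):
    """Return symbol, offset, size and module found in an Oops location string."""
    m = LOCATION_RE.search(text)
    if m is None:
        raise ValueError("no symbol+offset/size found in: %r" % text)
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),   # bytes from the function start
        "size": int(m.group("size"), 16),       # total size of the function
        "module": m.group("module"),            # None for core kernel symbols
    }

line = "IP: [<ffffffffa019dc6a>] dsa_slim_input+0x7a/0xc90 [dsa_filter]"
info = parse_oops_location(line)
print(info["symbol"], info["offset"], info["size"], info["module"])
# dsa_slim_input 122 3216 dsa_filter
```

The decimal offset (122) is the value later passed to dis dsa_slim_input+122.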
Googling dsa_filter turns up the Deep Security 9.0 user / administrator guide.
Disassemble the instruction at offset 122 of the dsa_slim_input function.
crash> dis dsa_slim_input+122
0xffffffffa019dc6a <dsa_slim_input+122>: sub 0x8(%rdi),%r11d
crash>
So the BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 occurred while executing sub 0x8(%rdi),%r11d. This single instruction does not show why the NULL pointer dereference happened, so the entire dsa_slim_input function is disassembled in 1-(1).
1-(1). Disassembling the dsa_slim_input function
crash> dis 0xffffffff8100ad40
0xffffffff8100ad40 <mcount>: retq
Disassemble the dsa_slim_input function.
crash> dis dsa_slim_input
0xffffffffa019dbf0 <dsa_slim_input>: push %rbp
0xffffffffa019dbf1 <dsa_slim_input+1>: mov %rsp,%rbp ----> Sets up the stack frame.
0xffffffffa019dbf4 <dsa_slim_input+4>: push %r15 ----> rbx, rbp and r12-r15 are callee-saved (nonvolatile) in the System V AMD64 ABI used by Linux, so they are saved before being used.
0xffffffffa019dbf6 <dsa_slim_input+6>: push %r14 ----> r14 is a register. The value held in r14 is saved on the stack.
0xffffffffa019dbf8 <dsa_slim_input+8>: push %r13
0xffffffffa019dbfa <dsa_slim_input+10>: push %r12
0xffffffffa019dbfc <dsa_slim_input+12>: push %rbx
0xffffffffa019dbfd <dsa_slim_input+13>: sub $0x58,%rsp ----> 0x58 is 88 in decimal. Reserves 88 bytes on the stack.
0xffffffffa019dc01 <dsa_slim_input+17>: callq 0xffffffff8100ad40 <mcount> ----> mcount consists of a single retq here, so it returns immediately.
0xffffffffa019dc06 <dsa_slim_input+22>: mov %rsi,-0x48(%rbp) ----> Stores the value of rsi at the address rbp - 72 bytes (0x48). In the System V AMD64 calling convention, rdi and rsi carry the first and second function arguments.
0xffffffffa019dc0a <dsa_slim_input+26>: mov 0xa8(%rsi),%rax ----> Loads into rax the value located 168 bytes (0xa8) past the address held in rsi.
0xffffffffa019dc11 <dsa_slim_input+33>: mov %rdi,%r12 ----> Copies the value of rdi into r12.
0xffffffffa019dc14 <dsa_slim_input+36>: mov 0x8(%rax),%r13d ----> r13d is the lower 32 bits of r13.
0xffffffffa019dc18 <dsa_slim_input+40>: mov 0x4(%rax),%ebx
0xffffffffa019dc1b <dsa_slim_input+43>: bswap %r13d ----> bswap reverses the byte order (little endian -> big endian, and vice versa).
0xffffffffa019dc1e <dsa_slim_input+46>: bswap %ebx
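0xffffffffa019dc20 <dsa_slim_input+48>: testb $0x2,0xd(%rax) ----> testb performs a bitwise AND, discards the result and only sets the flags.
+ 0xffffffffa019dc24 <dsa_slim_input+52>: je 0xffffffffa019dc60 ----> Jumps to 0xffffffffa019dc60 when the result of the AND (testb $0x2,0xd(%rax)) is zero, i.e. when bit 0x2 is not set (ZF=1).
(The part shown in red below, the lines prefixed with -, does not need to be interpreted.)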
- 0xffffffffa019dc26 <dsa_slim_input+54>: movzbl 0x54(%rdi),%eax
- 0xffffffffa019dc2a <dsa_slim_input+58>: test $0x1,%al
- 0xffffffffa019dc2c <dsa_slim_input+60>: jne 0xffffffffa019e170
- 0xffffffffa019dc32 <dsa_slim_input+66>: add $0x1,%ebx
- 0xffffffffa019dc35 <dsa_slim_input+69>: or $0x1,%eax
- 0xffffffffa019dc38 <dsa_slim_input+72>: mov %ebx,(%rdi)
- 0xffffffffa019dc3a <dsa_slim_input+74>: mov %ebx,0x4(%rdi)
- 0xffffffffa019dc3d <dsa_slim_input+77>: mov %al,0x54(%rdi)
- 0xffffffffa019dc40 <dsa_slim_input+80>: movl $0x0,-0x4c(%rbp)
- 0xffffffffa019dc47 <dsa_slim_input+87>: mov -0x4c(%rbp),%eax
- 0xffffffffa019dc4a <dsa_slim_input+90>: add $0x58,%rsp
- 0xffffffffa019dc4e <dsa_slim_input+94>: pop %rbx
- 0xffffffffa019dc4f <dsa_slim_input+95>: pop %r12
- 0xffffffffa019dc51 <dsa_slim_input+97>: pop %r13
- 0xffffffffa019dc53 <dsa_slim_input+99>: pop %r14
- 0xffffffffa019dc55 <dsa_slim_input+101>: pop %r15
- 0xffffffffa019dc57 <dsa_slim_input+103>: leaveq
- 0xffffffffa019dc58 <dsa_slim_input+104>: retq
- 0xffffffffa019dc59 <dsa_slim_input+105>: nopl 0x0(%rax)
+ 0xffffffffa019dc60 <dsa_slim_input+112>: mov -0x48(%rbp),%rax ----> Execution jumps here.
0xffffffffa019dc64 <dsa_slim_input+116>: mov %r13d,%r11d
0xffffffffa019dc67 <dsa_slim_input+119>: mov %ebx,%r10d
+ 0xffffffffa019dc6a <dsa_slim_input+122>: sub 0x8(%rdi),%r11d ----> The fault occurs here (unable to handle kernel NULL pointer dereference at 0000000000000008).
(The part shown in red below, the lines prefixed with -, does not need to be interpreted.)
- 0xffffffffa019dc6e <dsa_slim_input+126>: sub (%rdi),%r10d
- 0xffffffffa019dc71 <dsa_slim_input+129>: cmpl $0x2,0x5ee68(%rip) # 0xffffffffa01fcae0
- 0xffffffffa019dc78 <dsa_slim_input+136>: mov 0xa4(%rax),%r14d
- 0xffffffffa019dc7f <dsa_slim_input+143>: mov 0x58(%rdi),%rax
- 0xffffffffa019dc83 <dsa_slim_input+147>: mov 0xc30(%rax),%r15
- 0xffffffffa019dc8a <dsa_slim_input+154>: jg 0xffffffffa019e180
- 0xffffffffa019dc90 <dsa_slim_input+160>: cmpq $0x0,0x8(%r15)
- 0xffffffffa019dc95 <dsa_slim_input+165>: je 0xffffffffa019dd1f
- 0xffffffffa019dc9b <dsa_slim_input+171>: mov 0x10(%r15),%rdx
- 0xffffffffa019dc9f <dsa_slim_input+175>: movzwl 0xda(%rdx),%eax
- 0xffffffffa019dca6 <dsa_slim_input+182>: mov %rax,%rcx
- 0xffffffffa019dca9 <dsa_slim_input+185>: add $0x1,%eax
- 0xffffffffa019dcac <dsa_slim_input+188>: and $0x1f,%ecx
- 0xffffffffa019dcaf <dsa_slim_input+191>: mov %ax,0xda(%rdx)
- 0xffffffffa019dcb6 <dsa_slim_input+198>: mov %rcx,%rax
- 0xffffffffa019dcb9 <dsa_slim_input+201>: shl $0x5,%rax
- 0xffffffffa019dcbd <dsa_slim_input+205>: lea 0x4d0(%rdx,%rax,1),%rsi
- 0xffffffffa019dcc5 <dsa_slim_input+213>: mov 0x10(%r15),%rax
- 0xffffffffa019dcc9 <dsa_slim_input+217>: movzwl 0xd8(%rax),%eax
- 0xffffffffa019dcd0 <dsa_slim_input+224>: mov %ebx,0x10(%rsi)
- 0xffffffffa019dcd3 <dsa_slim_input+227>: shl $0x14,%eax
- 0xffffffffa019dcd6 <dsa_slim_input+230>: or $0x10003,%eax
- 0xffffffffa019dcdb <dsa_slim_input+235>: mov %eax,0xc(%rsi)
- 0xffffffffa019dcde <dsa_slim_input+238>: mov %rcx,%rax
- 0xffffffffa019dce1 <dsa_slim_input+241>: add $0x27,%rcx
- 0xffffffffa019dce5 <dsa_slim_input+245>: shl $0x5,%rax
- 0xffffffffa019dce9 <dsa_slim_input+249>: shl $0x5,%rcx
(..omitted..)
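The effect of the two bswap instructions at +43 and +46 in the listing above can be reproduced outside the kernel. This is a minimal sketch in Python that assumes a 32-bit operand. The helper name bswap32 and the sample value are hypothetical and are not taken from the dump.

```python
import struct

def bswap32(value):
    """Reverse the byte order of a 32-bit unsigned integer, like bswap on a 32-bit register."""
    # Pack the value as little endian, then reinterpret the same four bytes as big endian.
    return struct.unpack(">I", struct.pack("<I", value & 0xFFFFFFFF))[0]

sample = 0x12345678            # hypothetical value, not read from the core dump
swapped = bswap32(sample)
print(hex(swapped))            # 0x78563412
print(hex(bswap32(swapped)))   # 0x12345678 (applying bswap twice restores the value)
```

On a little endian CPU such as x86_64, this is the conversion needed to read a 32-bit field that was stored in big endian (network) byte order.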
1-(2). Conclusion
The abnormal shutdown of K's VM at 20:18 on June 16, 2016 is presumed to have been caused by a NULL pointer dereference in the dsa_slim_input function of Deep Security, Trend Micro's integrated server security solution.
crash> dis dsa_slim_input+122
0xffffffffa019dc6a <dsa_slim_input+122>: sub 0x8(%rdi),%r11d
crash>
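The instruction sub 0x8(%rdi),%r11d means: lower 32 bits of r11 -= the data stored at the address 8 bytes past the address held in rdi.
Item | Operand | Note |
---|---|---|
source | 0x8(%rdi) | memory operand (pointer dereference) |
destination | %r11d | lower 32 bits of r11 |
That is, the data at address rdi + 8 is subtracted from the lower 32 bits of r11 (%r11d), and the result is stored back in the lower 32 bits of r11.
However, the reported address is 0x8, which means rdi itself held 0 (NULL), so 0x8(%rdi) resolved to address 0x8. That address is not mapped, hence the message BUG: unable to handle kernel NULL pointer dereference at 0000000000000008.
0000000000000008 is the faulting address: the value of rdi (0) plus the offset 8 (0x8).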
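The relationship between the address in the BUG: line and the value of rdi can be checked with simple arithmetic. The sketch below assumes the AT&T operand form disp(%base) with no index register, which is the form seen in sub 0x8(%rdi),%r11d. The function names are hypothetical helpers.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def effective_address(base, disp):
    """Address accessed by an AT&T operand of the form disp(%base)."""
    return (base + disp) & MASK64

def base_from_fault(fault_addr, disp):
    """Recover the base register value from the faulting address."""
    return (fault_addr - disp) & MASK64

def sub32(dest, src):
    """Result a 32-bit sub would store: dest - src, wrapped to 32 bits."""
    return (dest - src) & MASK32

fault_addr = 0x0000000000000008   # from the BUG: line
disp = 0x8                        # displacement in sub 0x8(%rdi),%r11d
rdi = base_from_fault(fault_addr, disp)
print(hex(rdi))                   # 0x0
print(hex(effective_address(rdi, disp)))   # 0x8
```

sub32 is included only to illustrate what the instruction would have computed had the memory read succeeded. In the crash, the read of address 0x8 faults before any subtraction takes place.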
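1) Root cause analysis
Because of another function or some external factor, the pointer held in rdi (the first argument of dsa_slim_input) appears to have been NULL. That is, when execution reached sub 0x8(%rdi),%r11d, rdi was 0 and 0x8(%rdi) pointed at address 0x8.
Since the memory read required by the sub operation could not be performed, the message BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 was printed.
Following the execution flow up to the BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 message, no problem occurred when the value of rdi was copied into r12 (0xffffffffa019dc11 : mov %rdi,%r12). That instruction only copies the register value and does not access memory through it, so it cannot fault even when rdi is NULL.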
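2) How to resolve the problem
- dsa_slim_input needs to check for a NULL pointer internally before dereferencing it.
- The actual program source must be examined to find out why the pointer used in 0x8(%rdi) is NULL at this point. (Presumed to be a bug.)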
3) Execution flow of the dsa_slim_input function
This is the execution flow from the entry of dsa_slim_input up to the instruction (sub 0x8(%rdi),%r11d) at which the BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 message is raised.
Step | Action | Detail | Note |
---|---|---|---|
1 | Create the stack frame | push %rbp, mov %rsp,%rbp | dsa_slim_input is called |
2 | Save register values | e.g. push %r15 | |
3 | Copy values | Uses the rsi and rdi registers with the mov instruction | |
4 | Change the byte order (little endian <-> big endian) | bswap converts little endian to big endian, and big endian to little endian. | Data sent over a network uses big endian byte order by convention, which is probably why the byte order is changed |
5 | if (2 & 0xd(%rax)) | Bitwise AND of 2 and the byte at 0xd(%rax). je jumps to 0xffffffffa019dc60 when the result is zero (the jump was taken in this crash) | ANDing the constant 2 with a byte in memory suggests that a particular kind of packet is being identified |
5-(1) | mov -0x48(%rbp),%rax | move | |
5-(2) | mov %r13d,%r11d | move | |
5-(3) | mov %ebx,%r10d | move | |
5-(4) | sub 0x8(%rdi),%r11d | BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 occurs | |
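Step 5 of the flow above depends on how test and je interact, which is easy to get backwards. The following is a minimal sketch of the flag logic in Python. The byte values are hypothetical, since the actual contents of 0xd(%rax) are not shown in the excerpt. As an aside, and purely as an assumption: the offsets 0x4, 0x8 and 0xd coincide with the sequence number, acknowledgment number and flags byte of a TCP header, where 0x02 is the SYN flag. Nothing in the dump excerpt confirms that rax points at a TCP header.

```python
def zf_after_test(mask, value):
    """x86 test: compute mask AND value, discard the result, set ZF when it is zero."""
    return (mask & value) == 0

def je_taken(zf):
    """je (jump if equal, also known as jz) is taken when ZF is set."""
    return zf

# Hypothetical byte values for 0xd(%rax); the real value is not in the excerpt.
results = []
for byte in (0x00, 0x02, 0x10, 0x12):
    taken = je_taken(zf_after_test(0x2, byte))
    results.append((byte, taken))
    print(hex(byte), "jump to +112" if taken else "fall through to +54")
```

The jump to dsa_slim_input+112 is therefore taken when bit 0x2 is clear, and the fall-through path at +54 runs when the bit is set.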
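4) Function of dsa_slim_input
The function of dsa_slim_input is guessed to be as follows.
Step | Function | Note |
---|---|---|
1 | Converts fields of the packets received by server K between network byte order (big endian) and host byte order (little endian on x86) | |
2 | Identifies a particular kind of packet with a bitwise operation | enters if..else |
2-(1) | Executes the corresponding instructions when the packet-identifying condition is true. | |
2-(2) | Executes the corresponding instructions when the packet-identifying condition is false. | |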
Judging from the execution flow and function of dsa_slim_input, the log lines below, seen with the log command in the crash shell, appear to come from the Deep Security functionality of DSA (Deep Security Agent) rather than from Heartbeat.
(..omitted..)
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915757 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915905 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915905 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927
net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; local.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927
(..omitted..)
The Deep Security version should be checked first, and then the Deep Security manual for that version should be reviewed. The manual mentions IPv6. Whether Deep Security provides Java-based services also needs to be checked.
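To compare many of these net.tcp lines at once, the two sequence numbers can be extracted and their difference computed. The following is a minimal sketch in Python. The regular expression and the function name seq_delta are assumptions based only on the line format visible in this log, not on any documented Deep Security log format.

```python
import re

# Matches the "local.start_seq A != tcph seq B" (or remote.start_seq) part of a log line.
SEQ_RE = re.compile(
    r"(?P<side>local|remote)\.start_seq (?P<start>\d+) != tcph seq (?P<seq>\d+)"
)

def seq_delta(line):
    """Return (side, start_seq, tcph_seq, delta), or None when the line does not match."""
    m = SEQ_RE.search(line)
    if m is None:
        return None
    start = int(m.group("start"))
    seq = int(m.group("seq"))
    # TCP sequence numbers are 32-bit values, so the difference wraps modulo 2**32.
    return m.group("side"), start, seq, (seq - start) % (1 << 32)

line = ("net.tcp/1 | Unexpected pkt out in establising_1 out(local set=1) ; "
        "local.start_seq 4014915650 != tcph seq 4014915916 | tb_tcpv6_conn.c:927")
print(seq_delta(line))
# ('local', 4014915650, 4014915916, 266)
```

Applied to the whole log, this shows how far each reported packet lies from the recorded start sequence number of its connection.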