RCTF Valgrind
Valgrind
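The challenge gives us the log that an angr.SimState object printed while it was executing.
There is plenty of material online, but it is somewhat scattered, so I will give a short introduction to help you get started quickly. This is as far as I have explored for now; when I have time I will read the source and dig deeper.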
Angr : A powerful and user-friendly binary analysis platform!
Some background
Set up the angr environment: https://docs.angr.io/introductory-errata/install
## On my WSL2
git clone https://github.com/angr/angr-dev
cd angr-dev
./setup.sh -i -e angr
## If a package fails to download, use a proxy or download it manually
## Next, configure ~/.bashrc
export WORKON_HOME=~/.virtualenvs
source $(find / -name "virtualenvwrapper.sh")
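## The path to virtualenvwrapper.sh can also be set by hand
## If angr cannot be imported inside the angr venv,
## run pip install -e angr inside the angr venv
## pip's -e flag performs an editable install: the package is linked to its source directory instead of being copied
Let's go from shallow to deep.
First, let's load a program with angr.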
In [1]: import monkeyhex,cle,pyvex
In [2]: ld = cle.Loader("/bin/true")
In [3]: data = ld.memory.load(ld.main_object.entry,0x100)
In [4]: data
Out[4]: b'1\xedI\x89\xd1^H\x89\xe2H\x83\xe4\xf0PTL\x8d\x05*2\x00\x00H\x8d\r\xb31\x00\x00H\x8d=\x1c\xff\xff\xff\xff\x15\xfeW \x00\xf4\x0f\x1fD\x00\x00H\x8d=\x99X \x00UH\x8d\x05\x91X \x00H9\xf8H\x89\xe5t\x19H\x8b\x05\xd2W \x00H\x85\xc0t\r]\xff\xe0f.\x0f\x1f\x84\x00\x00\x00\x00\x00]\xc3\x0f\x1f@\x00f.\x0f\x1f\x84\x00\x00\x00\x00\x00H\x8d=YX \x00H\x8d5RX \x00UH)\xfeH\x89\xe5H\xc1\xfe\x03H\x89\xf0H\xc1\xe8?H\x01\xc6H\xd1\xfet\x18H\x8b\x05\x99W \x00H\x85\xc0t\x0c]\xff\xe0f\x0f\x1f\x84\x00\x00\x00\x00\x00]\xc3\x0f\x1f@\x00f.\x0f\x1f\x84\x00\x00\x00\x00\x00\x80=1X \x00\x00u/H\x83=oW \x00\x00UH\x89\xe5t\x0cH\x8b=zW \x00\xe8M\xfe\xff\xff\xe8H\xff\xff\xff\xc6\x05\tX \x00\x01]\xc3\x0f\x1f\x80\x00\x00\x00\x00\xf3\xc3f\x0f\x1fD\x00\x00'
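Step one: loading. At this point the binary has been loaded into memory by cle. I will leave the dynamic linker aside for now. It can of course be present, and without it jumps into some functions would not land at the right location.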
In [5]: irsb = pyvex.lift(data,ld.main_object.entry,ld.main_object.arch)
In [6]: irsb
Out[6]: IRSB <0x2a bytes, 11 ins., <Arch AMD64 (LE)>> at 0x4017b0
In [7]: irsb.pp()
IRSB {
t0:Ity_I32 t1:Ity_I32 t2:Ity_I32 t3:Ity_I64 t4:Ity_I64 t5:Ity_I64 t6:Ity_I64 t7:Ity_I64 t8:Ity_I64 t9:Ity_I64 t10:Ity_I64 t11:Ity_I64 t12:Ity_I64 t13:Ity_I64 t14:Ity_I64 t15:Ity_I32 t16:Ity_I64 t17:Ity_I64 t18:Ity_I64 t19:Ity_I64 t20:Ity_I32 t21:Ity_I64 t22:Ity_I32 t23:Ity_I64 t24:Ity_I64 t25:Ity_I64 t26:Ity_I64 t27:Ity_I64 t28:Ity_I64 t29:Ity_I64 t30:Ity_I64 t31:Ity_I64 t32:Ity_I64 t33:Ity_I64 t34:Ity_I64 t35:Ity_I64 t36:Ity_I64
00 | ------ IMark(0x4017b0, 2, 0) ------
01 | PUT(rbp) = 0x0000000000000000
02 | ------ IMark(0x4017b2, 3, 0) ------
03 | t26 = GET:I64(rdx)
04 | PUT(r9) = t26
05 | PUT(rip) = 0x00000000004017b5
06 | ------ IMark(0x4017b5, 1, 0) ------
07 | t4 = GET:I64(rsp)
08 | t3 = LDle:I64(t4)
09 | t27 = Add64(t4,0x0000000000000008)
10 | PUT(rsi) = t3
11 | ------ IMark(0x4017b6, 3, 0) ------
12 | PUT(rdx) = t27
13 | ------ IMark(0x4017b9, 4, 0) ------
14 | t5 = And64(t27,0xfffffffffffffff0)
15 | PUT(cc_op) = 0x0000000000000014
16 | PUT(cc_dep1) = t5
17 | PUT(cc_dep2) = 0x0000000000000000
18 | PUT(rip) = 0x00000000004017bd
19 | ------ IMark(0x4017bd, 1, 0) ------
20 | t8 = GET:I64(rax)
21 | t29 = Sub64(t5,0x0000000000000008)
22 | PUT(rsp) = t29
23 | STle(t29) = t8
24 | PUT(rip) = 0x00000000004017be
25 | ------ IMark(0x4017be, 1, 0) ------
26 | t31 = Sub64(t29,0x0000000000000008)
27 | PUT(rsp) = t31
28 | STle(t31) = t29
29 | ------ IMark(0x4017bf, 7, 0) ------
30 | PUT(r8) = 0x00000000004049f0
31 | ------ IMark(0x4017c6, 7, 0) ------
32 | PUT(rcx) = 0x0000000000404980
33 | ------ IMark(0x4017cd, 7, 0) ------
34 | PUT(rdi) = 0x00000000004016f0
35 | PUT(rip) = 0x00000000004017d4
36 | ------ IMark(0x4017d4, 6, 0) ------
37 | t17 = LDle:I64(0x0000000000606fd8)
38 | t33 = Sub64(t31,0x0000000000000008)
39 | PUT(rsp) = t33
40 | STle(t33) = 0x00000000004017da
41 | t35 = Sub64(t33,0x0000000000000080)
42 | ====== AbiHint(0xt35, 128, t17) ======
NEXT: PUT(rip) = t17; Ijk_Call
}
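We pass this chunk of data to the lift function to turn it into an IRSB. I will call this IRSB an IR block: it is a basic block of VEX IR. To stay compatible with multiple architectures, angr executes this block during dynamic execution, and that execution involves yet another module.
This is why lift takes data, addr, and arch: every IMark records (address, instruction length in bytes, delta), where delta is the offset added to the address to obtain the real guest PC (1 in ARM Thumb mode, 0 otherwise).
There is another kind of block, provided by the capstone module. I have not dug into the lower layers, so I just use the Capstone integrated in angr. Although it is a disassembly module, angr does not seem to expose a direct interface to it. I would expect one that takes data (or addr and size) plus arch and prints the disassembly, but I still need to look for it. For now I will proceed step by step.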
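To make the IMark fields concrete, here is a minimal sketch that parses them out of pretty-printed IR text using only the standard library. The helper name `parse_imarks` and the sample text are illustrative assumptions, not part of pyvex.

```python
import re

# Matches e.g. "IMark(0x4017b0, 2, 0)" and captures address, length, delta.
IMARK_RE = re.compile(r"IMark\((0x[0-9a-fA-F]+), (\d+), (\d+)\)")

def parse_imarks(ir_text):
    """Return a list of (addr, length, delta) tuples found in printed IR."""
    return [(int(a, 16), int(n), int(d)) for a, n, d in IMARK_RE.findall(ir_text)]

# Two instructions copied from the IRSB printed above.
sample = """\
00 | ------ IMark(0x4017b0, 2, 0) ------
01 | PUT(rbp) = 0x0000000000000000
02 | ------ IMark(0x4017b2, 3, 0) ------
03 | t26 = GET:I64(rdx)
"""

marks = parse_imarks(sample)
for addr, length, delta in marks:
    # addr + length is the address of the following instruction.
    print(hex(addr), length, delta, "next:", hex(addr + length))
```

Adding the length to the address gives the next IMark's address, which is a quick way to check that no instruction is missing from a dump.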
In [39]: p = angr.Project("/bin/true")
WARNING | 2021-09-17 23:29:36,414 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.
In [40]: st = p.factory.blank_state(addr = ld.main_object.entry)
In [41]: st.addr
Out[41]: 0x4017b0
In [42]: bb = st.block()
In [43]: bb.pp()
0x4017b0: xor ebp, ebp
0x4017b2: mov r9, rdx
0x4017b5: pop rsi
0x4017b6: mov rdx, rsp
0x4017b9: and rsp, 0xfffffffffffffff0
0x4017bd: push rax
0x4017be: push rsp
0x4017bf: lea r8, [rip + 0x322a]
0x4017c6: lea rcx, [rip + 0x31b3]
0x4017cd: lea rdi, [rip - 0xe4]
0x4017d4: call qword ptr [rip + 0x2057fe]
Next come the disassembly and some attributes that are useful for parsing. A brief demonstration:
In [46]: bb.capstone
Out[46]: <DisassemblerBlock for 0x4017b0>
In [47]: bb.capstone.pp()
0x4017b0: xor ebp, ebp
0x4017b2: mov r9, rdx
0x4017b5: pop rsi
0x4017b6: mov rdx, rsp
0x4017b9: and rsp, 0xfffffffffffffff0
0x4017bd: push rax
0x4017be: push rsp
0x4017bf: lea r8, [rip + 0x322a]
0x4017c6: lea rcx, [rip + 0x31b3]
0x4017cd: lea rdi, [rip - 0xe4]
0x4017d4: call qword ptr [rip + 0x2057fe]
In [48]: bb.capstone.insns
Out[48]:
[<DisassemblerInsn "xor" for 0x4017b0>,
<DisassemblerInsn "mov" for 0x4017b2>,
<DisassemblerInsn "pop" for 0x4017b5>,
<DisassemblerInsn "mov" for 0x4017b6>,
<DisassemblerInsn "and" for 0x4017b9>,
<DisassemblerInsn "push" for 0x4017bd>,
<DisassemblerInsn "push" for 0x4017be>,
<DisassemblerInsn "lea" for 0x4017bf>,
<DisassemblerInsn "lea" for 0x4017c6>,
<DisassemblerInsn "lea" for 0x4017cd>,
<DisassemblerInsn "call" for 0x4017d4>]
In [49]: bb.capstone.insns[0].op_str
Out[49]: 'ebp, ebp'
In [50]: bb.capstone.insns[0].insn
Out[50]: <CsInsn 0x4017b0 [31ed]: xor ebp, ebp>
In [51]: bb.capstone.insns[0].size
Out[51]: 0x2
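That covers a small part of the static-analysis modules.
Now we can get straight to the challenge, haha.
Back to the title: the challenge asks us to analyze a pile of IRSBs (I'll just call them SBs, naming master that I am). We can see fairly clearly that every block ends with a jump target, possibly guarded by a condition. The head of each block declares the block's temporaries (t0, t1, ...) together with their types. They behave like scratch registers, but they are VEX temporaries rather than machine registers.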
IRSB {
t0:Ity_I32 t1:Ity_I32 t2:Ity_I32 t3:Ity_I32 t4:Ity_I32 t5:Ity_I32 t6:Ity_I32
00 | ------ IMark(0x8048464, 3, 0) ------
01 | t2 = GET:I32(esp)
02 | t0 = Sub32(t2,0x00000054)
03 | PUT(cc_op) = 0x00000006
04 | PUT(cc_dep1) = t2
05 | PUT(cc_dep2) = 0x00000054
06 | PUT(cc_ndep) = 0x00000000
07 | PUT(eip) = 0x08048467
08 | ------ IMark(0x8048467, 5, 0) ------
09 | t4 = Sub32(t0,0x00000004)
10 | PUT(esp) = t4
11 | STle(t4) = 0x0804846c
NEXT: PUT(eip) = 0x080485f1; Ijk_Call
}
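Each IMark section in the middle is the VEX IR that one assembly instruction was translated into. That said, the IR under a single IMark is not exactly equivalent to that instruction taken in isolation, so we should read the IR (or the assembly) of a whole block to understand what it is doing.
The annotations on the VEX IR statements are also worth reading.
As for how this output is produced, look back at the lift call we used earlier. Lifting is architecture-specific: each architecture supplies its own lifter behind lift.
The challenge log is long, about 4k lines, which does not lend itself to linear analysis. So we can write a Python script to extract what we need.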
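Since a block has to be read as a whole, it helps to pull out its exits first. The following is a rough sketch, assuming the log uses exactly the text format shown in this post; `block_exits` is a hypothetical helper, not an angr or pyvex API.

```python
import re

# Conditional side exit: "if (t25) { PUT(eip) = 0x8048554; Ijk_Boring }"
COND_EXIT_RE = re.compile(r"if \((t\d+)\) \{ PUT\(\w+\) = (0x[0-9a-f]+); (Ijk_\w+) \}")
# Default exit: "NEXT: PUT(eip) = 0x08048574; Ijk_Boring" (target may be a temp like t9)
NEXT_RE = re.compile(r"NEXT: PUT\(\w+\) = (\w+); (Ijk_\w+)")

def block_exits(ir_text):
    """Return (guard, target, jumpkind) tuples; guard is None for the default exit."""
    exits = [(g, t, k) for g, t, k in COND_EXIT_RE.findall(ir_text)]
    m = NEXT_RE.search(ir_text)
    if m:
        exits.append((None, m.group(1), m.group(2)))
    return exits

# Tail of the block at 0x8048552 from the challenge log.
sample = """\
34 | t35 = CmpLE32S(t7,0x0000005a)
39 | if (t25) { PUT(eip) = 0x8048554; Ijk_Boring }
NEXT: PUT(eip) = 0x08048574; Ijk_Boring
"""

exits = block_exits(sample)
for guard, target, kind in exits:
    print(guard, target, kind)
```

A block with one guarded exit plus a NEXT line is a two-way branch; a block with only NEXT and `Ijk_Call` or `Ijk_Ret` is a call or a return.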
import re
import IPython

# Read the whole challenge log.
with open("./oldlog.log", "r") as file1:
    data = file1.read()

# Collect every instruction address that appears in an IMark, sorted and unique.
target = r"IMark\((\w+), \d+, \d+\)"
res = re.findall(target, data)
addr_int_list = sorted({int(x, 16) for x in res})
addr_str_list = [hex(x) for x in addr_int_list]

addr_code = {}
for addr in addr_str_list:
    # target1: statements from this IMark up to the end of the block ("}").
    # target2: statements from this IMark up to the next IMark.
    target1 = r'------ IMark\(%s, \d{1,3}, \d{1,3}\) ------\n([\w\W]*?)\n\}' % addr
    target2 = r'------ IMark\(%s, \d{1,3}, \d{1,3}\) ------\n([\w\W]*?) \d{1,3} \| ------ IMark' % addr
    code1 = re.search(target1, data)
    code2 = re.search(target2, data)
    if code1 is not None and code2 is not None:
        # Keep the shorter match: it is the one that stops at the nearest boundary.
        addr_code[addr] = code1.group(1) if len(code1.group(1)) < len(code2.group(1)) else code2.group(1)
    elif code1 is not None:
        addr_code[addr] = code1.group(1)
    elif code2 is not None:
        addr_code[addr] = code2.group(1)

with open("./newlog.log", "w") as file2:
    for addr in addr_str_list:
        code = addr_code.get(addr, '')
        if addr in ('0x8048554', '0x8048552'):
            # Debug output for the two addresses around the conditional branch.
            print(addr + '\n')
            print(code)
        file2.write(addr + '\n')
        if code != '' and code[-1] != '\n':
            file2.write(code + '\n')
        else:
            file2.write(code)

# Drop into an interactive shell to inspect addr_code.
IPython.embed()
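As an alternative to the regex-per-address approach above, the same grouping can be sketched as a single pass over the lines. This is only an illustration under the assumption that the log looks like the IRSB dumps in this post; `group_by_imark` is a made-up name, and unlike the script above it simply keeps the last copy of an address that was executed more than once.

```python
import re

IMARK_LINE = re.compile(r"IMark\((0x[0-9a-f]+), \d+, \d+\)")

def group_by_imark(log_text):
    """Map each instruction address to the IR lines that follow its IMark."""
    groups = {}
    current = None
    for line in log_text.splitlines():
        m = IMARK_LINE.search(line)
        if m:
            current = m.group(1)
            groups[current] = []      # a re-executed address overwrites the earlier copy
        elif line.strip() == "}":
            current = None            # end of block; skip headers until the next IMark
        elif current is not None:
            groups[current].append(line.strip())
    return groups

sample = """\
IRSB {
t0:Ity_I32 t1:Ity_I32 t2:Ity_I32 t3:Ity_I32 t4:Ity_I32
00 | ------ IMark(0x8048464, 3, 0) ------
01 | t2 = GET:I32(esp)
02 | t0 = Sub32(t2,0x00000054)
08 | ------ IMark(0x8048467, 5, 0) ------
09 | t4 = Sub32(t0,0x00000004)
10 | PUT(esp) = t4
11 | STle(t4) = 0x0804846c
NEXT: PUT(eip) = 0x080485f1; Ijk_Call
}
"""

groups = group_by_imark(sample)
for addr, stmts in groups.items():
    print(addr)
    for s in stmts:
        print("   ", s)
```

The NEXT line ends up attached to the last instruction of its block, which matches how the extracted listing below is laid out.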
This yields the following set of IRSBs. If you don't want to read them, feel free to skip ahead to the part below.
0x8048464
01 | t2 = GET:I32(esp)
02 | t0 = Sub32(t2,0x00000054)
03 | PUT(cc_op) = 0x00000006
04 | PUT(cc_dep1) = t2
05 | PUT(cc_dep2) = 0x00000054
06 | PUT(cc_ndep) = 0x00000000
07 | PUT(eip) = 0x08048467
0x8048467
09 | t4 = Sub32(t0,0x00000004)
10 | PUT(esp) = t4
11 | STle(t4) = 0x0804846c # Write a value to memory
NEXT: PUT(eip) = 0x080485f1; Ijk_Call
0x804846c
01 | t0 = GET:I32(eax)
02 | t2 = Add32(t0,0x00001b94)
03 | PUT(cc_op) = 0x00000003
04 | PUT(cc_dep1) = t0
05 | PUT(cc_dep2) = 0x00001b94
06 | PUT(cc_ndep) = 0x00000000
07 | PUT(eax) = t2
08 | PUT(eip) = 0x08048471
0x8048471
10 | t51 = GET:I16(gs)
11 | t50 = 16Uto32(t51)
12 | t5 = GET:I64(ldt)
13 | t6 = GET:I64(gdt)
14 | t52 = x86g_use_seg_selector(t5,t6,t50,0x00000014):Ity_I64
15 | t54 = 64HIto32(t52)
16 | t53 = CmpNE32(t54,0x00000000)
17 | if (t53) { PUT(eip) = 0x8048471; Ijk_MapFail }
18 | t55 = 64to32(t52)
19 | t56 = LDle:I32(t55)
20 | PUT(eip) = 0x08048477
0x8048477
22 | t58 = GET:I32(ebp)
23 | t57 = Add32(t58,0xfffffff4)
24 | STle(t57) = t56
0x804847a
26 | PUT(cc_op) = 0x0000000f
27 | PUT(cc_dep1) = 0x00000000
28 | PUT(cc_dep2) = 0x00000000
29 | PUT(cc_ndep) = 0x00000000
30 | PUT(eax) = 0x00000000
31 | PUT(eip) = 0x0804847c
0x804847c
33 | t60 = Add32(t58,0xffffffce)
34 | STle(t60) = 0x656d3174
35 | PUT(eip) = 0x08048483
0x8048483
37 | t62 = Add32(t58,0xffffffd2)
38 | STle(t62) = 0x7530795f
39 | PUT(eip) = 0x0804848a
0x804848a
41 | t64 = Add32(t58,0xffffffd6)
42 | STle(t64) = 0x6a6e655f
43 | PUT(eip) = 0x08048491
0x8048491
45 | t66 = Add32(t58,0xffffffda)
46 | STle(t66) = 0x775f7930
47 | PUT(eip) = 0x08048498
0x8048498
49 | t68 = Add32(t58,0xffffffde)
50 | STle(t68) = 0x31743561
51 | PUT(eip) = 0x0804849f
0x804849f
53 | t70 = Add32(t58,0xffffffe2)
54 | STle(t70) = 0x775f676e
55 | PUT(eip) = 0x080484a6
0x80484a6
57 | t72 = Add32(t58,0xffffffe6)
58 | STle(t72) = 0x6e5f3561
59 | PUT(eip) = 0x080484ad
0x80484ad
61 | t74 = Add32(t58,0xffffffea)
62 | STle(t74) = 0x775f746f
63 | PUT(eip) = 0x080484b4
0x80484b4
65 | t76 = Add32(t58,0xffffffee)
66 | STle(t76) = 0x65743561
67 | PUT(eip) = 0x080484bb
0x80484bb
69 | t78 = Add32(t58,0xfffffff2)
70 | STle(t78) = 0x0064
71 | PUT(eip) = 0x080484c1
0x80484c1
73 | t80 = Add32(t58,0xffffffac)
74 | STle(t80) = 0x00000003
75 | PUT(eip) = 0x080484c8
0x80484c8
77 | t82 = Add32(t58,0xffffffb4)
78 | STle(t82) = 0x41
79 | PUT(eip) = 0x080484cc
0x80484cc
81 | t84 = Add32(t58,0xffffffb5)
82 | STle(t84) = 0x42
83 | PUT(eip) = 0x080484d0
0x80484d0
85 | t86 = Add32(t58,0xffffffb6)
86 | STle(t86) = 0x43
87 | PUT(eip) = 0x080484d4
0x80484d4
89 | t88 = Add32(t58,0xffffffb7)
90 | STle(t88) = 0x44
91 | PUT(eip) = 0x080484d8
0x80484d8
93 | t90 = Add32(t58,0xffffffb8)
94 | STle(t90) = 0x45
95 | PUT(eip) = 0x080484dc
0x80484dc
97 | t92 = Add32(t58,0xffffffb9)
98 | STle(t92) = 0x46
99 | PUT(eip) = 0x080484e0
0x80484e0
101 | t94 = Add32(t58,0xffffffba)
102 | STle(t94) = 0x47
103 | PUT(eip) = 0x080484e4
0x80484e4
105 | t96 = Add32(t58,0xffffffbb)
106 | STle(t96) = 0x48
107 | PUT(eip) = 0x080484e8
0x80484e8
109 | t98 = Add32(t58,0xffffffbc)
110 | STle(t98) = 0x49
111 | PUT(eip) = 0x080484ec
0x80484ec
113 | t100 = Add32(t58,0xffffffbd)
114 | STle(t100) = 0x4a
115 | PUT(eip) = 0x080484f0
0x80484f0
117 | t102 = Add32(t58,0xffffffbe)
118 | STle(t102) = 0x4b
119 | PUT(eip) = 0x080484f4
0x80484f4
121 | t104 = Add32(t58,0xffffffbf)
122 | STle(t104) = 0x4c
123 | PUT(eip) = 0x080484f8
0x80484f8
125 | t106 = Add32(t58,0xffffffc0)
126 | STle(t106) = 0x4d
127 | PUT(eip) = 0x080484fc
0x80484fc
129 | t108 = Add32(t58,0xffffffc1)
130 | STle(t108) = 0x4e
131 | PUT(eip) = 0x08048500
0x8048500
133 | t110 = Add32(t58,0xffffffc2)
134 | STle(t110) = 0x4f
135 | PUT(eip) = 0x08048504
0x8048504
137 | t112 = Add32(t58,0xffffffc3)
138 | STle(t112) = 0x50
139 | PUT(eip) = 0x08048508
0x8048508
141 | t114 = Add32(t58,0xffffffc4)
142 | STle(t114) = 0x51
143 | PUT(eip) = 0x0804850c
0x804850c
145 | t116 = Add32(t58,0xffffffc5)
146 | STle(t116) = 0x52
147 | PUT(eip) = 0x08048510
0x8048510
149 | t118 = Add32(t58,0xffffffc6)
150 | STle(t118) = 0x53
151 | PUT(eip) = 0x08048514
0x8048514
153 | t120 = Add32(t58,0xffffffc7)
154 | STle(t120) = 0x54
155 | PUT(eip) = 0x08048518
0x8048518
157 | t122 = Add32(t58,0xffffffc8)
158 | STle(t122) = 0x55
159 | PUT(eip) = 0x0804851c
0x804851c
161 | t124 = Add32(t58,0xffffffc9)
162 | STle(t124) = 0x56
163 | PUT(eip) = 0x08048520
0x8048520
165 | t126 = Add32(t58,0xffffffca)
166 | STle(t126) = 0x57
167 | PUT(eip) = 0x08048524
0x8048524
169 | t128 = Add32(t58,0xffffffcb)
170 | STle(t128) = 0x58
171 | PUT(eip) = 0x08048528
0x8048528
173 | t130 = Add32(t58,0xffffffcc)
174 | STle(t130) = 0x59
175 | PUT(eip) = 0x0804852c
0x804852c
177 | t132 = Add32(t58,0xffffffcd)
178 | STle(t132) = 0x5a
179 | PUT(eip) = 0x08048530
0x8048530
181 | t134 = Add32(t58,0xffffffa8)
182 | STle(t134) = 0x00000000
0x8048537
NEXT: PUT(eip) = 0x080485c8; Ijk_Boring
0x804853c
01 | t14 = GET:I32(ebp)
02 | t13 = Add32(t14,0xffffffce)
03 | PUT(eip) = 0x0804853f
0x804853f
05 | t15 = Add32(t14,0xffffffa8)
06 | t17 = LDle:I32(t15)
0x8048542
08 | t2 = Add32(t17,t13)
09 | PUT(eip) = 0x08048544
0x8048544
11 | t19 = LDle:I8(t2)
12 | t32 = 8Uto32(t19)
13 | t18 = t32
14 | PUT(eax) = t18
0x8048547
16 | t21 = GET:I8(al)
17 | t33 = 8Sto32(t21)
18 | t20 = t33
19 | PUT(edx) = t20
20 | PUT(eip) = 0x0804854a
0x804854a
22 | t22 = Add32(t14,0xffffffac)
23 | t24 = LDle:I32(t22)
0x804854d
25 | t7 = Add32(t24,t20)
26 | PUT(eax) = t7
0x804854f
28 | PUT(cc_op) = 0x00000006
29 | PUT(cc_dep1) = t7
30 | PUT(cc_dep2) = 0x0000005a
31 | PUT(cc_ndep) = 0x00000000
32 | PUT(eip) = 0x08048552
0x8048552
34 | t35 = CmpLE32S(t7,0x0000005a)
35 | t34 = 1Uto32(t35)
36 | t30 = t34
37 | t36 = 32to1(t30)
38 | t25 = t36
39 | if (t25) { PUT(eip) = 0x8048554; Ijk_Boring }
NEXT: PUT(eip) = 0x08048574; Ijk_Boring
0x8048554
01 | t17 = GET:I32(ebp)
02 | t16 = Add32(t17,0xffffffce)
03 | PUT(eip) = 0x08048557
0x8048557
05 | t18 = Add32(t17,0xffffffa8)
06 | t20 = LDle:I32(t18)
0x804855a
08 | t2 = Add32(t20,t16)
09 | PUT(eip) = 0x0804855c
0x804855c
11 | t22 = LDle:I8(t2)
12 | t21 = 8Uto32(t22)
0x804855f
14 | PUT(eip) = 0x08048561
0x8048561
16 | t24 = Add32(t17,0xffffffac)
17 | t26 = LDle:I32(t24)
0x8048564
19 | t7 = Add32(t26,t21)
0x8048566
21 | PUT(ecx) = t7
0x8048568
23 | t28 = Add32(t17,0xffffffce)
24 | PUT(edx) = t28
25 | PUT(eip) = 0x0804856b
0x804856b
27 | t30 = Add32(t17,0xffffffa8)
28 | t32 = LDle:I32(t30)
0x804856e
30 | t12 = Add32(t32,t28)
31 | PUT(cc_op) = 0x00000003
32 | PUT(cc_dep1) = t32
33 | PUT(cc_dep2) = t28
34 | PUT(cc_ndep) = 0x00000000
35 | PUT(eax) = t12
36 | PUT(eip) = 0x08048570
0x8048570
38 | t33 = GET:I8(cl)
39 | STle(t12) = t33
0x8048572
NEXT: PUT(eip) = 0x080485c4; Ijk_Boring
0x8048574
01 | t63 = GET:I32(ebp)
02 | t62 = Add32(t63,0xffffffce)
03 | PUT(eip) = 0x08048577
0x8048577
05 | t64 = Add32(t63,0xffffffa8)
06 | t66 = LDle:I32(t64)
0x804857a
08 | t2 = Add32(t66,t62)
09 | PUT(eip) = 0x0804857c
0x804857c
11 | t68 = LDle:I8(t2)
12 | t140 = 8Uto32(t68)
13 | t67 = t140
14 | PUT(eax) = t67
0x804857f
16 | t70 = GET:I8(al)
17 | t141 = 8Sto32(t70)
18 | t69 = t141
19 | PUT(eip) = 0x08048582
0x8048582
21 | t71 = Add32(t63,0xffffffac)
22 | t73 = LDle:I32(t71)
0x8048585
24 | t7 = Add32(t73,t69)
0x8048587
26 | t74 = Add32(t7,0xffffffa6)
0x804858a
0x804858f
0x8048591
30 | t14 = MullS32(t74,0x4ec4ec4f)
31 | t142 = 64HIto32(t14)
32 | t77 = t142
0x8048593
34 | t20 = Sar32(t77,0x03)
0x8048596
0x8048598
37 | t27 = Sar32(t74,0x1f)
0x804859b
39 | t31 = Sub32(t20,t27)
0x804859d
41 | PUT(eip) = 0x0804859f
0x804859f
43 | t103 = Add32(t63,0xffffffb0)
44 | STle(t103) = t31
45 | PUT(eip) = 0x080485a2
0x80485a2
47 | t106 = Add32(t63,0xffffffb0)
48 | t108 = LDle:I32(t106)
0x80485a5
50 | t38 = Mul32(t108,0x0000001a)
0x80485a8
52 | t39 = Sub32(t74,t38)
0x80485aa
54 | PUT(eip) = 0x080485ac
0x80485ac
56 | t110 = Add32(t63,0xffffffb0)
57 | STle(t110) = t39
58 | PUT(eip) = 0x080485af
0x80485af
60 | t113 = Add32(t63,0xffffffb0)
61 | t115 = LDle:I32(t113)
0x80485b2
63 | t44 = Sub32(t115,0x00000001)
64 | PUT(eip) = 0x080485b5
0x80485b5
66 | t117 = Add32(t63,t44)
67 | t116 = Add32(t117,0xffffffb4)
68 | t122 = LDle:I8(t116)
69 | t143 = 8Uto32(t122)
70 | t121 = t143
71 | PUT(eax) = t121
0x80485ba
73 | t123 = Add32(t63,0xffffffce)
74 | PUT(ecx) = t123
75 | PUT(eip) = 0x080485bd
0x80485bd
77 | t125 = Add32(t63,0xffffffa8)
78 | t127 = LDle:I32(t125)
0x80485c0
80 | t50 = Add32(t127,t123)
81 | PUT(edx) = t50
82 | PUT(eip) = 0x080485c2
0x80485c2
84 | t128 = GET:I8(al)
85 | STle(t50) = t128
86 | PUT(eip) = 0x080485c4
0x80485c4
88 | t129 = Add32(t63,0xffffffa8)
89 | t56 = LDle:I32(t129)
90 | t54 = Add32(t56,0x00000001)
91 | STle(t129) = t54
92 | PUT(eip) = 0x080485c8
0x80485c8
01 | t5 = GET:I32(ebp)
02 | t4 = Add32(t5,0xffffffa8)
03 | t2 = LDle:I32(t4) #A load from memory
04 | PUT(cc_op) = 0x00000006
05 | PUT(cc_dep1) = t2
06 | PUT(cc_dep2) = 0x00000024
07 | PUT(cc_ndep) = 0x00000000
08 | PUT(eip) = 0x080485cc
0x80485cc
10 | t14 = CmpLE32S(t2,0x00000024)
11 | t13 = 1Uto32(t14)
12 | t11 = t13
13 | t15 = 32to1(t11)
14 | t6 = t15
15 | if (t6) { PUT(eip) = 0x804853c; Ijk_Boring }
NEXT: PUT(eip) = 0x080485d2; Ijk_Boring
0x80485d2
01 | PUT(eax) = 0x00000000
02 | PUT(eip) = 0x080485d7
0x80485d7
04 | t10 = GET:I32(ebp)
05 | t9 = Add32(t10,0xfffffff4)
06 | t11 = LDle:I32(t9)
07 | PUT(ecx) = t11
08 | PUT(eip) = 0x080485da
0x80485da
10 | t13 = GET:I16(gs)
11 | t25 = 16Uto32(t13)
12 | t12 = t25
13 | t5 = GET:I64(ldt)
14 | t6 = GET:I64(gdt)
15 | t26 = x86g_use_seg_selector(t5,t6,t12,0x00000014):Ity_I64
16 | t14 = t26
17 | t27 = 64HIto32(t14)
18 | t16 = t27
19 | t15 = CmpNE32(t16,0x00000000)
20 | if (t15) { PUT(eip) = 0x80485da; Ijk_MapFail }
21 | t28 = 64to32(t14)
22 | t17 = t28
23 | t2 = LDle:I32(t17)
24 | t1 = Xor32(t11,t2)
25 | PUT(cc_op) = 0x0000000f
26 | PUT(cc_dep1) = t1
27 | PUT(cc_dep2) = 0x00000000
28 | PUT(cc_ndep) = 0x00000000
29 | PUT(ecx) = t1
30 | PUT(eip) = 0x080485e1
0x80485e1
32 | t30 = CmpEQ32(t1,0x00000000)
33 | t29 = 1Uto32(t30)
34 | t23 = t29
35 | t31 = 32to1(t23)
36 | t18 = t31
37 | if (t18) { PUT(eip) = 0x80485e8; Ijk_Boring }
NEXT: PUT(eip) = 0x080485e3; Ijk_Boring
0x80485e8
01 | t2 = GET:I32(esp)
02 | t0 = Add32(t2,0x00000054)
03 | PUT(cc_op) = 0x00000003
04 | PUT(cc_dep1) = t2
05 | PUT(cc_dep2) = 0x00000054
06 | PUT(cc_ndep) = 0x00000000
07 | PUT(esp) = t0
08 | PUT(eip) = 0x080485eb
0x80485eb
10 | t3 = LDle:I32(t0)
11 | t10 = Add32(t0,0x00000004)
12 | PUT(esp) = t10
13 | PUT(ecx) = t3
14 | PUT(eip) = 0x080485ec
0x80485ec
16 | t5 = LDle:I32(t10)
17 | PUT(ebp) = t5
0x80485ed
19 | t12 = Add32(t3,0xfffffffc)
20 | PUT(esp) = t12
21 | PUT(eip) = 0x080485f0
0x80485f0
23 | t9 = LDle:I32(t12)
24 | t14 = Add32(t12,0x00000004)
25 | PUT(esp) = t14
NEXT: PUT(eip) = t9; Ijk_Ret
0x80485f1
01 | t0 = GET:I32(esp)
02 | t3 = LDle:I32(t0)
03 | PUT(eax) = t3
04 | PUT(eip) = 0x080485f4
0x80485f4
06 | t2 = LDle:I32(t0)
07 | t4 = Add32(t0,0x00000004)
08 | PUT(esp) = t4
NEXT: PUT(eip) = t2; Ijk_Ret
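In the listing above, the stores at 0x804847c through 0x80484bb write constants to consecutive stack slots starting at ebp-0x32. Because STle is a little-endian store, the bytes can be recovered with a short sketch like the one below. The constants are copied from the listing; the variable names are my own.

```python
import struct

# 32-bit constants stored at ebp-0x32, ebp-0x2e, ..., ebp-0x12 (from the listing).
dword_stores = [
    0x656d3174, 0x7530795f, 0x6a6e655f,
    0x775f7930, 0x31743561, 0x775f676e,
    0x6e5f3561, 0x775f746f, 0x65743561,
]
# 16-bit constant stored at ebp-0x0e: one character plus the NUL terminator.
tail = 0x0064

# "<I" / "<H" pack little-endian, matching STle.
buf = b"".join(struct.pack("<I", v) for v in dword_stores) + struct.pack("<H", tail)
text = buf.rstrip(b"\x00").decode("ascii")
print(text)  # t1me_y0u_enj0y_wa5t1ng_wa5_not_wa5ted
```

The single-byte stores that follow (0x41 through 0x5a at ebp-0x4c and up) build the A..Z lookup table in the same way.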
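Based on the semantics, we can work out what each block does and where it jumps.
Of course, we can also go the other way and write the assembly back by hand; after all it is only about 110 lines of assembly.
In the end, what we are doing is performing the parsing logic by hand.
Below is the pseudocode I wrote, condensed to about 110 lines. Reading the IRSBs directly also works, and this kind of manual disassembly is actually a fairly poor use of time.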
0x8048464 sub esp,0x54
0x8048467 call 0x80485f1 # pushes return address 0x0804846c; the callee loads it into eax and returns
0x804846c add eax,0x1b94
0x8048471
0x8048477 mov [ebp-0x0C],[x86g_use_seg_selector(t5,t6,t50,0x00000014)]
0x804847a mov eax,0
0x804847c mov.d [ebp-0x32],0x656d3174
0x8048483 mov.d [ebp-0x2E],0x7530795f
0x804848a mov.d [ebp-0x2A],0x6a6e655f
...
0x80484b4 mov.d [ebp-0x12],0x65743561
0x80484bb mov.w [ebp-0x0E],0x0064
0x80484c1 mov.d [ebp-0x54],0x00000003
0x80484c8 mov.b [ebp-0x4C],0x41
0x80484cc mov.b [ebp-0x4B],0x42
...
0x804852c mov.b [ebp-0x33],0x5a
0x8048530 mov.d [ebp-0x58],0x00000000
0x8048537 jmp 0x80485c8-----------------------------|
0x804853c lea edx,[ebp-0x32]<-------| # add a2,[ebp-0x58],ebp-0x32
0x804853f mov.d eax,[ebp-0x58]
0x8048542 add eax,edx
0x8048544 mov.b eax,[eax]
0x8048547 mov edx,al # get a byte
0x804854a mov eax,[ebp-0x54] # every character is shifted by 3
0x804854d add eax,edx
0x804854f cmp eax,0x0000005a
0x8048552 jle 0x8048554---------------| jmp 0x8048574
0x8048554 lea edx,[ebp-0x32]
0x8048557 mov eax,[ebp-0x58]
0x804855a add edx,eax
0x804855c mov.bb al,[edx]
0x804855f # shift by 3
0x8048561 mov t26,[ebp-0x54]
0x8048564 add t7,t26,eax
0x8048566 mov ecx,t7
0x8048568 lea edx,[ebp-0x32]
0x804856b
0x804856e mov eax,(edx+[ebp-0x58])
0x8048570 mov [eax],cl
0x8048572 jmp 0x80485c4
|
|
0x8048574 lea edx,[ebp-0x32]<---------|
0x8048577 mov ecx,[ebp-0x58]
0x804857a add ecx,edx
0x804857c mov.b ecx,[ecx]
0x804857f mov edx,al #(al+3-0x5A)*0x4ec4ec4f>>3
0x8048582 mov ecx,[ebp-0x54]
0x8048585 add ecx,edx
0x8048587 sub ecx,0x5A
0x8048591 imul 0x4ec4ec4f # edx = high 32 bits of the signed product ecx*0x4ec4ec4f
0x8048593 sar edx,0x3
0x8048596 mov eax,ecx
0x8048598 sar eax,0x1f
0x804859b sub edx,eax
0x804859d jmp 0x0804859f
0x804859f mov [ebp-0x50],edx
0x80485a2 mov eax,[ebp-0x50] #eax=eax*0x0000001a
0x80485a5 mul 0x0000001a
0x80485a8 sub ecx,eax
0x80485aa nop
0x80485ac mov [ebp-0x50],ecx
0x80485af mov ecx,[ebp-0x50]
0x80485b2 sub ecx,0x00000001
0x80485b5 mov.b eax,[ebp+ecx-0x4C] # table lookup indexed by ecx
0x80485ba lea ecx,[ebp-0x32]
0x80485bd mov t127,[ebp-0x58]
0x80485c0 mov edx,t127+ecx
0x80485c2 mov.b [edx],eax
0x80485c4 add [ebp-0x58],0x1---------|
0x80485c8 cmp [ebp-0x58],0x24------------| |
0x80485cc jle 0x804853c ----------------------|
0x80485d2 mov eax,0x0
0x80485d7 mov ecx,[ebp-0x0C]
0x80485da xor ecx,[x86g_use_seg_selector(t5,t6,t12,0x00000014)]
0x80485e1 cmp ecx,0x00; je 0x80485e8;jmp out
0x80485e8 add esp,0x54
0x80485eb mov ecx,[esp] ; add esp,0x04
0x80485ec mov ebp,[esp]
0x80485ed
0x80485f0 ret
...
Once we know what each stack slot is used for, we can see that the program's logic is a shift followed by a modulo.
Decryption script
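The modulo is not written as a division in the IR: the MullS32 by 0x4ec4ec4f followed by the two Sar32 and the Sub32 is the usual multiply-and-shift form of a signed division by 26. Here is a small sketch that emulates that sequence on Python integers, assuming x fits in a signed 32-bit value; `div26` and `mod26` are illustrative names.

```python
MAGIC = 0x4EC4EC4F  # equals ceil(2**35 / 26)

def div26(x):
    """Emulate MullS32 / 64HIto32 / Sar32(.,3) / Sar32(x,0x1f) / Sub32 for signed 32-bit x."""
    hi = (x * MAGIC) >> 32   # high 32 bits of the signed 64-bit product
    q = hi >> 3              # arithmetic shift, total shift is 35 bits
    sign = x >> 31           # 0 for x >= 0, -1 for x < 0
    return q - sign          # adds 1 for negative x, giving truncation toward zero

def mod26(x):
    """Remainder as computed at 0x80485a5..0x80485a8: x - (x / 26) * 26."""
    return x - div26(x) * 26

for x in (8, 29, 34, -27):
    print(x, div26(x), mod26(x))
```

For the values that occur in this challenge x is always positive, so the sign correction is 0 and the sequence reduces to a plain `x % 26`.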
# Bytes written to the stack at ebp-0x32 (little-endian STle stores).
byte_stream = [0x74,0x31,0x6d,0x65,
               0x5f,0x79,0x30,0x75,
               0x5f,0x65,0x6e,0x6a,
               0x30,0x79,0x5f,0x77,
               0x61,0x35,0x74,0x31,
               0x6e,0x67,0x5f,0x77,
               0x61,0x35,0x5f,0x6e,
               0x6f,0x74,0x5f,0x77,
               0x61,0x35,0x74,0x65,
               0x64]
# table = [0x41...0x5a], the lookup table stored at ebp-0x4c
table = [0x41 + num for num in range(0,0x5a-0x41+1)]
for i,value in enumerate(byte_stream):
    if value + 3 <= 0x0000005a:
        # Branch at 0x8048554: store the shifted byte as is.
        byte_stream[i] = value + 3
    else:
        # Branch at 0x8048574: (value + 3 - 0x5a) mod 26, then table lookup.
        t74 = 3 + value - 0x5a
        mul1 = (t74 * 0x4ec4ec4f) >>32
        t20 = mul1 >> 3
        t27 = (value + 3 -0x5A) >> 0x1f
        mul2 = ((t20 - t27) * (0x1a)) & 0xFFFFFFFF
        sub = value + 3 - 0x5A - mul2
        byte_stream[i] = table[sub-1]
print(bytes(byte_stream))
# t20 = Sar32(from64HIto32(MullS32(from8to32(ch) + 0x00000003 - 0x5A,0x4ec4ec4f)),0x03)
# t27 = Sar32(from8to32(ch) + 0x00000003 - 0x5A,0x1f)
# load_ch(ebp + Sub32(from8to32(ch) + 0x00000003 - 0x5A,Mul32(Sub32(t20,t27),0x0000001a)) - 0x00000001 - 0x4C)
# 0x4ec4ec4f == ceil(2**35 / 26), so (x * 0x4ec4ec4f) >> 35 == x // 26 for non-negative 32-bit x
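Once the multiply-and-shift sequence is recognised as a division by 26, the script above can be condensed. The following is a sketch of an equivalent, shorter version, assuming the 37 input bytes and the A..Z table recovered earlier; `transform` is just an illustrative name. One caveat: when the remainder is 0, Python's `TABLE[-1]` wraps around to 'Z', whereas the original program would read the byte just below the table. That case does not occur for this input.

```python
CIPHER = b"t1me_y0u_enj0y_wa5t1ng_wa5_not_wa5ted"
SHIFT = 3                          # value stored at ebp-0x54
TABLE = bytes(range(0x41, 0x5B))   # 'A'..'Z', stored at ebp-0x4c

def transform(c):
    """Apply the per-byte transformation performed by the loop at 0x804853c."""
    shifted = c + SHIFT
    if shifted <= 0x5A:
        return shifted             # branch at 0x8048554
    # branch at 0x8048574: remainder modulo 26, then table[remainder - 1]
    return TABLE[(shifted - 0x5A) % 26 - 1]

result = bytes(transform(c) for c in CIPHER)
print(result.decode("ascii"))
```

Using `%` directly is only valid here because `shifted - 0x5A` is always positive in the second branch, so Python's floor-based modulo and the program's truncating one agree.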