Valgrind

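The challenge hands us the log printed by an angr.SimState object while it executes.
There is plenty of material online, though some of it is a bit messy, so I'll add a short introduction of my own to help everyone get started quickly. This is as far as I've explored for now; when I have time I'll read the source and dig deeper.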

Angr : A powerful and user-friendly binary analysis platform!

Some background

Setting up the angr environment: https://docs.angr.io/introductory-errata/install

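## On my WSL2 setup
git clone https://github.com/angr/angr-dev
cd angr-dev
./setup.sh -i -e angr
## If a package fails to download, use a proxy or fetch it manually; either works
## Next, configure ~/.bashrc
export WORKON_HOME=~/.virtualenvs
source $(find / -name "virtualenvwrapper.sh")
## The path can also be set by hand
## If angr cannot be imported inside the angr venv,
## run pip install -e angr inside that venv
## pip's -e flag is an "editable" install: the package is linked to the source checkout instead of being copied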

Let's go from simple to deep.
First, let's use angr to load a program.

In [1]: import monkeyhex,cle,pyvex

In [2]: ld = cle.Loader("/bin/true")

In [3]: data = ld.memory.load(ld.main_object.entry,0x100)

In [4]: data
Out[4]: b'1\xedI\x89\xd1^H\x89\xe2H\x83\xe4\xf0PTL\x8d\x05*2\x00\x00H\x8d\r\xb31\x00\x00H\x8d=\x1c\xff\xff\xff\xff\x15\xfeW \x00\xf4\x0f\x1fD\x00\x00H\x8d=\x99X \x00UH\x8d\x05\x91X \x00H9\xf8H\x89\xe5t\x19H\x8b\x05\xd2W \x00H\x85\xc0t\r]\xff\xe0f.\x0f\x1f\x84\x00\x00\x00\x00\x00]\xc3\x0f\x1f@\x00f.\x0f\x1f\x84\x00\x00\x00\x00\x00H\x8d=YX \x00H\x8d5RX \x00UH)\xfeH\x89\xe5H\xc1\xfe\x03H\x89\xf0H\xc1\xe8?H\x01\xc6H\xd1\xfet\x18H\x8b\x05\x99W \x00H\x85\xc0t\x0c]\xff\xe0f\x0f\x1f\x84\x00\x00\x00\x00\x00]\xc3\x0f\x1f@\x00f.\x0f\x1f\x84\x00\x00\x00\x00\x00\x80=1X \x00\x00u/H\x83=oW \x00\x00UH\x89\xe5t\x0cH\x8b=zW \x00\xe8M\xfe\xff\xff\xe8H\xff\xff\xff\xc6\x05\tX \x00\x01]\xc3\x0f\x1f\x80\x00\x00\x00\x00\xf3\xc3f\x0f\x1fD\x00\x00'

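Step one: loading. At this point the binary has been loaded into memory by cle. I'll leave the dynamic linker aside for now; it can of course be present, and without it some calls to imported functions would not land at the right address.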

In [5]: irsb = pyvex.lift(data,ld.main_object.entry,ld.main_object.arch)

In [6]: irsb
Out[6]: IRSB <0x2a bytes, 11 ins., <Arch AMD64 (LE)>> at 0x4017b0

In [7]: irsb.pp()
IRSB {
   t0:Ity_I32 t1:Ity_I32 t2:Ity_I32 t3:Ity_I64 t4:Ity_I64 t5:Ity_I64 t6:Ity_I64 t7:Ity_I64 t8:Ity_I64 t9:Ity_I64 t10:Ity_I64 t11:Ity_I64 t12:Ity_I64 t13:Ity_I64 t14:Ity_I64 t15:Ity_I32 t16:Ity_I64 t17:Ity_I64 t18:Ity_I64 t19:Ity_I64 t20:Ity_I32 t21:Ity_I64 t22:Ity_I32 t23:Ity_I64 t24:Ity_I64 t25:Ity_I64 t26:Ity_I64 t27:Ity_I64 t28:Ity_I64 t29:Ity_I64 t30:Ity_I64 t31:Ity_I64 t32:Ity_I64 t33:Ity_I64 t34:Ity_I64 t35:Ity_I64 t36:Ity_I64

   00 | ------ IMark(0x4017b0, 2, 0) ------
   01 | PUT(rbp) = 0x0000000000000000
   02 | ------ IMark(0x4017b2, 3, 0) ------
   03 | t26 = GET:I64(rdx)
   04 | PUT(r9) = t26
   05 | PUT(rip) = 0x00000000004017b5
   06 | ------ IMark(0x4017b5, 1, 0) ------
   07 | t4 = GET:I64(rsp)
   08 | t3 = LDle:I64(t4)
   09 | t27 = Add64(t4,0x0000000000000008)
   10 | PUT(rsi) = t3
   11 | ------ IMark(0x4017b6, 3, 0) ------
   12 | PUT(rdx) = t27
   13 | ------ IMark(0x4017b9, 4, 0) ------
   14 | t5 = And64(t27,0xfffffffffffffff0)
   15 | PUT(cc_op) = 0x0000000000000014
   16 | PUT(cc_dep1) = t5
   17 | PUT(cc_dep2) = 0x0000000000000000
   18 | PUT(rip) = 0x00000000004017bd
   19 | ------ IMark(0x4017bd, 1, 0) ------
   20 | t8 = GET:I64(rax)
   21 | t29 = Sub64(t5,0x0000000000000008)
   22 | PUT(rsp) = t29
   23 | STle(t29) = t8
   24 | PUT(rip) = 0x00000000004017be
   25 | ------ IMark(0x4017be, 1, 0) ------
   26 | t31 = Sub64(t29,0x0000000000000008)
   27 | PUT(rsp) = t31
   28 | STle(t31) = t29
   29 | ------ IMark(0x4017bf, 7, 0) ------
   30 | PUT(r8) = 0x00000000004049f0
   31 | ------ IMark(0x4017c6, 7, 0) ------
   32 | PUT(rcx) = 0x0000000000404980
   33 | ------ IMark(0x4017cd, 7, 0) ------
   34 | PUT(rdi) = 0x00000000004016f0
   35 | PUT(rip) = 0x00000000004017d4
   36 | ------ IMark(0x4017d4, 6, 0) ------
   37 | t17 = LDle:I64(0x0000000000606fd8)
   38 | t33 = Sub64(t31,0x0000000000000008)
   39 | PUT(rsp) = t33
   40 | STle(t33) = 0x00000000004017da
   41 | t35 = Sub64(t33,0x0000000000000080)
   42 | ====== AbiHint(0xt35, 128, t17) ======
   NEXT: PUT(rip) = t17; Ijk_Call
}

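We pass this chunk of data to the lift function to turn it into an irsb. Here I call the irsb an IR block: it is a basic block of VEX IR. To stay compatible with multiple architectures, angr executes these blocks during dynamic execution, and that execution involves yet another module.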

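That is why lift takes data, addr and arch. You can see that every IMark carries (address, instruction length, and a third value: the delta, which is 0 here and, as far as I know, only matters for encodings such as ARM Thumb).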
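As a small illustration of the IMark fields, the sketch below pulls the (address, length, delta) triples out of the printed text of a block. The helper name parse_imarks and the sample string are my own for illustration; they are not part of pyvex.

```python
import re

# Matches lines such as "00 | ------ IMark(0x4017b0, 2, 0) ------"
IMARK_RE = re.compile(r"IMark\((0x[0-9a-fA-F]+), (\d+), (\d+)\)")

def parse_imarks(text):
    """Return a list of (address, length, delta) tuples, one per IMark."""
    return [(int(a, 16), int(n), int(d)) for a, n, d in IMARK_RE.findall(text)]

# A few lines copied from the pp() output above.
sample = """\
   00 | ------ IMark(0x4017b0, 2, 0) ------
   01 | PUT(rbp) = 0x0000000000000000
   02 | ------ IMark(0x4017b2, 3, 0) ------
   05 | PUT(rip) = 0x00000000004017b5
   06 | ------ IMark(0x4017b5, 1, 0) ------
"""

marks = parse_imarks(sample)
for addr, length, delta in marks:
    print(hex(addr), length, delta)

# Consecutive instructions are contiguous: addr + length is the next addr.
assert all(a + n == b for (a, n, _), (b, _, _) in zip(marks, marks[1:]))
```

The final assertion is a cheap sanity check that the second field really is the instruction length in bytes.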

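There is also another kind of block, provided by the capstone module. I haven't dug into the lower layers, so I just use the Capstone that angr bundles. It is a disassembly module, yet angr does not seem to expose a direct interface to it; I would expect one that takes data (or addr and size) plus arch and prints the disassembly. I still need to look for that, so for now I'll go step by step.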
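For what it's worth, the standalone capstone Python bindings do offer that kind of interface: a Cs object built from an architecture and a mode, whose disasm(code, address) method yields instructions. A minimal sketch, assuming the capstone package is installed and that the bytes are x86-64 code; the helper name disassemble and the constant ENTRY_BYTES are mine.

```python
def disassemble(code, addr):
    """Disassemble raw x86-64 bytes with capstone, returning text lines."""
    # Imported lazily so this module still loads without capstone installed.
    from capstone import Cs, CS_ARCH_X86, CS_MODE_64

    md = Cs(CS_ARCH_X86, CS_MODE_64)
    return [
        "0x%x:\t%s\t%s" % (insn.address, insn.mnemonic, insn.op_str)
        for insn in md.disasm(code, addr)
    ]

# First five bytes of the entry point dumped earlier:
# 31 ed (xor ebp, ebp) and 49 89 d1 (mov r9, rdx).
ENTRY_BYTES = bytes.fromhex("31ed4989d1")
```

Calling disassemble(ENTRY_BYTES, 0x4017b0) should list those two instructions. Inside angr, the block's capstone attribute wraps the same library.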

In [39]: p = angr.Project("/bin/true")
WARNING | 2021-09-17 23:29:36,414 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.
In [40]: st = p.factory.blank_state(addr = ld.main_object.entry)
In [41]: st.addr
Out[41]: 0x4017b0
In [42]: bb = st.block()
In [43]: bb.pp()
0x4017b0:       xor     ebp, ebp
0x4017b2:       mov     r9, rdx
0x4017b5:       pop     rsi
0x4017b6:       mov     rdx, rsp
0x4017b9:       and     rsp, 0xfffffffffffffff0
0x4017bd:       push    rax
0x4017be:       push    rsp
0x4017bf:       lea     r8, [rip + 0x322a]
0x4017c6:       lea     rcx, [rip + 0x31b3]
0x4017cd:       lea     rdi, [rip - 0xe4]
0x4017d4:       call    qword ptr [rip + 0x2057fe]

Next come the disassembly and some attributes that can be used for parsing; here is a brief demonstration.

In [46]: bb.capstone
Out[46]: <DisassemblerBlock for 0x4017b0>

In [47]: bb.capstone.pp()
0x4017b0:       xor     ebp, ebp
0x4017b2:       mov     r9, rdx
0x4017b5:       pop     rsi
0x4017b6:       mov     rdx, rsp
0x4017b9:       and     rsp, 0xfffffffffffffff0
0x4017bd:       push    rax
0x4017be:       push    rsp
0x4017bf:       lea     r8, [rip + 0x322a]
0x4017c6:       lea     rcx, [rip + 0x31b3]
0x4017cd:       lea     rdi, [rip - 0xe4]
0x4017d4:       call    qword ptr [rip + 0x2057fe]

In [48]: bb.capstone.insns
Out[48]: 
[<DisassemblerInsn "xor" for 0x4017b0>,
 <DisassemblerInsn "mov" for 0x4017b2>,
 <DisassemblerInsn "pop" for 0x4017b5>,
 <DisassemblerInsn "mov" for 0x4017b6>,
 <DisassemblerInsn "and" for 0x4017b9>,
 <DisassemblerInsn "push" for 0x4017bd>,
 <DisassemblerInsn "push" for 0x4017be>,
 <DisassemblerInsn "lea" for 0x4017bf>,
 <DisassemblerInsn "lea" for 0x4017c6>,
 <DisassemblerInsn "lea" for 0x4017cd>,
 <DisassemblerInsn "call" for 0x4017d4>]

In [49]: bb.capstone.insns[0].op_str
Out[49]: 'ebp, ebp'

In [50]: bb.capstone.insns[0].insn
Out[50]: <CsInsn 0x4017b0 [31ed]: xor ebp, ebp>

In [51]: bb.capstone.insns[0].size
Out[51]: 0x2

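That covers a small part of the static-analysis modules.

Now we can get straight to the challenge, lol.

Back to the title: the challenge asks us to analyze a pile of IRSBs, which I'll just call SBs (master of naming). We can see fairly clearly that each block ends with a jump target, possibly with a jump condition. The head of each block declares the temporaries (t0, t1, ...) together with their types; they behave roughly like virtual registers.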

IRSB {
   t0:Ity_I32 t1:Ity_I32 t2:Ity_I32 t3:Ity_I32 t4:Ity_I32 t5:Ity_I32 t6:Ity_I32

   00 | ------ IMark(0x8048464, 3, 0) ------
   01 | t2 = GET:I32(esp)
   02 | t0 = Sub32(t2,0x00000054)
   03 | PUT(cc_op) = 0x00000006
   04 | PUT(cc_dep1) = t2
   05 | PUT(cc_dep2) = 0x00000054
   06 | PUT(cc_ndep) = 0x00000000
   07 | PUT(eip) = 0x08048467
   08 | ------ IMark(0x8048467, 5, 0) ------
   09 | t4 = Sub32(t0,0x00000004)
   10 | PUT(esp) = t4
   11 | STle(t4) = 0x0804846c
   NEXT: PUT(eip) = 0x080485f1; Ijk_Call
}

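Each IMark section in the middle is the VEX IR translated from one assembly instruction. That said, a single section is not strictly equivalent to its instruction (temporaries are shared across the block), so we should read the IR, or the assembly, over the whole block to understand what it is doing.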
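To read a block as a whole, the first thing I look for is where it can go next. Here is a rough sketch that extracts the conditional exits and the default NEXT exit from the printed text of one IRSB; the function name block_exits and the regular expressions are my own guesses at the log format shown here, not an angr API.

```python
import re

# "39 | if (t25) { PUT(eip) = 0x8048554; Ijk_Boring }"
COND_EXIT_RE = re.compile(
    r"if \((t\d+)\) \{ PUT\(\w+\) = (0x[0-9a-fA-F]+); (Ijk_\w+) \}"
)
# "NEXT: PUT(eip) = 0x08048574; Ijk_Boring" (the target may also be a temp)
NEXT_RE = re.compile(r"NEXT: PUT\(\w+\) = (\w+); (Ijk_\w+)")

def block_exits(text):
    """Return (conditional_exits, default_exit) for one printed IRSB."""
    cond = [
        (guard, int(dst, 16), kind)
        for guard, dst, kind in COND_EXIT_RE.findall(text)
    ]
    m = NEXT_RE.search(text)
    default = (m.group(1), m.group(2)) if m else None
    return cond, default

# Tail of the block at 0x8048552 from the challenge log.
sample = """\
   34 | t35 = CmpLE32S(t7,0x0000005a)
   39 | if (t25) { PUT(eip) = 0x8048554; Ijk_Boring }
   NEXT: PUT(eip) = 0x08048574; Ijk_Boring
"""

cond, default = block_exits(sample)
print(cond)
print(default)
```

The default target is kept as a string because a return ends with something like PUT(eip) = t9, where the target is a temporary rather than a constant.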

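There are a few comments on the VEX IR statements, which are still worth a look.

How is this listing produced? Look back at the lift call we used earlier. lift is of course built per architecture; that is, each architecture supplies its own lifter behind the lift function.

The challenge log is long, about 4k lines, which does not lend itself to linear analysis. So we can write a short Python script to extract what we need.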

import re
import IPython

# Read the whole challenge log into memory.
with open("./oldlog.log", "r") as file1:
    data = file1.read()

# Every instruction in the log starts with an IMark(addr, len, delta) line.
target = r"IMark\(([\w]*), \d, \d\)"

res = re.findall(target, data)

# Deduplicate the addresses and sort them numerically.
addr_int_list = [int(x[2:], 16) for x in set(res)]
addr_int_list.sort()
addr_str_list = [hex(x) for x in addr_int_list]

addr_code = {}

for addr in addr_str_list:
    # target1: body up to the closing brace of the IRSB.
    # target2: body up to the next IMark inside the same IRSB.
    target1 = r'------ IMark\(%s, \d{1,3}, \d{1,3}\) ------\n([\w\W]*?)\n\}' % addr
    target2 = r'------ IMark\(%s, \d{1,3}, \d{1,3}\) ------\n([\w\W]*?)   \d{1,3} \| ------ IMark' % addr
    code1 = re.search(target1, data)
    code2 = re.search(target2, data)

    # Keep whichever match is shorter, i.e. the one that stops first.
    if code1 is not None and code2 is not None:
        addr_code[addr] = code1.group(1) if len(code1.group(1)) < len(code2.group(1)) else code2.group(1)
    elif code1 is not None:
        addr_code[addr] = code1.group(1)
    elif code2 is not None:
        addr_code[addr] = code2.group(1)

with open("./newlog.log", "w") as file2:
    for addr in addr_str_list:
        # Debug output for two addresses of interest.
        if addr in ('0x8048554', '0x8048552'):
            print(addr + '\n')
            print(addr_code[addr])
        file2.write(addr + '\n')
        if addr_code[addr] != '' and addr_code[addr][-1] != '\n':
            file2.write(addr_code[addr] + '\n')
        else:
            file2.write(addr_code[addr])

# Drop into an interactive shell to inspect the results.
IPython.embed()

This yields the following set of irsb fragments. If you don't want to read them, feel free to skip ahead to the part below, lol.

0x8048464
   01 | t2 = GET:I32(esp)
   02 | t0 = Sub32(t2,0x00000054)
   03 | PUT(cc_op) = 0x00000006
   04 | PUT(cc_dep1) = t2
   05 | PUT(cc_dep2) = 0x00000054
   06 | PUT(cc_ndep) = 0x00000000
   07 | PUT(eip) = 0x08048467
0x8048467
   09 | t4 = Sub32(t0,0x00000004)
   10 | PUT(esp) = t4
   11 | STle(t4) = 0x0804846c   # Write a value to memory
   NEXT: PUT(eip) = 0x080485f1; Ijk_Call
0x804846c
   01 | t0 = GET:I32(eax)
   02 | t2 = Add32(t0,0x00001b94)
   03 | PUT(cc_op) = 0x00000003
   04 | PUT(cc_dep1) = t0
   05 | PUT(cc_dep2) = 0x00001b94
   06 | PUT(cc_ndep) = 0x00000000
   07 | PUT(eax) = t2
   08 | PUT(eip) = 0x08048471
0x8048471
   10 | t51 = GET:I16(gs)
   11 | t50 = 16Uto32(t51)
   12 | t5 = GET:I64(ldt)
   13 | t6 = GET:I64(gdt)
   14 | t52 = x86g_use_seg_selector(t5,t6,t50,0x00000014):Ity_I64
   15 | t54 = 64HIto32(t52)
   16 | t53 = CmpNE32(t54,0x00000000)
   17 | if (t53) { PUT(eip) = 0x8048471; Ijk_MapFail }
   18 | t55 = 64to32(t52)
   19 | t56 = LDle:I32(t55)
   20 | PUT(eip) = 0x08048477
0x8048477
   22 | t58 = GET:I32(ebp)
   23 | t57 = Add32(t58,0xfffffff4)
   24 | STle(t57) = t56
0x804847a
   26 | PUT(cc_op) = 0x0000000f
   27 | PUT(cc_dep1) = 0x00000000
   28 | PUT(cc_dep2) = 0x00000000
   29 | PUT(cc_ndep) = 0x00000000
   30 | PUT(eax) = 0x00000000
   31 | PUT(eip) = 0x0804847c
0x804847c
   33 | t60 = Add32(t58,0xffffffce)
   34 | STle(t60) = 0x656d3174
   35 | PUT(eip) = 0x08048483
0x8048483
   37 | t62 = Add32(t58,0xffffffd2)
   38 | STle(t62) = 0x7530795f
   39 | PUT(eip) = 0x0804848a
0x804848a
   41 | t64 = Add32(t58,0xffffffd6)
   42 | STle(t64) = 0x6a6e655f
   43 | PUT(eip) = 0x08048491
0x8048491
   45 | t66 = Add32(t58,0xffffffda)
   46 | STle(t66) = 0x775f7930
   47 | PUT(eip) = 0x08048498
0x8048498
   49 | t68 = Add32(t58,0xffffffde)
   50 | STle(t68) = 0x31743561
   51 | PUT(eip) = 0x0804849f
0x804849f
   53 | t70 = Add32(t58,0xffffffe2)
   54 | STle(t70) = 0x775f676e
   55 | PUT(eip) = 0x080484a6
0x80484a6
   57 | t72 = Add32(t58,0xffffffe6)
   58 | STle(t72) = 0x6e5f3561
   59 | PUT(eip) = 0x080484ad
0x80484ad
   61 | t74 = Add32(t58,0xffffffea)
   62 | STle(t74) = 0x775f746f
   63 | PUT(eip) = 0x080484b4
0x80484b4
   65 | t76 = Add32(t58,0xffffffee)
   66 | STle(t76) = 0x65743561
   67 | PUT(eip) = 0x080484bb
0x80484bb
   69 | t78 = Add32(t58,0xfffffff2)
   70 | STle(t78) = 0x0064
   71 | PUT(eip) = 0x080484c1
0x80484c1
   73 | t80 = Add32(t58,0xffffffac)
   74 | STle(t80) = 0x00000003
   75 | PUT(eip) = 0x080484c8
0x80484c8
   77 | t82 = Add32(t58,0xffffffb4)
   78 | STle(t82) = 0x41
   79 | PUT(eip) = 0x080484cc
0x80484cc
   81 | t84 = Add32(t58,0xffffffb5)
   82 | STle(t84) = 0x42
   83 | PUT(eip) = 0x080484d0
0x80484d0
   85 | t86 = Add32(t58,0xffffffb6)
   86 | STle(t86) = 0x43
   87 | PUT(eip) = 0x080484d4
0x80484d4
   89 | t88 = Add32(t58,0xffffffb7)
   90 | STle(t88) = 0x44
   91 | PUT(eip) = 0x080484d8
0x80484d8
   93 | t90 = Add32(t58,0xffffffb8)
   94 | STle(t90) = 0x45
   95 | PUT(eip) = 0x080484dc
0x80484dc
   97 | t92 = Add32(t58,0xffffffb9)
   98 | STle(t92) = 0x46
   99 | PUT(eip) = 0x080484e0
0x80484e0
   101 | t94 = Add32(t58,0xffffffba)
   102 | STle(t94) = 0x47
   103 | PUT(eip) = 0x080484e4
0x80484e4
   105 | t96 = Add32(t58,0xffffffbb)
   106 | STle(t96) = 0x48
   107 | PUT(eip) = 0x080484e8
0x80484e8
   109 | t98 = Add32(t58,0xffffffbc)
   110 | STle(t98) = 0x49
   111 | PUT(eip) = 0x080484ec
0x80484ec
   113 | t100 = Add32(t58,0xffffffbd)
   114 | STle(t100) = 0x4a
   115 | PUT(eip) = 0x080484f0
0x80484f0
   117 | t102 = Add32(t58,0xffffffbe)
   118 | STle(t102) = 0x4b
   119 | PUT(eip) = 0x080484f4
0x80484f4
   121 | t104 = Add32(t58,0xffffffbf)
   122 | STle(t104) = 0x4c
   123 | PUT(eip) = 0x080484f8
0x80484f8
   125 | t106 = Add32(t58,0xffffffc0)
   126 | STle(t106) = 0x4d
   127 | PUT(eip) = 0x080484fc
0x80484fc
   129 | t108 = Add32(t58,0xffffffc1)
   130 | STle(t108) = 0x4e
   131 | PUT(eip) = 0x08048500
0x8048500
   133 | t110 = Add32(t58,0xffffffc2)
   134 | STle(t110) = 0x4f
   135 | PUT(eip) = 0x08048504
0x8048504
   137 | t112 = Add32(t58,0xffffffc3)
   138 | STle(t112) = 0x50
   139 | PUT(eip) = 0x08048508
0x8048508
   141 | t114 = Add32(t58,0xffffffc4)
   142 | STle(t114) = 0x51
   143 | PUT(eip) = 0x0804850c
0x804850c
   145 | t116 = Add32(t58,0xffffffc5)
   146 | STle(t116) = 0x52
   147 | PUT(eip) = 0x08048510
0x8048510
   149 | t118 = Add32(t58,0xffffffc6)
   150 | STle(t118) = 0x53
   151 | PUT(eip) = 0x08048514
0x8048514
   153 | t120 = Add32(t58,0xffffffc7)
   154 | STle(t120) = 0x54
   155 | PUT(eip) = 0x08048518
0x8048518
   157 | t122 = Add32(t58,0xffffffc8)
   158 | STle(t122) = 0x55
   159 | PUT(eip) = 0x0804851c
0x804851c
   161 | t124 = Add32(t58,0xffffffc9)
   162 | STle(t124) = 0x56
   163 | PUT(eip) = 0x08048520
0x8048520
   165 | t126 = Add32(t58,0xffffffca)
   166 | STle(t126) = 0x57
   167 | PUT(eip) = 0x08048524
0x8048524
   169 | t128 = Add32(t58,0xffffffcb)
   170 | STle(t128) = 0x58
   171 | PUT(eip) = 0x08048528
0x8048528
   173 | t130 = Add32(t58,0xffffffcc)
   174 | STle(t130) = 0x59
   175 | PUT(eip) = 0x0804852c
0x804852c
   177 | t132 = Add32(t58,0xffffffcd)
   178 | STle(t132) = 0x5a
   179 | PUT(eip) = 0x08048530
0x8048530
   181 | t134 = Add32(t58,0xffffffa8)
   182 | STle(t134) = 0x00000000
0x8048537
   NEXT: PUT(eip) = 0x080485c8; Ijk_Boring
0x804853c
   01 | t14 = GET:I32(ebp)
   02 | t13 = Add32(t14,0xffffffce)
   03 | PUT(eip) = 0x0804853f
0x804853f
   05 | t15 = Add32(t14,0xffffffa8)
   06 | t17 = LDle:I32(t15)
0x8048542
   08 | t2 = Add32(t17,t13)
   09 | PUT(eip) = 0x08048544
0x8048544
   11 | t19 = LDle:I8(t2)
   12 | t32 = 8Uto32(t19)
   13 | t18 = t32
   14 | PUT(eax) = t18
0x8048547
   16 | t21 = GET:I8(al)
   17 | t33 = 8Sto32(t21)
   18 | t20 = t33
   19 | PUT(edx) = t20
   20 | PUT(eip) = 0x0804854a
0x804854a
   22 | t22 = Add32(t14,0xffffffac)
   23 | t24 = LDle:I32(t22)
0x804854d
   25 | t7 = Add32(t24,t20)
   26 | PUT(eax) = t7
0x804854f
   28 | PUT(cc_op) = 0x00000006
   29 | PUT(cc_dep1) = t7
   30 | PUT(cc_dep2) = 0x0000005a
   31 | PUT(cc_ndep) = 0x00000000
   32 | PUT(eip) = 0x08048552
0x8048552
   34 | t35 = CmpLE32S(t7,0x0000005a)
   35 | t34 = 1Uto32(t35)
   36 | t30 = t34
   37 | t36 = 32to1(t30)
   38 | t25 = t36
   39 | if (t25) { PUT(eip) = 0x8048554; Ijk_Boring }
   NEXT: PUT(eip) = 0x08048574; Ijk_Boring
0x8048554
   01 | t17 = GET:I32(ebp)
   02 | t16 = Add32(t17,0xffffffce)
   03 | PUT(eip) = 0x08048557
0x8048557
   05 | t18 = Add32(t17,0xffffffa8)
   06 | t20 = LDle:I32(t18)
0x804855a
   08 | t2 = Add32(t20,t16)
   09 | PUT(eip) = 0x0804855c
0x804855c
   11 | t22 = LDle:I8(t2)
   12 | t21 = 8Uto32(t22)
0x804855f
   14 | PUT(eip) = 0x08048561
0x8048561
   16 | t24 = Add32(t17,0xffffffac)
   17 | t26 = LDle:I32(t24)
0x8048564
   19 | t7 = Add32(t26,t21)
0x8048566
   21 | PUT(ecx) = t7
0x8048568
   23 | t28 = Add32(t17,0xffffffce)
   24 | PUT(edx) = t28
   25 | PUT(eip) = 0x0804856b
0x804856b
   27 | t30 = Add32(t17,0xffffffa8)
   28 | t32 = LDle:I32(t30)
0x804856e
   30 | t12 = Add32(t32,t28)
   31 | PUT(cc_op) = 0x00000003
   32 | PUT(cc_dep1) = t32
   33 | PUT(cc_dep2) = t28
   34 | PUT(cc_ndep) = 0x00000000
   35 | PUT(eax) = t12
   36 | PUT(eip) = 0x08048570
0x8048570
   38 | t33 = GET:I8(cl)
   39 | STle(t12) = t33
0x8048572
   NEXT: PUT(eip) = 0x080485c4; Ijk_Boring
0x8048574
   01 | t63 = GET:I32(ebp)
   02 | t62 = Add32(t63,0xffffffce)
   03 | PUT(eip) = 0x08048577
0x8048577
   05 | t64 = Add32(t63,0xffffffa8)
   06 | t66 = LDle:I32(t64)
0x804857a
   08 | t2 = Add32(t66,t62)
   09 | PUT(eip) = 0x0804857c
0x804857c
   11 | t68 = LDle:I8(t2)
   12 | t140 = 8Uto32(t68)
   13 | t67 = t140
   14 | PUT(eax) = t67
0x804857f
   16 | t70 = GET:I8(al)
   17 | t141 = 8Sto32(t70)
   18 | t69 = t141
   19 | PUT(eip) = 0x08048582
0x8048582
   21 | t71 = Add32(t63,0xffffffac)
   22 | t73 = LDle:I32(t71)
0x8048585
   24 | t7 = Add32(t73,t69)
0x8048587
   26 | t74 = Add32(t7,0xffffffa6)
0x804858a
0x804858f
0x8048591
   30 | t14 = MullS32(t74,0x4ec4ec4f)
   31 | t142 = 64HIto32(t14)
   32 | t77 = t142
0x8048593
   34 | t20 = Sar32(t77,0x03)
0x8048596
0x8048598
   37 | t27 = Sar32(t74,0x1f)
0x804859b
   39 | t31 = Sub32(t20,t27)
0x804859d
   41 | PUT(eip) = 0x0804859f
0x804859f
   43 | t103 = Add32(t63,0xffffffb0)
   44 | STle(t103) = t31
   45 | PUT(eip) = 0x080485a2
0x80485a2
   47 | t106 = Add32(t63,0xffffffb0)
   48 | t108 = LDle:I32(t106)
0x80485a5
   50 | t38 = Mul32(t108,0x0000001a)
0x80485a8
   52 | t39 = Sub32(t74,t38)
0x80485aa
   54 | PUT(eip) = 0x080485ac
0x80485ac
   56 | t110 = Add32(t63,0xffffffb0)
   57 | STle(t110) = t39
   58 | PUT(eip) = 0x080485af
0x80485af
   60 | t113 = Add32(t63,0xffffffb0)
   61 | t115 = LDle:I32(t113)
0x80485b2
   63 | t44 = Sub32(t115,0x00000001)
   64 | PUT(eip) = 0x080485b5
0x80485b5
   66 | t117 = Add32(t63,t44)
   67 | t116 = Add32(t117,0xffffffb4)
   68 | t122 = LDle:I8(t116)
   69 | t143 = 8Uto32(t122)
   70 | t121 = t143
   71 | PUT(eax) = t121
0x80485ba
   73 | t123 = Add32(t63,0xffffffce)
   74 | PUT(ecx) = t123
   75 | PUT(eip) = 0x080485bd
0x80485bd
   77 | t125 = Add32(t63,0xffffffa8)
   78 | t127 = LDle:I32(t125)
0x80485c0
   80 | t50 = Add32(t127,t123)
   81 | PUT(edx) = t50
   82 | PUT(eip) = 0x080485c2
0x80485c2
   84 | t128 = GET:I8(al)
   85 | STle(t50) = t128
   86 | PUT(eip) = 0x080485c4
0x80485c4
   88 | t129 = Add32(t63,0xffffffa8)
   89 | t56 = LDle:I32(t129)
   90 | t54 = Add32(t56,0x00000001)
   91 | STle(t129) = t54
   92 | PUT(eip) = 0x080485c8

0x80485c8
   01 | t5 = GET:I32(ebp)
   02 | t4 = Add32(t5,0xffffffa8)
   03 | t2 = LDle:I32(t4)           #A load from memory
   04 | PUT(cc_op) = 0x00000006
   05 | PUT(cc_dep1) = t2
   06 | PUT(cc_dep2) = 0x00000024
   07 | PUT(cc_ndep) = 0x00000000
   08 | PUT(eip) = 0x080485cc
0x80485cc
   10 | t14 = CmpLE32S(t2,0x00000024)
   11 | t13 = 1Uto32(t14)
   12 | t11 = t13
   13 | t15 = 32to1(t11)
   14 | t6 = t15
   15 | if (t6) { PUT(eip) = 0x804853c; Ijk_Boring }
   NEXT: PUT(eip) = 0x080485d2; Ijk_Boring
0x80485d2
   01 | PUT(eax) = 0x00000000
   02 | PUT(eip) = 0x080485d7
0x80485d7
   04 | t10 = GET:I32(ebp)
   05 | t9 = Add32(t10,0xfffffff4)
   06 | t11 = LDle:I32(t9)
   07 | PUT(ecx) = t11
   08 | PUT(eip) = 0x080485da
0x80485da
   10 | t13 = GET:I16(gs)
   11 | t25 = 16Uto32(t13)
   12 | t12 = t25
   13 | t5 = GET:I64(ldt)
   14 | t6 = GET:I64(gdt)
   15 | t26 = x86g_use_seg_selector(t5,t6,t12,0x00000014):Ity_I64
   16 | t14 = t26
   17 | t27 = 64HIto32(t14)
   18 | t16 = t27
   19 | t15 = CmpNE32(t16,0x00000000)
   20 | if (t15) { PUT(eip) = 0x80485da; Ijk_MapFail }
   21 | t28 = 64to32(t14)
   22 | t17 = t28
   23 | t2 = LDle:I32(t17)
   24 | t1 = Xor32(t11,t2)
   25 | PUT(cc_op) = 0x0000000f
   26 | PUT(cc_dep1) = t1
   27 | PUT(cc_dep2) = 0x00000000
   28 | PUT(cc_ndep) = 0x00000000
   29 | PUT(ecx) = t1
   30 | PUT(eip) = 0x080485e1
0x80485e1
   32 | t30 = CmpEQ32(t1,0x00000000)
   33 | t29 = 1Uto32(t30)
   34 | t23 = t29
   35 | t31 = 32to1(t23)
   36 | t18 = t31
   37 | if (t18) { PUT(eip) = 0x80485e8; Ijk_Boring }
   NEXT: PUT(eip) = 0x080485e3; Ijk_Boring
0x80485e8
   01 | t2 = GET:I32(esp)
   02 | t0 = Add32(t2,0x00000054)
   03 | PUT(cc_op) = 0x00000003
   04 | PUT(cc_dep1) = t2
   05 | PUT(cc_dep2) = 0x00000054
   06 | PUT(cc_ndep) = 0x00000000
   07 | PUT(esp) = t0
   08 | PUT(eip) = 0x080485eb
0x80485eb
   10 | t3 = LDle:I32(t0)
   11 | t10 = Add32(t0,0x00000004)
   12 | PUT(esp) = t10
   13 | PUT(ecx) = t3
   14 | PUT(eip) = 0x080485ec
0x80485ec
   16 | t5 = LDle:I32(t10)
   17 | PUT(ebp) = t5
0x80485ed
   19 | t12 = Add32(t3,0xfffffffc)
   20 | PUT(esp) = t12
   21 | PUT(eip) = 0x080485f0
0x80485f0
   23 | t9 = LDle:I32(t12)
   24 | t14 = Add32(t12,0x00000004)
   25 | PUT(esp) = t14
   NEXT: PUT(eip) = t9; Ijk_Ret
0x80485f1
   01 | t0 = GET:I32(esp)
   02 | t3 = LDle:I32(t0)
   03 | PUT(eax) = t3
   04 | PUT(eip) = 0x080485f4
0x80485f4
   06 | t2 = LDle:I32(t0)
   07 | t4 = Add32(t0,0x00000004)
   08 | PUT(esp) = t4
   NEXT: PUT(eip) = t2; Ijk_Ret

Based on the semantics, we can work out what each block does and where it jumps.
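As one example of reading the semantics: the run of 32-bit STle stores starting at ebp-0x32 writes a string in little-endian order. A small sketch (the constants are copied from the log above; the variable names are mine) that turns them back into bytes:

```python
import struct

# Constants stored at ebp-0x32, ebp-0x2e, ... in the log (32-bit STle each).
dwords = [
    0x656d3174, 0x7530795f, 0x6a6e655f, 0x775f7930, 0x31743561,
    0x775f676e, 0x6e5f3561, 0x775f746f, 0x65743561,
]
# The last store, at ebp-0x0e, is only 16 bits wide: 0x0064 -> b"d\x00".
tail = 0x0064

# "<I" / "<H" pack little-endian, matching the "le" in STle.
raw = b"".join(struct.pack("<I", d) for d in dwords) + struct.pack("<H", tail)
text = raw.rstrip(b"\x00")
print(text)
```

The 37 bytes recovered this way are the input that the loop later transforms in place.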

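Of course, we can also write the assembly back by hand; after all, it is only about 110 lines of assembly.

In the end, we are really just doing the parsing logic by hand.
Below is my pseudocode, condensed to about 110 lines. Reading the irsb directly works too; this kind of manual disassembly is honestly rather time-consuming.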

0x8048464   sub esp,0x54
0x8048467   mov [esp],0x0804846c -> call 0x80485f1 # push the return address, then CALL
0x804846c   add eax,0x1b94
0x8048471   
0x8048477   mov [ebp-0x0C],[x86g_use_seg_selector(t5,t6,t50,0x00000014)]
0x804847a   mov eax,0
0x804847c   mov.d [ebp-0x32],0x656d3174
0x8048483   mov.d [ebp-0x2E],0x7530795f
0x804848a   mov.d [ebp-0x2A],0x6a6e655f
...
0x80484b4   mov.d [ebp-0x12],0x65743561
0x80484bb   mov.w [ebp-0x0E],0x0064
0x80484c1   mov.d [ebp-0x54],0x00000003
0x80484c8   mov.b [ebp-0x4C],0x41
0x80484cc   mov.b [ebp-0x4B],0x42
...
0x804852c   mov.b [ebp-0x33],0x5a
0x8048530   mov.d [ebp-0x58],0x00000000
0x8048537   jmp 0x80485c8-----------------------------|


0x804853c   lea edx,[ebp-0x32]<-------|   # add a2,[ebp-0x58],ebp-0x32
0x804853f   mov.d eax,[ebp-0x58]
0x8048542   add eax,edx
0x8048544   mov.b eax,[eax]
0x8048547   mov edx,al                  # get a byte
0x804854a   mov eax,[ebp-0x54]          # every character is shifted by 3
0x804854d   add eax,edx
0x804854f   cmp eax,0x0000005a
0x8048552   jle 0x8048554---------------| jmp  0x8048574            

0x8048554   lea edx,[ebp-0x32]                            
0x8048557   mov eax,[ebp-0x58]
0x804855a   add edx,eax
0x804855c   mov.bb al,[edx]
0x804855f                               # shift by 3
0x8048561   mov t26,[ebp-0x54]
0x8048564   add t7,t26,eax
0x8048566   mov ecx,t7
0x8048568   lea edx,[ebp-0x32]
0x804856b   
0x804856e   mov eax,(edx+[ebp-0x58])
0x8048570   mov [eax],cl
0x8048572   jmp 0x80485c4

                                        |
                                        |
0x8048574   lea edx,[ebp-0x32]<---------|
0x8048577   mov ecx,[ebp-0x58]
0x804857a   add ecx,edx
0x804857c   mov.b ecx,[ecx]
0x804857f   mov edx,al                  #(al+3-0x5A)*0x4ec4ec4f>>3
0x8048582   mov ecx,[ebp-0x54]
0x8048585   add ecx,edx
0x8048587   sub ecx,0x5A
0x8048591   mul 0x4ec4ec4f              # ecx = (I32high)(rcx*0x4ec4ec4f)
0x8048593   shr ecx,0x3
0x8048596   mov edx,[ebp-0x54]
0x8048598   shr edx,0x1f
0x804859b   sub edx,eax
0x804859d   jmp 0x0804859f              
0x804859f   [ebp-0x50],edx
0x80485a2   mov eax,[ebp-0x50]          #eax=eax*0x0000001a
0x80485a5   mul 0x0000001a
0x80485a8   sub ecx,eax
0x80485aa   nop
0x80485ac   mov [ebp-0x50],ecx
0x80485af   mov ecx,[ebp-0x50]
0x80485b2   sub ecx,0x00000001
0x80485b5   mov.b eax,[ebp+ecx-0x4C]  # table lookup indexed by ecx
0x80485ba   lea ecx,[ebp-0x32]
0x80485bd   mov t127,[ebp-0x58]
0x80485c0   mov edx,t127+ecx
0x80485c2   mov.b [edx],eax
0x80485c4   add [ebp-0x58],0x1---------|
0x80485c8   cmp [ebp-0x58],0x24------------|    |
0x80485cc   jle 0x804853c ----------------------|

0x80485d2   mov eax,0x0
0x80485d7   mov ecx,[ebp-0x0C]
0x80485da   xor ecx,[x86g_use_seg_selector(t5,t6,t12,0x00000014)]
0x80485e1   cmp ecx,0x00;   je 0x80485e8;jmp out
0x80485e8   add esp,0x54
0x80485eb   mov ecx,[esp] ; add esp,0x04
0x80485ec   mov ebp,[esp]
0x80485ed   
0x80485f0   ret
...

Once we know what each stack slot is used for, the program logic turns out to be a shift followed by a modulo.
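The stack offsets in the pseudocode come from reading constants such as 0xffffffce as signed 32-bit values. A minimal sketch of that conversion follows; the helper to_signed32 is mine, and the roles attached to each slot are my own reading of the log rather than anything the log states.

```python
def to_signed32(value):
    """Interpret a 32-bit constant as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

# Add32(ebp, const) constants seen in the log, with the role I assigned
# to each slot (an interpretation, not something the log spells out).
slots = {
    0xffffffa8: "loop counter",
    0xffffffac: "shift amount (3)",
    0xffffffb0: "scratch value for the modulo",
    0xffffffb4: "table 0x41..0x5a ('A'..'Z')",
    0xffffffce: "input string",
    0xfffffff4: "value loaded through gs:0x14",
}

for const, role in sorted(slots.items()):
    off = to_signed32(const)
    print("ebp-0x%02x  %s" % (-off, role))
```

Sorting by the raw constant lists the slots from the lowest address upward, which is roughly how the frame is laid out.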

Decryption script

byte_stream = [0x74,0x31,0x6d,0x65,
                0x5f,0x79,0x30,0x75,
                0x5f,0x65,0x6e,0x6a,
                0x30,0x79,0x5f,0x77,
                0x61,0x35,0x74,0x31,
                0x6e,0x67,0x5f,0x77,
                0x61,0x35,0x5f,0x6e,
                0x6f,0x74,0x5f,0x77,
                0x61,0x35,0x74,0x65,
                0x64]
# table = [0x41...0x5a]
table = [0x41 + num for num in range(0,0x5a-0x41+1)]

for i,value in enumerate(byte_stream):
    if value + 3 <= 0x0000005a:
        byte_stream[i] = value + 3
    else:
        t74 = 3 + value - 0x5a
        mul1 = (t74 * 0x4ec4ec4f) >>32
        t20 = mul1 >> 3
        t27 = (value + 3 -0x5A) >> 0x1f
        mul2 = ((t20 - t27) * (0x1a)) & 0xFFFFFFFF
        sub = value + 3 - 0x5A - mul2
        byte_stream[i] = table[sub-1]

print(bytes(byte_stream))

# t20 = Sar32(from64HIto32(MullS32(from8to32(ch) + 0x00000003 - 0x5A,0x4ec4ec4f)),0x03)
# t27 = Sar32(from8to32(ch) + 0x00000003 - 0x5A,0x1f)
# load_ch(ebp + Sub32(from8to32(ch) + 0x00000003 - 0x5A,Mul32(Sub32(t20,t27),0x0000001a)) - 0x00000001 - 0x4C)

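# 0x4ec4ec4f == ceil(2**35 / 26): taking the high 32 bits of the product and
# shifting right by 3 (35 bits in total) is a division by 26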
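With that in mind, the else branch of the script above is just a remainder modulo 26. Below is a condensed sketch of the same transformation, written under that assumption; the names MAGIC and encode are mine, and the negative-index behaviour is inherited from the Python script above rather than checked against the binary.

```python
MAGIC = 0x4ec4ec4f  # ceil(2**35 / 26)

# Multiply-high (>> 32) followed by Sar32(..., 3) is a total shift of 35 bits.
assert all((x * MAGIC) >> 35 == x // 26 for x in range(100000))

def encode(data, shift=3):
    """Apply the loop's transformation to a bytes object."""
    table = bytes(range(0x41, 0x5b))  # b"ABC...XYZ"
    out = bytearray()
    for value in data:
        shifted = value + shift
        if shifted <= 0x5a:
            out.append(shifted)
        else:
            # A remainder of 0 gives index -1, which Python maps to 'Z';
            # this matches the script above and does not occur for this input.
            out.append(table[(shifted - 0x5a) % 26 - 1])
    return bytes(out)

result = encode(b"t1me_y0u_enj0y_wa5t1ng_wa5_not_wa5ted")
print(result)
```

The assertion only checks the magic-number identity on a small range of non-negative values, which is all this input needs.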