MidnightSunCTF 2024 07u4 (Reversing) Writeups
07u4 - MidnightSunCTF 2024 Quals Writeup
Introduction
This weekend I participated with idek and got 6th in MSun Quals. Unfortunately I didn’t capture any flag because the team is amazing (JoshL op), but I solved the 07u4 reversing challenge while the CTF was running for myself. It was an interesting auto analysis challenge which gave different types of binaries from the instance, and you had to find the solution for them. I’ve wrote a kinda similar challenge for HTB, ARMs Race (shameless plug), and I’ve solved a few, so it was a nice challenge to tackle. Before starting, I need to mention that the author of the challenge is quend, who tragically lost her life a few weeks before the CTF. This writeup is in honor of her.
Playing around
Unfortunately the instance is down as of the time of writing this writeup, So I can’t show the interaction with it. But basically it had 2 options, a help and a play. The help option just stated that we will get 25 levels, and each binary will be sent to us gzip compressed. When connecting, we got this sort of message and prompt.
1BIN 01/25:
2<gzip compressed binary in hex here>
3ANSWER:
4<send answer here>
So I just wrote a simple template with pwntools for communication.
1from pwn import *
2import gzip
3
4def start_game():
5 io.recvuntil(b'Play\n')
6 io.sendline(b'2')
7
8def get_level():
9 io.recvuntil(b'BIN ')
10 lvl_idx = io.recv(2).decode()
11 io.recvuntil(b'25: \n')
12 level = gzip.decompress(bytes.fromhex(io.recvline().rstrip().decode()))
13 with open(f'lvl{lvl_idx}', 'wb') as f:
14 f.write(level)
15 return level
16
17def analyze(exe):
18 pass
19
20def give_answer(ans):
21 io.recvuntil(b'ANSWER: \n')
22 io.sendline(f'{ans}'.encode())
23
24def pwn():
25 start_game()
26 for _ in range(25):
27 level = get_level()
28 answer = analyze(level)
29 give_answer(answer)
30 io.interactive()
31
32if __name__ == '__main__':
33 ip = '07u4-1.play.hfsc.tf'
34 port = 3991
35 io = remote(ip, port)
36 pwn()
Obviously this changed whilst moving forward, but it was a good template to start. Let’s begin by showing the first binary I got using binja.
1000011a0 if (argc s<= 1)
2000011a2 rax = -1
3000011a0 else
4000011a9 int32_t var_1c_1 = 0
5000011b0 int32_t var_18_1 = 0x5b
6000011b7 int32_t var_14_1 = 0x5b
7000011c3 char* rax_1 = malloc(bytes: 0x5b)
8000011e2 memcpy(rax_1, "3nIIiRmSZMrp6Ko75ib8pL3EbwzdcWu7…", 0x5b)
900001203 if (strcmp(rax_1, argv[1]) == 0)
1000001205 var_1c_1 = 1
110000120c rax = var_1c_1
1200001210 return rax
We can see that it’s fairly simple, it receives our solution as a command line argument, allocates a heap chunk to copy the solution that exists inside the binary, and just executes a strcmp (could have called strcmp immediately, but from looking at the other binaries later on, the malloc seems to have been given to every binary from the backend). Initially I thought this was the main idea and the next levels would play around this. So sort of increasing the difficulty every n levels you pass. But that was not the case.
Zerotistic found out that different binaries can be found on level 1 where he found another xor related one, and another bitwise operations related one. At that point I thought that difficulty is not escalated, but rather chosen at random. JoshL concluded that there were only these 3 levels. Below are 2 decompilations for the xor version and the bitwise version.
1// xor version
20000115c void* fsbase
30000115c int64_t rax = *(fsbase + 0x28)
40000116f int32_t rax_2
50000116f if (argc s<= 1)
600001171 rax_2 = -1
70000116f else
8000011ad void var_24
9000011ad for (int32_t i = 0; i u<= 9; i = i + 1)
100000119f *(&var_24 + sx.q(i)) = argv[1][sx.q(i)]
11000011af int32_t var_2c_1 = 1
12000011b6 char var_1a = 0x1e
13000011ba char var_19_1 = 0x22
14000011be char var_18_1 = 7
15000011c2 char var_17_1 = 0
16000011c6 char var_16_1 = 0x68
17000011ca char var_15_1 = 3
18000011ce char var_14_1 = 0x3a
19000011d2 char var_13_1 = 0x35
20000011d6 char var_12_1 = 0x12
21000011da char var_11_1 = 0x39
220000121c for (int32_t i_1 = 0; i_1 s<= 9; i_1 = i_1 + 1)
230000120b if (zx.d((&var_1a)[sx.q(i_1)] ^ *(&var_24 + sx.q(i_1)) ^ 0x50) != 0)
240000120d var_2c_1 = 0
250000121e rax_2 = var_2c_1
2600001225 *(fsbase + 0x28)
270000122e if (rax == *(fsbase + 0x28))
2800001236 return rax_2
2900001230 __stack_chk_fail()
3000001230 noreturn
1// bitwise version
2000011c6 void* fsbase
3000011c6 int64_t rax = *(fsbase + 0x28)
4000011dc int32_t rax_2
5000011dc if (argc s<= 1)
6000011de rax_2 = -1
7000011dc else
8000011ed char* rax_3 = malloc(bytes: 0x5c)
90000120d memset(rax_3, 0xff, 0x5c)
1000001212 int32_t var_100_1 = 0
1100001269 while (true)
1200001271 if (sx.q(var_100_1) u>= strlen(rax_3))
1300001271 break
1400001247 rax_3[sx.q(var_100_1)] = argv[1][sx.q(var_100_1)]
150000124f var_100_1 = var_100_1 + 1
160000127d int32_t rax_15 = strlen(rax_3)
1700001288 int32_t var_fc_1 = 1
1800001292 int16_t var_e8
1900001292 __builtin_memcpy(dest: &var_e8, src: "\x2e\xc2\x9c\xf1\x1e\xce\xbf\xc3\x72\xc2\xcc\xc7\x43\xed\x76\xd8\xd1\xe6\x9d\xde\x69\xc4\x4a\xdf\xc1\xd3\x33\xf5\x51\xd1\xfa\xc9\xeb\xc6\xe2\xdb\xf4\xf5\x40\xce\x8f\xf9\xe5\xf0\xd2\xd9\xb3\xe1\x27\xdf\x98\xd7\x20\xd9\xb2\xf2\xff\xcf\x95\xe9\xa3\xda\x9a\xcc\xcc\xd9\xc7\xd9\x5e\xc1\xb6\xc0\xff\xef\x72\xfd\xc2\xca\x7e\xc2\x35\xd3\x7d\xec\xaf\xf3\xed\xfb\x37\xee\x5c\xd1", n: 0x5c)
2000001430 int16_t var_88 = 0x1a93
2100001436 int16_t var_86_1 = 0x67e8
220000143c int16_t var_84
230000143c __builtin_strncpy(dest: &var_84, src: "o4\r;", n: 4)
2400001448 int16_t var_80
2500001448 __builtin_memcpy(dest: &var_80, src: "\xb6\x0f\x3f\x15\xb8\x45\xd0\x1a\x28\x58\x14\x29\xce\x3b\x9a\x57\x37\x44\xa3\x56\xbf\x48\x4b\x0c\x34\x32\x5a\x1d\x65\x4b\xb8\x3b\xdf\x5e", n: 0x22)
26000014ae int16_t var_5e
27000014ae __builtin_strncpy(dest: &var_5e, src: "_g=/", n: 4)
28000014ba int16_t var_5a
29000014ba __builtin_memcpy(dest: &var_5a, src: "\xf9\x43\x92\x45\xfd\x40\x9a\x4c\x08\x60\x45\x33\x05\x38\xf9\x2a\x04\x13\x3b\x1d\x10\x28\xd7\x38\x27\x1b\x73\x40\xb4\x45\x2c\x12\xcb\x0f\x86\x37\xd2\x57\x03\x36\x35\x69", n: 0x2a)
3000001538 int16_t var_30_1 = 0x42ac
310000153e int16_t var_2e_1 = 0x40ce
32000015ea for (int32_t i = 0; i s< (rax_15 + (rax_15 u>> 0x1f)) s>> 1; i = i + 1)
330000156e int64_t rdx_4
340000156e rdx_4.w = sx.w(rax_3[sx.q(i * 2) + 1])
35000015c2 if ((&var_e8)[sx.q(i)] + ((sx.d(rax_3[sx.q(i * 2)]) << 8).w | rdx_4.w) != (&var_88)[sx.q(i)])
36000015c4 var_fc_1 = 0
37000015f0 rax_2 = var_fc_1
38000015fa *(fsbase + 0x28)
3900001603 if (rax == *(fsbase + 0x28))
400000160f return rax_2
4100001605 __stack_chk_fail()
4200001605 noreturn
We will get into analyzing them later. My first general approach was to get the binaries, compile them, and solve them with angr. Though this would have been a good general solution, we don’t know the address of the line we want to get to. So we somehow would need to analyze the binary to get that information. But that also opens up the possibility of performing static analysis on the binary with frameworks like capstone, which I decided to use and base my solution on it. Note that capstone produces disassembly from bytes, so from now on we will be looking at the disassembly for producing our actual solution (will still use decompilation for better comprehension)
A great tool to have used would have been ther Binary Ninja API, which gets much praise from CTF players, and for good reason. I didn’t go with this because I don’t have a headless version, which lets you use the API and run scripts without using the binja GUI. Looking back at it, I probably could have solved it regardless, so skill issue. It would have been tedious to rely on the script editor, but alas it wasn’t meant for this one. Hope next time I get my hands dirty with it.
On the lookout for heuristics
A good part of automatic analysis is just observing patters that exist in a fairly large pool of samples (in our case the binaries), and trying to make your solution covers as many cases as you can. If we think of what we need to do, we conclude we need to do 2 things:
- Identify the case we have to solve
- Solve it :peeposmart:
Let’s cover the first step. Before identifying what we have to solve, we first need to reach main. To do that, we can get the address of _start from the elf header with pyelftools, analyze that function with capstone, and find the address of main. Let’s look at a _start sample with capstone.
10x1060 endbr64
20x1064 xor ebp, ebp
30x1066 mov r9, rdx
40x1069 pop rsi
50x106a mov rdx, rsp
60x106d and rsp, 0xfffffffffffffff0
70x1071 push rax
80x1072 push rsp
90x1073 xor r8d, r8d
100x1076 xor ecx, ecx
110x1078 lea rdi, [rip + 0xca]
120x107f call qword ptr [rip + 0x2f53]
130x1085 hlt
Even though it isn’t resolved, we know that the function called at 0x107f is __libc_start_main, and the pointer passed to rdi is the address of main. So we just need to get the value being added to rip, and that will be the start of main.
1 with open(flvl, 'rb') as f:
2 elf = ELFFile(f)
3 header = elf.header
4 entry_point = header['e_entry']
5 md = Cs(CS_ARCH_X86, CS_MODE_64)
6 for dasm in md.disasm(exe[entry_point:], entry_point, 0xd):
7 if dasm.mnemonic == 'lea':
8 rip_offset = int(dasm.op_str.split(' + ')[-1][:-1], 16)
9 main_addr = dasm.address + rip_offset + 0x7 # the 7 for the next rip
10 break
Now that we have the start of main, let’s quickly find the end of it by searching for the first ret instruction in our path
1 for dasm in md.disasm(exe[main_addr:], main_addr):
2 if dasm.mnemonic == 'ret':
3 main_end = dasm.address + 0x1
4 break
That 0x1 is used because otherwise ret wouldn’t be included in the disassembly of the entire function (doesn’t matter but I wanted it for sake of completion). After this, I decided to get the entire disassembly of main, so that I could use it to find which case we have, and later to solve them
1 main_disasm = ''
2 for dasm in md.disasm(exe[main_addr:main_end], main_addr):
3 main_disasm += f'{hex(dasm.address)}\t{dasm.mnemonic}\t{dasm.op_str}\n'
As a sidenote, these are the main things you need to know about capstone to use it (they are also the only things I know about it hahaha).
Now, how can we find out which case we are dealing with? At first I thought about doing something with xor because xor is certainly used in case 2, so maybe we could search if xor eax, eax is not the only xor instruction we have in the binary, then most likely we are on case 2. However I didn’t go with this. Instead I went with some other heuristics:
- Case 1: This case is about moving a char array pointer to
rdibefore amemcpywithlea, and this is a unique behaviour. The rest of the cases pass values on the stack, whereas every string literal is stored in.rodata, so to get a pointer to itleais used. Thus we search for'lea\trcx, [rip + 0x' - Case 2: We will search for the start of where the xor key will be stored in the binary (000011b6 - 000011da in the above example). From looking at different samples, the array always starts at offset 0x12. So we can search for
'mov\tbyte ptr [rbp - 0x12],'. If that wasn’t the case we might have needed to use the above xor trick. - Case 3: A simple
else
1 for dasm in md.disasm(exe[main_addr:main_end], main_addr):
2 main_disasm += f'{hex(dasm.address)}\t{dasm.mnemonic}\t{dasm.op_str}\n'
3 if 'lea\trcx, [rip + 0x' in main_disasm:
4 solution = find_string_case(flvl)
5 elif 'mov\tbyte ptr [rbp - 0x12],' in main_disasm:
6 solution = xor_case(main_disasm)
7 else:
8 solution = bitwise_case(main_disasm)
And that’s the end of the first step. Let’s move onto the second one.
Manually parsing assembly and feeling like a fool
I will split this section into three subsections, one for each case.
Case 1 - Find String
This is the easiest of them all. Since the solution is inside the binary as a string, we can just run strings and find it. This is why for this function we pass the name of the file as an argument. We can have a blacklist of unwanted strings, and search based on them (yoinked Josh’s function, but simple enough).
1def find_string_case(fexe):
2 wrong = [b'/lib64/ld-linux-x86-64.so.2', b'__cxa_finalize', b'__libc_start_main', b'__stack_chk_fail',
3 b'GLIBC_2.14', b'GLIBC_2.2.5', b'GLIBC_2.34', b'_ITM_deregisterTMCloneTable', b'__gmon_start__',
4 b'_ITM_registerTMCloneTable', b'GCC: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0', b'.note.gnu.property',
5 b'.note.gnu.build-id', b'.note.ABI-tag', b'.gnu.version', b'.gnu.version_r', b'.eh_frame_hdr',
6 b'.init_array', b'.fini_array']
7 s = run(['strings', '-n 10', fexe], capture_output=True)
8 for line in s.stdout.strip().split(b'\n'):
9 if line not in wrong:
10 return line
Case 2 - Xor Case
Let’s look again at what’s going on in the decompilation. First essentially an array of uint8 values is created.
1000011b6 char var_1a = 0x1e
2000011ba char var_19_1 = 0x22
3000011be char var_18_1 = 7
4000011c2 char var_17_1 = 0
5000011c6 char var_16_1 = 0x68
6000011ca char var_15_1 = 3
7000011ce char var_14_1 = 0x3a
8000011d2 char var_13_1 = 0x35
9000011d6 char var_12_1 = 0x12
10000011da char var_11_1 = 0x39
And then a simple xor decryption loop.
10000121c for (int32_t i_1 = 0; i_1 s<= 9; i_1 = i_1 + 1)
20000120b if (zx.d((&var_1a)[sx.q(i_1)] ^ *(&var_24 + sx.q(i_1)) ^ 0x50) != 0)
From 120b we know the above chars, and the constant. So we can recover our input. Example solution for the above case.
1>>> ct = [0x1e, 0x22, 0x7, 0x0, 0x68, 0x3, 0x3a, 0x35, 0x12, 0x39]
2>>> ''.join([chr(i^0x50) for i in ct])
3'NrWP8SjeBi'
There are two things we need to get from the disassembly, the single byte, and the ciphertext. We will recover the ciphertext based on the same way we identified this case, we will look for values stored on the stack at offsets 0x12-0x9. A small problem that came from this is that capstone prints the 9 as a simple integer, not in a hexadecimal format (9 not 0x9 or 0x09). So we needed to cover this small case as well.
Some small problems also occured with the single byte. Originally the disassembly I saw for it was xor al, 0x50, so I thought that the al register would be universal due to compiler optimizations. However it later emerged that we can also have the eax register used. So I accounted for that as well. Another small heuristic I had was that the number would not be 0-9, which was also a mistake. We could not take this case into consideration and leave a small chances of getting such a small xor byte, but I took care of that as well. Below is the complete function code.
1def xor_case(dasm):
2 ct = bytearray()
3 split_dasm = dasm.split('\n')[:-1]
4 ctr = 0x12
5 for line in split_dasm:
6 dasm = line.split('\t')
7 if dasm[1] == 'mov' and (f'byte ptr [rbp - {hex(ctr)}]' in dasm[2] or f'byte ptr [rbp - {ctr}]' in dasm[2]):
8 ct += bytes([int(dasm[2].split(', ')[-1], 16)])
9 ctr -= 1
10 if dasm[1] == 'xor' and (('al, 0x' in dasm[2] or 'eax, 0x' in dasm[2]) or any(n in dasm[2] for n in [str(i) for i in range(10)])):
11 key_byte = bytes([int(dasm[2].split(', ')[-1], 16)])
12 return xor(ct, key_byte)
Case 3 - Bitwise Case
This was the “hardest”, though it’s also not that difficult to analyze. The decompilation doesn’t do a good job of showing the 2 continous arrays, so here’s a big disassembly dump of uint16 bytes.
10x1214 mov word ptr [rbp - 0x90], 0xfdfc
20x121d mov word ptr [rbp - 0x8e], 0xed9e
30x1226 mov word ptr [rbp - 0x8c], 0xf470
40x122f mov word ptr [rbp - 0x8a], 0xd780
50x1238 mov word ptr [rbp - 0x88], 0xf09a
60x1241 mov word ptr [rbp - 0x86], 0xe68a
70x124a mov word ptr [rbp - 0x84], 0xd9e4
80x1253 mov word ptr [rbp - 0x82], 0xeefd
90x125c mov word ptr [rbp - 0x80], 0xc749
100x1262 mov word ptr [rbp - 0x7e], 0xc4ac
110x1268 mov word ptr [rbp - 0x7c], 0xdfbe
120x126e mov word ptr [rbp - 0x7a], 0xe8f8
130x1274 mov word ptr [rbp - 0x78], 0xf5b7
140x127a mov word ptr [rbp - 0x76], 0xf9ef
150x1280 mov word ptr [rbp - 0x74], 0xee2c
160x1286 mov word ptr [rbp - 0x72], 0xfb67
170x128c mov word ptr [rbp - 0x70], 0xe207
180x1292 mov word ptr [rbp - 0x6e], 0xed17
190x1298 mov word ptr [rbp - 0x6c], 0xfed2
200x129e mov word ptr [rbp - 0x6a], 0xd3b8
210x12a4 mov word ptr [rbp - 0x68], 0xf0da
220x12aa mov word ptr [rbp - 0x66], 0xcd6c
230x12b0 mov word ptr [rbp - 0x64], 0xc45f
240x12b6 mov word ptr [rbp - 0x62], 0xfbee
250x12bc mov word ptr [rbp - 0x60], 0xee62
260x12c2 mov word ptr [rbp - 0x5e], 0xfdca
270x12c8 mov word ptr [rbp - 0x5c], 0xdd35
280x12ce mov word ptr [rbp - 0x5a], 0xd79b
290x12d4 mov word ptr [rbp - 0x58], 0xf93c
300x12da mov word ptr [rbp - 0x56], 0xec10
310x12e0 mov word ptr [rbp - 0x54], 0xc35e
320x12e6 mov word ptr [rbp - 0x50], 0x6d3e
330x12ec mov word ptr [rbp - 0x4e], 0x3002
340x12f2 mov word ptr [rbp - 0x4c], 0x67e1
350x12f8 mov word ptr [rbp - 0x4a], 0x3ef4
360x12fe mov word ptr [rbp - 0x48], 0x69e6
370x1304 mov word ptr [rbp - 0x46], 0x5fd2
380x130a mov word ptr [rbp - 0x44], 0x514a
390x1310 mov word ptr [rbp - 0x42], 0x5f5f
400x1316 mov word ptr [rbp - 0x40], 0x3b91
410x131c mov word ptr [rbp - 0x3e], 0x3511
420x1322 mov word ptr [rbp - 0x3c], 0x2629
430x1328 mov word ptr [rbp - 0x3a], 0x4c3e
440x132e mov word ptr [rbp - 0x38], 0x3b2e
450x1334 mov word ptr [rbp - 0x36], 0x4067
460x133a mov word ptr [rbp - 0x34], 0x2f80
470x1340 mov word ptr [rbp - 0x32], 0x62c8
480x1346 mov word ptr [rbp - 0x30], 0x277d
490x134c mov word ptr [rbp - 0x2e], 0x4179
500x1352 mov word ptr [rbp - 0x2c], 0x4522
510x1358 mov word ptr [rbp - 0x2a], 0x4c31
520x135e mov word ptr [rbp - 0x28], 0x5749
530x1364 mov word ptr [rbp - 0x26], 0x40e0
540x136a mov word ptr [rbp - 0x24], 0xfd0
550x1370 mov word ptr [rbp - 0x22], 0x4242
560x1376 mov word ptr [rbp - 0x20], 0x33d7
570x137c mov word ptr [rbp - 0x1e], 0x740e
580x1382 mov word ptr [rbp - 0x1c], 0x517f
590x1388 mov word ptr [rbp - 0x1a], 0x25e3
600x138e mov word ptr [rbp - 0x18], 0x6c7e
610x1394 mov word ptr [rbp - 0x16], 0x4265
620x139a mov word ptr [rbp - 0x14], 0x119f
The first thing I did (which in retrospect wasn’t really needed and I lost my time with, will explain later why) was get the length of the input and ciphertext from the disassembly. A nice place it’s stored is right above the malloc call, stored in rdi. So I wrote a loop to find a mov to rdi, and a call on the next line. Considering it’s the first call of the function, we will always find it correctly.
1 split_dasm = dasm.split('\n')[:-1]
2 for i in range(len(split_dasm)):
3 dasm = split_dasm[i].split('\t')
4 next_dasm = split_dasm[i+1].split('\t')
5 if dasm[1] == 'mov' and 'edi, 0x' in dasm[2] and next_dasm[1] == 'call':
6 ct_len = int(dasm[2].split(', ')[-1], 16)
7 break
Before moving on, let’s solve the above case. If we look at the operations being performed from the decompilation, we can write it anew to help us
1data[i] + ((input[2*i] << 8) | input[2*i+1]) == ct[i] =>
2(input[2*i] << 8) | input[2*i+1] == ct[i] - data[i]
From the bytes we get that the first half of them belong to this data array, and the second half are the elements of ct. We can move the data[i] to the right half, and we get a better equation. To find our input, we can iterate over an alphabet for the left side until we get a hit on the result of the right value, or, much more cleverly, get the result of ct[i] - data[i] & 0xffff. The reason for that is that (input[2*i] << 8) | input[2*i+1] is just (input[2*i] << 8) + input[2*i+1], because the 8 bits that are shifted to the right leave room for all the bits of input[2*i+1]. Also this 2*i+1 just means we get the input bytes in groups of 2, i.e. we end up with a uint16 value.
So we get the numerical values of the bytes from the disassembly, split the values in half with the length we got (that’s why it’s unecessary, we could split in half just with len(data)), and calculate the solution. Note that we can turn the data from a number to bytes with <int>.to_bytes(2, 'little'). Example solution below.
1>>> total_data = [65020, 60830, 62576, 55168, 61594, 59018, 55780, 61181, 51017, 50348, 57278, 59640, 62903, 63983, 60972, 64359, 57863, 60695, 65234, 54200, 61658, 52588, 50271, 64494, 61026, 64970, 56629, 55195, 63804, 60432, 50014, 27966, 12290, 26593, 16116, 27110, 24530, 20810, 24415, 15249, 13585, 9769, 19518, 15150, 16487, 12160, 25288, 10109, 16761, 17698, 19505, 22345, 16608, 4048, 16962, 13271, 29710, 20863, 9699, 27774, 16997, 4511]
2>>> data = total_data[:len(total_data)//2]
3>>> data
4[65020, 60830, 62576, 55168, 61594, 59018, 55780, 61181, 51017, 50348, 57278, 59640, 62903, 63983, 60972, 64359, 57863, 60695, 65234, 54200, 61658, 52588, 50271, 64494, 61026, 64970, 56629, 55195, 63804, 60432, 50014]
5>>> ct = total_data[len(total_data)//2:]
6>>> ct
7[27966, 12290, 26593, 16116, 27110, 24530, 20810, 24415, 15249, 13585, 9769, 19518, 15150, 16487, 12160, 25288, 10109, 16761, 17698, 19505, 22345, 16608, 4048, 16962, 13271, 29710, 20863, 9699, 27774, 16997, 4511]
8>>> b''.join([((c - d) & 0xffff).to_bytes(2, 'little') for c, d in zip(ct, data)])
9b'BodBqstgLyHyfwbpHtepkFFcwExFTAagvEbTPFyxoftsqKTFuEDvJtHNBsUVAN'
And the rest of the function.
1 pattern = r'\b0x[0-9A-Fa-f]{3,4}\b'
2 for line in split_dasm:
3 dasm = line.split('\t')
4 matches = re.findall(pattern, dasm[2])
5 if dasm[1] == 'mov' and 'word ptr [rbp -' in dasm[2] and not any(register in dasm[2] for register in ['edi', 'rdi', 'esi', 'rsi', 'eax', 'rax']) and matches:
6 important_bytes.append(int(matches[0][2:], 16))
7 processing_data = important_bytes[:ct_len//2]
8 ct = important_bytes[ct_len//2:]
9 return b''.join([((c - d) & 0xffff).to_bytes(2, 'little') for c, d in zip(ct, data)])
Lastly, just to mention that I decided to get the data I went with a regex. However there were cases where the hex numbers were 3 bytes, so I had to do some more maneuvering to get the values. Since rbp was the only register we cared for, I blacklisted some registers that where and could be problematic.
Getting the flag
Below is the complete solver
1from elftools.elf.elffile import ELFFile
2from subprocess import run
3from capstone import *
4from pwn import *
5import gzip
6import re
7
8def start_game():
9 io.recvuntil(b'Play\n')
10 io.sendline(b'2')
11
12def get_level():
13 io.recvuntil(b'BIN ')
14 lvl_idx = io.recv(2).decode()
15 fname = f'lvl{lvl_idx}'
16 io.recvuntil(b'25: \n')
17 level = gzip.decompress(bytes.fromhex(io.recvline().rstrip().decode()))
18 with open(fname, 'wb') as f:
19 f.write(level)
20 return level, fname
21
22def give_answer(ans):
23 io.recvuntil(b'ANSWER: \n')
24 io.sendline(ans)
25
26def find_string_case(fexe):
27 wrong = [b'/lib64/ld-linux-x86-64.so.2', b'__cxa_finalize', b'__libc_start_main', b'__stack_chk_fail',
28 b'GLIBC_2.14', b'GLIBC_2.2.5', b'GLIBC_2.34', b'_ITM_deregisterTMCloneTable', b'__gmon_start__',
29 b'_ITM_registerTMCloneTable', b'GCC: (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0', b'.note.gnu.property',
30 b'.note.gnu.build-id', b'.note.ABI-tag', b'.gnu.version', b'.gnu.version_r', b'.eh_frame_hdr',
31 b'.init_array', b'.fini_array']
32 s = run(['strings', '-n10', fexe], capture_output=True)
33 for line in s.stdout.strip().split(b'\n'):
34 if line not in wrong:
35 return line
36
37def xor_case(dasm):
38 ct = bytearray()
39 split_dasm = dasm.split('\n')[:-1]
40 ctr = 0x12
41 for line in split_dasm:
42 dasm = line.split('\t')
43 if dasm[1] == 'mov' and (f'byte ptr [rbp - {hex(ctr)}]' in dasm[2] or f'byte ptr [rbp - {ctr}]' in dasm[2]):
44 ct += bytes([int(dasm[2].split(', ')[-1], 16)])
45 ctr -= 1
46 if dasm[1] == 'xor' and (('al, 0x' in dasm[2] or 'eax, 0x' in dasm[2]) or any(n in dasm[2] for n in [str(i) for i in range(10)])):
47 key_byte = bytes([int(dasm[2].split(', ')[-1], 16)])
48 return xor(ct, key_byte)
49
50def bitwise_case(dasm):
51 important_bytes = []
52 split_dasm = dasm.split('\n')[:-1]
53 for i in range(len(split_dasm)):
54 dasm = split_dasm[i].split('\t')
55 next_dasm = split_dasm[i+1].split('\t')
56 if dasm[1] == 'mov' and 'edi, 0x' in dasm[2] and next_dasm[1] == 'call':
57 ct_len = int(dasm[2].split(', ')[-1], 16)
58 break
59 pattern = r'\b0x[0-9A-Fa-f]{3,4}\b'
60 for line in split_dasm:
61 dasm = line.split('\t')
62 matches = re.findall(pattern, dasm[2])
63 if dasm[1] == 'mov' and 'word ptr [rbp -' in dasm[2] and not any(register in dasm[2] for register in ['edi', 'rdi', 'esi', 'rsi', 'eax', 'rax']) and matches:
64 important_bytes.append(int(matches[0][2:], 16))
65 processing_data = important_bytes[:ct_len//2]
66 ct = important_bytes[ct_len//2:]
67 return b''.join([((c - d) & 0xffff).to_bytes(2, 'little') for c, d in zip(ct, data)])
68
69def analyze(exe, flvl):
70 with open(flvl, 'rb') as f:
71 elf = ELFFile(f)
72 header = elf.header
73 entry_point = header['e_entry']
74 md = Cs(CS_ARCH_X86, CS_MODE_64)
75 for dasm in md.disasm(exe[entry_point:], entry_point, 0xd):
76 if dasm.mnemonic == 'lea':
77 rip_offset = int(dasm.op_str.split(' + ')[-1][:-1], 16)
78 main_addr = dasm.address + rip_offset + 0x7 # the 7 for the next rip
79 break
80 for dasm in md.disasm(exe[main_addr:], main_addr):
81 if dasm.mnemonic == 'ret':
82 main_end = dasm.address + 0x1
83 break
84 main_disasm = ''
85 for dasm in md.disasm(exe[main_addr:main_end], main_addr):
86 main_disasm += f'{hex(dasm.address)}\t{dasm.mnemonic}\t{dasm.op_str}\n'
87 if 'lea\trcx, [rip + 0x' in main_disasm:
88 solution = find_string_case(flvl)
89 elif 'mov\tbyte ptr [rbp - 0x12],' in main_disasm:
90 solution = xor_case(main_disasm)
91 else:
92 solution = bitwise_case(main_disasm)
93 return solution
94
95def pwn():
96 start_game()
97 for l in range(25):
98 level, level_fname = get_level()
99 answer = analyze(level, level_fname)
100 print(f'Level {l+1}/25: {answer}')
101 give_answer(answer)
102 io.interactive()
103
104if __name__ == '__main__':
105 ip = '07u4-1.play.hfsc.tf'
106 port = 3991
107 io = remote(ip, port)
108 pwn()
And below is the interaction with the server
1[+] Opening connection to 07u4-1.play.hfsc.tf on port 3991: Done
2Level 1/25: b'NrWP8SjeBi'
3Level 2/25: b'vzIGknF3dO'
4Level 3/25: b'6dr5X08rEJ'
5Level 4/25: b'jxkb7XNBuk'
6Level 5/25: b'Rx0INfgqGN39uR3Ujgx38F2E9aKjLeqjRhoiCYOaj5oozcXAfU9WfVUGZjinxYWn9z4AEB6'
7Level 6/25: b'lS5ymzLrD4D9v3Eo8GqcCSikUf5A3OPPjf1l1zBjsrdMMkkp1ueOCHMWboOWM3sJhxMKG3h24ce6'
8Level 7/25: b'vtmjCdpKVStebMRCvruRjDeQzjEluJwNRWtpZaXKhfrnUKhm'
9Level 8/25: b'Gh85JJQtui'
10Level 9/25: b'OvHbRWIgEo'
11Level 10/25: b'OLPtHmEaMcrXMcafCOMCoTdRKSYKuHmUagPCckqTfKdUFKVHtRHOICTzzCWgGSYdkbBaDgbzlRqTWspfezwGuAAAYq'
12Level 11/25: b'J6WCquMUL8'
13Level 12/25: b'ukPgegERvEkZQGxWXvXPkPXxilpxsjQHZFFlCTmCRvvhGWZcxRfYYy'
14Level 13/25: b'Rx0INfgqGN39uR3Ujgx38F2E9aKjLeqjRhoiCYOaj5oozcXAfU9WfVUGZjinxYWn9z4AEB6'
15Level 14/25: b'ldFNsXSpvJiSwZUnqzoLFgOPmopuCKwQWlizkGZaUHuwhccdCHEbOYldDNSCOQjqRZkFnK'
16Level 15/25: b'ooQIANIbxRnbLBPYBsFsOLyaKJ'
17Level 16/25: b'oAUxzMpgEByOZNtMOXgXwvKszthEnetKhjOsOmVmpEmWzdHwQkYScdEctIhkNKnaYTDhBLttLaHRnVvQ'
18Level 17/25: b'IuSwpCgOGVegESwCOEjrmPBolROo'
19Level 18/25: b'ukPgegERvEkZQGxWXvXPkPXxilpxsjQHZFFlCTmCRvvhGWZcxRfYYy'
20Level 19/25: b'HyQnJamR2E'
21Level 20/25: b'FnPZFnDfTjTnjvPWdRDSTQCtygqgHqoMglAqHQOB'
22Level 21/25: b'vYXIaPfZMsAjsfrcjlRoRcgssjxDhPUgcLRKtEBTinuE'
23Level 22/25: b'PMCOuiwONurxpTAGaUNNpIZKZgWKmRCrxRwPbaqSHGsKwzEHZoBKKPSAzfkjuXQaQwAQsFLgOhpiDtsjRNQcTqBhrlwyeznLfv'
24Level 23/25: b'keUjBcmtaRaxWpusZbrDxxFStQbRGRbNkRrrNSvSkWTMVyjClDQQjX'
25Level 24/25: b'eouFJPb9iJ'
26Level 25/25: b'keUjBcmtaRaxWpusZbrDxxFStQbRGRbNkRrrNSvSkWTMVyjClDQQjX'
27[*] Switching to interactive mode
28
29FLAG:
30midnight{Ti_esrever_dna_ti_pilf_nwod_gnaht_ym_tup_i}
And just like that we got the flag!
Conclusion
The challenge was on the easier side, since even though it was released around the middle of the CTF, it still got around 30 solves. However it was still very enjoyable and gave room for many experimentations. Thank you quend, rest in peace.