The ckriellideon ๐Ÿ”ฅ

GCC CTF24 Array Programming Rocks Writeup

Reversing an array programming language

- 16 min

Array Programming Rocks - GCC CTF Writeup

Introduction

This week I played a challenge from GCC CTF, Array Programming Rocks. The core concept is understanding array programming, and reversing a language that implements it.

Understanding What I’m Seeing

We where given a chall.ua file with the below contents

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
F โ† -@\0 &sc
N โ† โงปF
[โฅ(โ‰โ†ป1โ‰.)] N . F
โˆต(โ‡Œโฌš0โ†™8โ‹ฏ)
โฅ(โ—ฟ2+) N /โˆ˜
โฟโ†ฏNโ‡ก8โ‰โ‡Œโ‰ร—2
/+โ‰
โ‰ 5_1_1_57_115_29_15_115_53_113_29_23_43_115_119_29_49_23_119_33_41_29_33_35_119_39_7_29_15_119_39_13_113_119_119_63
โค"Sorry this wasn't the flag ๐Ÿ˜”"
:/+โ‰โ—ซ7:/+โ†ฏ9_4.โ‡Œ 
โ‰ 844_662_647_745
โค"So close but yet it's not the flag ๐Ÿ˜”"
โ‰ 512_464_506_521_571_546_543_589_607_619_650_601_632_650_647_605_547_552_584_595_531_554_549_576_567_532_560_576_525_548
โค"You're soooo close ๐Ÿ˜”"
"Well done ! ๐Ÿฅณ๐Ÿšฉ"

The obvious response is dear me wtf. Obviously we can see some interesting things, like printable strings, big array-like collection of numbers, and even some assignments in the first two lines. However this was not enough to sit and understand the entire code, and I thought that it was probably a real language. So I decided to try and find what the language was. After some thoughts, I just decided to search for a language detection program, and found this site.

After throwing it in, it returns it’s APL, which made me understand the whole array programming connection. I then found this APL interpreter in order to do some dynamic analysis, however it produced some errors. After looking at some other APL challenge writeups, I understood that APL wasn’t exactly the language I wanted. So I decided to try and use a symbol of the language and use that to search. Through this, I found the APL Wiki. Searching inside it for the โ‰ character, it returned 4 search results. Unfortunately because I’m stupid, I followed the wrong page, and attempted to see if the BQN language was what I wanted. After figuring out that wasn’t it as well, I went back, and saw that there exists a language called Uiua.

Searching for some of the symbols produced good results, and that combined with the .ua extension, made me feel confident I had found the correct language. After looking at some general stuff about the language, like for example that it’s stack based, I tried finding an interpreter for it. Thankfully, the official website offers an interpreter in it’s home page. After attempting to run the code and reoslving some issues (Error: Maximum execution time exceeded), I could finally run the code and start reversing it.

Reversing What I’m Seeing - Part 1

So far, we have an array programming stack-based language with it’s own unique symbols and syntax. I followed a more dynamic analysis approach rather than a static one, since learning the language seemed too much, and I had an interpreter. However, whenever I felt I needed to look at a glyph better, I consulted the docs. A very helpful symbol I found was the ? symbol, which let me see what was on the stack at any point in the code’s execution. This means we can place this debug symbol not only in between lines, to understand a line and it’s operations as a whole, but we can also place it inside a line amongst other glyphs. This was very helpful as it let me follow every processing of data in memory. We just need to keep in mind how some instructions are syntaxed so that we don’t break them.

What followed was some hours of me just giving input to the program and trying to understand what was going on. Some things I found early on were that the first line takes the input and stores it as an array into F, that N holds the flag length, and that the flag is of length 36 (by seeing the length of the string in line 8). From now on, assume I’m using GCC{abcdefghijklmnopqrstuvwxyzABCDE} for every input unless stated otherwise. If we enter a debug symbol under line 3, we get this view of the stack

1
2
3
4
5
6
7
8
9
โ”Œโ•ด? 4:1
โ”œโ•ด[71 67 67 123 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 65 66 67 68 69 125]
โ”‚ โ•ญโ”€                                                    
โ”œโ•ดโ•ท  71  67  67 123  97  98  99 100 101 102 103 104 105โ€ฆ
โ”‚   125  71  67  67 123  97  98  99 100 101 102 103 104โ€ฆ
โ”‚    69 125  71  67  67 123  97  98  99 100 101 102 103โ€ฆ
โ”‚    68  69 125  71  67  67 123  97  98  99 100 101 102โ€ฆ
โ”‚    67  68  69 125  71  67  67 123  97  98  99 100 101โ€ฆ
โ”‚    66  67  68  69 125  71  67  67 123  97  98  99 100โ€ฆ

So line 3 takes the flag array, and creates a rotation of it for every character, rotating from end to start. The same for line 4, it takes every one of these lines, and just converts every number to their binary representation

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
โ”Œโ•ด? 5:1
โ”œโ•ด[71 67 67 123 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 65 66 67 68 69 125]
โ”‚ โ•ญโ”€                 
โ”œโ•ดโ•ท 0 1 0 0 0 1 1 1  
โ”‚ โ•ท 0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 1 1 1 0 1 1  
โ”‚   0 1 1 0 0 0 0 1  
โ”‚   0 1 1 0 0 0 1 0  
โ”‚   0 1 1 0 0 0 1 1  
โ”‚   0 1 1 0 0 1 0 0  
โ”‚   0 1 1 0 0 1 0 1  
โ”‚   0 1 1 0 0 1 1 0  
โ”‚   0 1 1 0 0 1 1 1  
โ”‚   0 1 1 0 1 0 0 0  
โ”‚   0 1 1 0 1 0 0 1  
โ”‚   0 1 1 0 1 0 1 0  
...

The next line took me some time to understand what exaclty was going on. For better comprehension, I gave a smaller input, ABCD

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
โ”Œโ•ด? 5:1
โ”œโ•ด[65 66 67 68]
โ”‚ โ•ญโ”€                 
โ”œโ•ดโ•ท 0 1 0 0 0 0 0 1  
โ”‚ โ•ท 0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚                    
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚                    
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚                    
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚                    
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚                   โ•ฏ
โ””โ•ดโ•ดโ•ดโ•ดโ•ดโ•ด
โ”Œโ•ด? 7:1
โ”œโ•ด[65 66 67 68]
โ”‚ โ•ญโ”€                 
โ”œโ•ดโ•ท 0 1 0 0 0 1 0 1  
โ”‚   0 1 0 0 0 1 1 0  
โ”‚   0 1 0 0 0 1 1 1  
โ”‚   0 1 0 0 0 0 0 0  
โ”‚                   โ•ฏ

I also got another sample using ABCDE, to do some differential analysis of the outputs

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
โ”Œโ•ด? 5:1
โ”œโ•ด[65 66 67 68 69]
โ”‚ โ•ญโ”€                 
โ”œโ•ดโ•ท 0 1 0 0 0 0 0 1  
โ”‚ โ•ท 0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 1 0 1  
โ”‚                    
โ”‚   0 1 0 0 0 1 0 1  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚                    
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 1 0 1  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚                    
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 1 0 1  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚                    
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 1 0 1  
โ”‚   0 1 0 0 0 0 0 1  
โ”‚                    
โ”‚   0 1 0 0 0 0 0 1  
โ”‚   0 1 0 0 0 0 1 0  
โ”‚   0 1 0 0 0 0 1 1  
โ”‚   0 1 0 0 0 1 0 0  
โ”‚   0 1 0 0 0 1 0 1  
โ”‚                   โ•ฏ
โ””โ•ดโ•ดโ•ดโ•ดโ•ดโ•ด
โ”Œโ•ด? 7:1
โ”œโ•ด[65 66 67 68 69]
โ”‚ โ•ญโ”€                 
โ”œโ•ดโ•ท 0 0 0 0 0 0 0 0  
โ”‚   0 0 0 0 0 0 1 1  
โ”‚   0 0 0 0 0 0 1 0  
โ”‚   0 0 0 0 0 1 0 1  
โ”‚   0 0 0 0 0 1 0 0  
โ”‚                   โ•ฏ
โ””โ•ดโ•ดโ•ดโ•ดโ•ดโ•ด

If we stare at these numbers for long enough, they start talking back to us, and they tell us they are the result of XORing between each bit of the number in the corresponding index. If you want to see it for yourself, look at them and talk to them, they don’t bite (a lot).

The next line is simple enough, just convert the bits to their corresponding power of 2 based on their index, and then line 7 just takes every row, and sums the numbers in it. The results of all these rows is compared in the first assertion. Throughout all of this, I implemented these steps in python to try and understand them, and also in the hopes of understanding how to reverse them

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
from math import log2

def get_linear_perms(txt):
    variations = [txt]
    for i in range(len(txt)):
        txt = txt[-1] + txt[:-1]
        variations.append(txt)
    binary_variations = []
    for var in variations:
        binary_variations.append(['{:08b}'.format(ord(i)) for i in var])
    return binary_variations

def get_new_nums(nums):
    new_nums = []
    for i in range(len(nums)-1):
        res = ''
        for j in range(8):
            bit = 0
            for k in range(len(nums)):
                #print(k, j, i, nums[k][i][j])
                bit ^= int(nums[k][i][j])
            res += str(bit)
        res = list(res)
        res[-1] = '1'
        res = ''.join(res)
        new_nums.append(res)
    return new_nums

def get_num_sum_powers(nums):
    totals = []
    powers = [pow(2, i) for i in range(8)]
    for n in nums:
        total = 0
        n = n[::-1]
        for i in range(len(n)):
            total += int(n[i]) * powers[i]
            #print(int(n[i]), powers[i])
        totals.append(total)
    print(totals)
    return totals

nums = get_linear_perms('GCC{abcdefghijklmnopqrstuvwxyzABCDE}')
new_nums = get_new_nums(nums)
print(new_nums)
sums = get_num_sum_powers(new_nums)
print(sums)

Generally looking at this, I couldn’t see how to efficiently reverse this step. I couldn’t go through an alphabet to find matches with the checked string, and the only logic I could find was maybe z3, but that would need to happen on a bit level. So maybe potentially something with sagemath? Alas, I decided to see if the next checks where enough to give me something to work with.

Reversing What I’m Seeing - Part 2

The first part was relatively easy to reverse, since the operations could be treated as a whole function with us just doing F8 over it (not literally, debugging joke). Line 10 however needed more effort, for me at least. And I made some mistakes which didn’t help me and ate away at my time and energy:

  1. When the assertion happens, the value that is being compared is popped from the stack after the check. So while I thought that the second part used the array that was used in the first check, it turned out it uses the original input, which makes it way more approachable than what I had initially thought
  2. The instructions are executed in reverse order. So while I thought that these were the first instructions that happen :/+โ‰, it’s in fact these โ‡Œ.โ†ฏ9_4, and in this order. So the language is read left to right. This slipped from me during the first step because I treated the lines as a whole

Line 10 can be split into two ideas, two big operations that are happening, which create the arrays which will be used in the next two checks:

  1. It takes our input, creates a 9x4 matrix out of it, and sums the elements of the columns. That’s how we end up with 4 elements, which are used in the first check
  2. Like the first part, it takes our array, and rotates it every 7 characters, starting from the beginning. It then transposes the generated matrix, so that it can take the sums of the columns, but it’s the same as if it took the sums of the rows of the matrix Below is the stack view after both operations have happened (note that I commented out the assert, so that’s why it’s working on the array generated during the first part)
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
โ”Œโ•ด? 10:11
โ”œโ•ด[71 67 67 123 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 65 66 67 68 69 125]
โ”œโ•ด[103 95 95 89 89 91 97 99 99 109 109 111 111 105 105 107 107 117 117 119 119 113 113 115 115 125 125 127 127 121 121 123 97 89 89 93]
โ”‚ โ•ญโ”€                 
โ”œโ•ดโ•ท 103  95  95  89  
โ”‚    89  91  97  99  
โ”‚    99 109 109 111  
โ”‚   111 105 105 107  
โ”‚   107 117 117 119  
โ”‚   119 113 113 115  
โ”‚   115 125 125 127  
โ”‚   127 121 121 123  
โ”‚    97  89  89  93  
โ”‚                   โ•ฏ
โ””โ•ดโ•ดโ•ดโ•ดโ•ดโ•ดโ•ดโ•ด
โ”Œโ•ด? 10:4
โ”œโ•ด[71 67 67 123 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 65 66 67 68 69 125]
โ”œโ•ด[967 965 971 983]
โ”‚ โ•ญโ”€                                                    
โ”œโ•ดโ•ท 103 95 95  89  89  91  97  99  99 109 109 111 111 1โ€ฆ
โ”‚    95 95 89  89  91  97  99  99 109 109 111 111 105 1โ€ฆ
โ”‚    95 89 89  91  97  99  99 109 109 111 111 105 105 1โ€ฆ
โ”‚    89 89 91  97  99  99 109 109 111 111 105 105 107 1โ€ฆ
โ”‚    89 91 97  99  99 109 109 111 111 105 105 107 107 1โ€ฆ
โ”‚    91 97 99  99 109 109 111 111 105 105 107 107 117 1โ€ฆ
โ”‚    97 99 99 109 109 111 111 105 105 107 107 117 117 1โ€ฆ
โ”‚                                                       
โ””โ•ดโ•ดโ•ดโ•ดโ•ดโ•ดโ•ด
[71 67 67 123 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 65 66 67 68 69 125]
[659 655 659 673 693 715 735 743 749 757 755 763 769 777 791 799 805 813 811 819 825 833 847 855 861 869 841 805 767 733]
[967 965 971 983]

This part seems much more approachable, and my instinct was telling me z3 would do the job for me. So that’s what I did.

Solving What I’m Seeing

All in all, I just tried to reproduce the two operations. However I made a big mistake while solving it. Usually in my z3 solvers I write

1
2
3
4
5
if s.check() == unsat:
    print('sad')
else:
    m = s.model()
    ...

However for a simple sanity check, I wrote just print(s.check() == unsat), expecting to see True if the model was satisfiable. This however, was a mistake of mine, since this line means that it will print False if the model is satisfiable. However I hadn’t thought about that. As a result, I would see False being printed, and take that as an indication that the model failed. I had to talk with a member of the Greek CTF community who had solved the challenge, Babafaba (who had also used the same z3 approach), to understand my stupidity. Thanks Babafaba :)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
from z3 import *

flag = [BitVec(f'n{i}', 32) for i in range(36)]

s = Solver()

for c in flag:
    s.add(c >= 0x30)
    s.add(c < 0x7f)

s.add(flag[0] == ord('G'))
s.add(flag[1] == ord('C'))
s.add(flag[2] == ord('C'))
s.add(flag[3] == ord('{'))
s.add(flag[-1] == ord('}'))

flag_rev = flag[::-1]
flag_matrix = []
for i in range(0, 36, 4):
    flag_matrix.append(flag_rev[i:i+4])

# didn't do it with loops because at that moment I was tired as shit and couldn't bother figuring it out
s.add(flag_matrix[0][0] + flag_matrix[1][0] + flag_matrix[2][0] + flag_matrix[3][0] + flag_matrix[4][0] + flag_matrix[5][0] + flag_matrix[6][0] + flag_matrix[7][0] + flag_matrix[8][0] == 844)
s.add(flag_matrix[0][1] + flag_matrix[1][1] + flag_matrix[2][1] + flag_matrix[3][1] + flag_matrix[4][1] + flag_matrix[5][1] + flag_matrix[6][1] + flag_matrix[7][1] + flag_matrix[8][1] == 662)
s.add(flag_matrix[0][2] + flag_matrix[1][2] + flag_matrix[2][2] + flag_matrix[3][2] + flag_matrix[4][2] + flag_matrix[5][2] + flag_matrix[6][2] + flag_matrix[7][2] + flag_matrix[8][2] == 647)
s.add(flag_matrix[0][3] + flag_matrix[1][3] + flag_matrix[2][3] + flag_matrix[3][3] + flag_matrix[4][3] + flag_matrix[5][3] + flag_matrix[6][3] + flag_matrix[7][3] + flag_matrix[8][3] == 745)

target2 = [512, 464, 506, 521, 571, 546, 543, 589, 607, 619, 650, 601, 632, 650, 647, 605, 547, 552, 584, 595, 531, 554, 549, 576, 567, 532, 560, 576, 525, 548]
ctr_flag = 0
for ctr_trgt in range(len(target2)):
    s.add(flag_rev[ctr_flag+0] + flag_rev[ctr_flag+1] + flag_rev[ctr_flag+2] + flag_rev[ctr_flag+3] + flag_rev[ctr_flag+4] + flag_rev[ctr_flag+5] + flag_rev[ctr_flag+6] == target2[ctr_trgt])
    ctr_flag += 1

m = s.model()
for c in flag:
    print(chr(int(str(m[c]))), end='')

GCC{1_L0v3_Th15_sT4ck_ba5eD_M4dN355}

Conclusion

After the CTF ended, the organizers released a repo containing the writeup of the challenge. It tells us that for the first part, we needed to see that the flag is XORed twice. That’s true. However I don’t know how many people used that to solve this challenge. Another player seems to have used the same z3 approach, so I think we outplayed the developer bros. Jokes aside, it was a very interesting challenge, and it could help someone understand some of the core concepts of reversing, like guessing, finding out how to view the memory of a language’s runtime, guessing, how to understand a language and concepts on the go, guessing, and many more (like guessing). I made different mistakes and sacrificed a bit of my sleep, but it was an enjoyable challenge with things to teach, so kudos to the author!