SSTIC is a famous French security conference. Each year the organizer proposes a security challenge for anyone who is interested on security. This year it was a reverse engineering task. The goal of this challenge was to find an email address such as … @ challenge.sstic.org from a USB trace. I would like to present my solution (although it’s not the best one). This post is a bit long and it explains my approach to solve this challenge. In total, I’ve spent almost two weeks to have the challenge done (I was stuck for about one week on a tricky spot).
The solution can be divided into three parts:
Analyze USB trace The USB trace was captured while an Android phone was connected to an air-gapped PC. The first step was to parse the trace and understand what has happened. Actually a ELF binary for ARM 64 bit architecture, has been sent from the Android phone to PC.
Reverse ELF ARM64 The second step was to understand the ELF file ARM64, which read a decryption key from input and generated a file named payload.bib if the conditions were checked
Exploit the remote microcontroller Once obtained the correct decryption key, it turned out that payload.bin was indeed a ZIP archive, which contained a Python script. This script was used to update the firmware of a remote micro controller. This micro controller contained a secret memory area that contained the email address.
The challenge is still online and can be downloaded at this link. The file command indicated the file’s MIME type conforms to its original .xz extension, which is an archive of XZ compression. The xz decompressed without error the archive. The unzipped file contained only Unicode texts.
Look closer at the first 20 lines of this file. There was a small message in French that means the USB trace was captured while an Android phone was connected to an air-gapped PC. A malware may infect the Android phone, can we figure out what has happend?.
I did not know great things about the USB protocol. After some searches on the Internet, it turned out that the trace was generated with the Linux Module usbmon. Thanks to this document, I could understand the trace now. Take the first line of the trace as an example:
I tried first to parse the USB trace with tool [usbmon-parse]* (https://gitorious.org/usbmon-parser/usbmon-parser). But it didn’t help a lot.
The following is the results returned by usemon-parser:
In fact, usbmon-parser analyzed only the USB protocol. However here the trace is actually an exchange happened between a PC and an Android phone, which used the Android Debug Bridge Protocol (ADB) over USB protocol. Therefore, the ADB protocol could not be analyzed by this tool. The following Python script can parse line by line of the USB trace.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
import datetime
class Pkt():
def __init__(self, s):
c = s.split()
self.urb = c[0]
self.ts = c[1]
self.ev = c[2]
self.adw = c[3]
self.bus = self.adw.split(":")[2]
self.endpoint = self.adw.split(":")[3]
self.ust = c[4]
self.ln = c[5]
if len(c) > 6:
self.tag = c[6]
self.data = c[7:]
else:
self.tag = 0
self.data = 0
def show_adw(self):
a = self.adw.split(":")[0]
if a == "Ci":
print "Control input"
if a == "Co":
print "Control output"
if a == "Zi":
print "Isochronous input"
if a == "Zo":
print "Isochronous output"
if a == "Ii":
print "Interrupt input"
if a == "Io":
print "Interrupt output"
if a == "Bi":
print "Bulk input"
if a == "Bo":
print "Bulk output"
def __str__(self):
sheader = 1
if sheader:
print "URB:%s" % self.urb
print "ts:%s" % str(datetime.datetime.\
fromtimestamp(int(self.ts)).\
strftime('%Y-%m-%d %H:%M:%S'))
print "ev:%s" % self.ev
self.show_adw()
print "ust:%s" % self.ust
print "ln:%s" % self.ln
if self.tag == "=":
data = str(self.data)
return str(self.data)
To better understand the traces, the following Python code displayed all traces that had data. Nevertheless, the ADB protocol was still not analyzed.
I’ve found a very good documentation about ADB. 6 standard packets are defined in the ADB protocol.
After have added these definition to the previous python parser, the following commands appeared.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
shell:id
uid=2000(shell) gid=2000(shell) groups=1003(graphics),
1004(input),1007(log),1009(mount),1011(adb),1015(sdcard_rw)
,1028(sdcard_r),3001(net_bt_admin),3002(net_bt),3003(inet)
,3006(net_bw_stats) context=u:r:shell:s0
shell:uname -a
Linux localhost 4.1.0-g4e972ee #1 SMP PREEMPT Mon Feb 24
21:16:40 PST 2015 armv8l GNU/Linux
LIST /sdcard/
.
..
Samsung Android .face Music Podcasts Ringtones Alarms
Notifications Pictures Movies Download DCIM Documents .SPenSDK30 .enref Nearby Playlists .plaDENT .estrongs
backups clockworkmod CyanogenMod mmc1
LIST /sdcard/Documents/
.
..
CSW-2014-Hacking-9.11_uncensored.pdf
NATO_Cosmic_Top_Secret.gpg
LIST /data/local/tmp
.
..
SEND /data/local/tmp/badbios.bin, 33261DATA...
shell:chmod 777 /data/local/tmp/badbios.bin
LIST /data/local/tmp
.
..
badbios.bin
The malware has displayed folders /sdcard/, /sdcard/Documents/and /data/local/tmp. Line 28 to 32 show that a badbiso.bin file was sent to the phone. Obviously, this was the only interesting to be investigated.
In the same blog, there is a detailed explanation about sending a file via ADB (ADB Push). The sender begins by a command Sync, followed by the command SENDnnnn where nnnn indicates the size of filename. Then the file name and the file mode (here, 3320”) are sent. Then one or more commands DATAnnnn carry data where nnnn indicates the size of data sent. Finally, the command DONEnnnn completes the sending process and modifies the file’s timestamps.
With the above information, I could retrieve the exchanged file badbios.bin using the following script.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
badbios = open("badbios.bin", "wb")
start_dump = False
stop_dump = False
target_device = ''
payload = ''
nb_data_pkt = 0
for line in lines:
packet = Pkt(line)
if packet.tag == "=":
data = packet.data
data = "".join([ byte.decode("hex") for byte in data ])
if 'DATA' in data:
start_dump = True
target_device = packet.adw
if start_dump:
if 'DONE' in data:
start_dump = False
stop_dump = True
if start_dump and (packet.adw == target_device):
if 'WRTE' not in data:
i = data.find('DATA')
if i != -1:
nb_data_pkt += 1
if nb_data_pkt > 1:
payload += data[0:i]
payload += data[i+8:]
else:
payload += data
elif stop_dump and (packet.adw == target_device):
if 'WRTE' not in data:
i = data.find('DONE')
payload += data[0:i]
break
badbios.write(payload)
badbios.close()
I used readelf to check the retrieved file and it looked like correct.
The ARM64 instruction set manual was a useful tool too. It’s important to noter that ARM instruction set does not support mov memomry, memomry. Thus, all memory operations must go through the registers. This feature allowed me to define my reversing strategy: trace all changes of interesting registers.
After open this file in IDA, I found that it was difficult to do a static analysis. So I used QEMU to trace its syscall.
There was an error message Wrong key format. Just leave behind it now, because we don’t have any information about it (we will look into it later). We can see that several memory zones were created with mmap. Particularly, the memory from 0x400000 to 0x500000 included new code. QEMU can also trace basic blocs of badbiso.bin. The following QEMU log showed that code execution changed from 0x00102c0 to 0x0400514, that proved new code has been written to this memory zone.
I dumped the memory with GDB by using the following commands:
I added the additional file memelf.400000 in IDA, more functions has appeared. It was actually a virtual machine. I needed to reverse the instructions of this virtual machine.
While debugging Manually in GDB, I found a table that contained all instructions.
Indeed, all instructions were not used. The following table presents some instructions.
Address | Instruction |
---|---|
0x00401794 | condition check |
0x00400d24 | increment |
0x00400ce0 | decrement |
0x00401580 | push w2 |
0x004008c4 | convert ASCII to hexadecimal value (sub) |
0x00401490 | syscall (mmap, write, read, open, etc) |
0x00400b04 | shift right |
0x00400b78 | shift left |
0x00400bd0 | logical and |
0x00400c20 | logical or |
0x00400c90 | exclusive or |
Since all memory changes must go through the registers, a complete log of register changes would reveal the mechanism and functionality of the binary. After reversed the virtual machine’s opcodes, I traced every used opcode wiht GDB.
The following script is used to log all register changes in order to understand the binary file. The logs are stored in file registers.log. This script required a few hours to finish, that is obviously the weakness of this method. But once completed, understanding the binary file became easier and straightforward as this script logged the operations of the virtual machine.
Let’s go back to the first error message Wrong key format. In order to understand what happened, I just analyzed the log obtained previously. In the following example, the letter “A” has been entered (line 1), then it checked if “A”> = “0” (line 5), “A”> = “9” (line 11), “A” <= “A” <= “F” (line 17 and 23). This observation indicates each letter of the key must be among 0123456789ABCDEF. After the format check of one letter, a counter, which had the initial value of 0x10 (line 32), was decrement by one until that its value reached 1. So the key size was 16.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
store w2 = 0x41 ; input "A"
call func 0x401580
store w2 = 0x41
call func 0x4008c4
convert str to hex 0x41 - 0x30 = 0x11 ;if "A" >= "0"
call func 0x401794
x19<f:d> (0x2b48208) = 0x0
call func 0x401580
store w2 = 0x41
call func 0x4008c4
convert str to hex 0x41 - 0x39 = 0x8 ;if "A" >= "9"
call func 0x401794
x19<f:d> (0x106c208) = 0x0
call func 0x401580
store w2 = 0x41
call func 0x4008c4
convert str to hex 0x41 - 0x41 = 0x0 ;if "A" >= "A"
call func 0x401794
x19<f:d> (0x2b48208) = 0x0
call func 0x401580
store w2 = 0x41
call func 0x4008c4
convert str to hex 0x41 - 0x46 = 0xfffffffb ;if "A" <= "F"
call func 0x401794
x19<f:d> (0x2b4a208) = 0x0
call func 0x4008c4
convert str to hex 0x41 - 0x41 = 0x0
load 0x4000801000 -> 0xFC
call func 0x400918
add 0xa + 0x0 = 0xa ;hex("A") = 0xa
...
convert str to hex 0x10 - 0x10 = 0x0
convert str to hex 0x10 - 0xf = 0x1
...
convert str to hex 0x10 - 0x1 = 0xf
Now the key format is known, a new error occurred: Invalid padding. I supposed that the binary badbios.bin used an encryption algorithm (to find). While decrypting, the padding was checked.
By analyzing the logs that lied after the check of key format, a loop of operations on the input key has been identified. In this example, the input key was A161AC7794260DCC. The input key was stored in two parts in little endian format (line 1 and 2). The inner-loop of the encryption algorithm was actually a linear-feedback shift register (LFSR, line 3 to 14) that was reversible. The LFSR implemented a custom check condition (line 3 to 6) based on the current key. Then it shifted to left the current key by one bit. If the previous condition was checked, it set the left-most bit to one, otherwise zero. The inner-loop shifted key value by 8 bits (line 15 to 23). These bits in a reversed order constituted one byte of the decryption key (line 17 to 19) and was xored with one encrypted byte (line 25). Finally the outer-loop repeated 0x2000 times the inner-loop (line 26 to 30).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
store w2 = 0x77ac61a1 //key1 of input key = "A161AC7794260DCC"
store w2 = 0xcc0d2694 //key2 second part of the input key
and 0xb0000000 with 0x77ac61a1 -> 0x30000000 //key1 & 0xb0000000
and 0x1 with 0xcc0d2694 -> 0x0 //key2 & 0x1
xor 0x0 ^ 0x30000000 = 0x30000000 //flat = (keyl & 0xb0000000) ^ (key2 & 0x1)
store w2 = 0x0 //check if need to add 1 at the left most position
store w2 = 0x77ac61a1 //load key1
and 0x1 with 0x77ac61a1 -> 0x1 //key1 & 0x1
shift left 0x1 by 0x1f -> 0x80000000
shift right 0xcc0d2694 by 0x1 -> 0x6606934a //key2 = key2 >> 1
or 0x80000000 with 0x6606934a -> 0xe606934a //if key1 & 0x1, (key2 >> 1) or 0x80000000
shift right 0x77ac61a1 by 0x1 -> 0x3bd630d0 //key1 = key1 >> 1
shift left 0x0 by 0x1f -> 0x0
or 0x0 with 0x3bd630d0 -> 0x3bd630d0 //if is 1, set 1 to left most bit of key1
substract 0x8 by 1 -> 0x7 //decrement inner-loop counter
store w2 = 0xe606934a //key2
and 0x1 with 0xe606934a -> 0x0 //key2 & 0x1
shift left 0x0 by 0x7 -> 0x0 //(key2 & 0x1) << 7
or 0x0 with 0x0 -> 0x0 //change last_byte_key2[0] and last_byte_key2[7]
...
substract 0x6 by 1 -> 0x5 //decrement inner-loop counter
...
substract 0x1 by 1 -> 0x0 //decrement inner-loop counter
store w2 = 0x0 //load encrypted[0]
xor 0x52 ^ 0x0 = 0x52 //xor key_byte ^ encrypted[0]
convert str to hex 0x2000 - 0x0 = 0x2000 //decrement outer-loop counter
...
convert str to hex 0x2000 - 0x1 = 0x1fff //decrement outer-loop counter
...
convert str to hex 0x2000 - 0x2000 = 0x0 //decrement outer-loop counter
The decryption algorithm can be summarized as the pseudo code shown below.
In summary, 0x2000 bytes were generated from the input key. Then an exclusive or was applied to each generated byte and the corresponding encrypted byte, that produced the decrypted payload at the end.
The error Invalid padding logically came after the decryption code. I searched in the end of GDB logs in order to find the padding checking routine. The following log shows one possibility of the padding check.
1
2
3
4
5
6
7
8
9
store w2 = 0x2 //load encrypted[0x1fff]
xor 0x82 ^ 0x2 = 0x80 //xor encrypted[0x1fff], key_byte[0x1fff]
...
convert str to hex 0x2000 - 0x2000 = 0x0 //decrement global counter
...
store w2 = 0x80 //load decrypted[0x1fff]
call func 0x401794 //check if decrypted[0x1fff] != 0x0
convert str to hex 0x80 - 0x80 = 0x0 //if decrypted[0x1fff] - 0x80 != 0, Invalid padding
...
The decryption algorithm checked whether the last decrypted byte was equal to 0x0 or 0x80. Otherwise the error message Invalid padding was displayed. If the last byte was 0x0, it checked the byte before, until that one byte was not equal to 0x0. Then, if this byte was equal to 0x80, a payload.bin file would be created. Otherwise, “Invalid padding” would also be displayed. Thus the padding used was a 0x80 padding that began with 0x80 and followed by some 0x0 if necessary.
Since the encryption algorithm used LFSR, the operations on the input key did not lose any information. It was possible to recover the original key in the condition that the last 8 generated bytes were known. In order to get the generated bytes from the input key, I could apply an exclusive or with the last 8 original and encrypted bytes. The GDB logs contained all the encrypted bytes. So this condition was easy checked. I needed to obtain the last 8 original bytes. The 0x80 padding gave me several possibilities that depended on the position of 0x80. I supposed the last 8 original bytes should be fixed, otherwise there would be multiple possibilities for next step. This meant that the size of padding bytes was at least 8. Keeping that in mind, only two possibilities could exist: 0x8000000000000000 or 0x0000000000000000. By testing both possibilities, the good key was found. The following Python script was used to find the right key.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
$ cat reverse_key.py
import struct
encrypted = ''
#encryption implementation
def keygen(key1, key2, nb_loop):
key1cp = struct.pack('<Q', key1).encode('hex').upper()[0:8]
key2cp = struct.pack('<Q', key2).encode('hex').upper()[0:8]
keycp = key1cp+key2cp
for j in range(nb_loop):
value = 0x0
for i in range(8):
key1_odd = key1 & 0x1
key2_odd = key2 & 0x1
#check if need to add 1 at the left most position
key1_b = key1 & 0xb0000000
key1_id = key1_b ^ key2_odd
key1_id = key1_id ^ (key1_id >> 1)
key1_id = key1_id ^ (key1_id >> 2)
key1_id = key1_id & 0x11111111
key1_id = (key1_id & 0x1) + key1_id * 0x11111111
key1_id = key1_id & 0x10000000
if key1_id == 0x10000000:
key1_id = 1
else:
key1_id = 0
#end check if need to add 1 at the left most position
#shift left by one bit
key1 = key1 >> 1
key2 = key2 >> 1
key1 = key1 | (key1_id << 0x1f)
key2 = key2 | (key1_odd << 0x1f)
value |= ((key2 & 0x1) << (7-i))
print keycp
def reverse_key(msg_hex, encrypted_hex, nb_loop):
values = [ encrypted_hex[i] ^ msg_hex[i] for i in range(len(msg_hex)) ]
key = 0
for idx, value in enumerate(values):
tmp = 0
for i in range(8):
tmp |= ((value >> i) & 1) << (7-i)
key |= (tmp << ((idx)*8))
key = key & 0xffffffffffffffff
for i in range(nb_loop):
ored = key & (1 << 63)
key1_id = key & (0xb << 59)
odd = 0
if ored == (1 << 63):
if key1_id == (3 << 59) or key1_id == 0x0 or key1_id == (9 << 59) or key1_id == (10 << 59):
odd = 1
else:
if key1_id == (1 << 59) or key1_id == (2 << 59) or key1_id == (8 << 59) or key1_id == (11 << 59):
odd = 1
key = key << 1
key = key & 0xffffffffffffffff
key = key | odd
keygen(key>>32, (key&0xffffffff), 0x2000)
msg_hex = [ 0, 0, 0, 0, 0, 0, 0, 0 ]
#dumped last 8 encrypted bytes
encrypted_hex = [ 0x6a, 0xb6, 0x54, 0xc3, 0xca, 0x8f, 0x53, 0x2]
i=0x1fff-7
nb_loop = i*8+1
reverse_key(msg_hex, encrypted_hex, nb_loop)
$ python2 reverse_key.py
0BADB10515DEAD11
$ qemu-aarch64 ./badbios.bin
:: Please enter the decryption key: 0BADB10515DEAD11
:: Trying to decrypt payload...
:: Decrypted payload written to payload.bin.
$ file payload.bin
payload.bin: Zip archive data, at least v2.0 to extract
$ unzip -l payload.bin [0]
Archive: payload.bin
Length Date Time Name
--------- ---------- ----- ----
1247 2014-04-16 15:45 mcu/upload.py
1323 2014-04-17 11:00 mcu/fw.hex
--------- -------
2570 2 files
The payload.bin generated with input key 0BADB10515DEAD11 key was a ZIP archive. Unzip -l showed that this archive contained two files upload.py and fw.hex.
As its name indicated, upload.py read the contents of fw.hex and sent it to a remote microcontroller. The latter executed the received data as its firmware.
The execution of upload.py displayed the firmware version and the number of CPU cycles used. The upload.py script had nothing secret. The secret email address could be hidden in:
fw.hex
micro-controller’s secret memory zone
This file was indeed in format Intel Hex. The following is the example of the first line in fw.hex.
The following Python code parsed fw.hex and retrieved the payload inside.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
$ cat parser.py
import sys
def bit_complement(integer):
out=0
for i in range(8):
bit = (integer >> i) & 0x1
if bit == 1:
bit = 0
else:
bit = 1
out |= bit << i
return out
class Cmd():
def __init__(self):
self.nb_bytes = 0
self.offset1 = 0
self.offset2 = 0
self.payload = []
self.chksum = 0
def readfromline(self, line):
self.nb_bytes = int(line[1:3].decode('utf-8'),16)
self.offset1 = int(line[3:5].decode('utf-8'),16)
self.offset2 = int(line[5:7].decode('utf-8'),16)
self.payload = [int(line[9+i:9+i+2].decode('utf-8'),16) for i in range(0, 2*self.nb_bytes, 2)]
self.chksum = int(line[9+self.nb_bytes*2:9+self.nb_bytes*2+2], 16)
def __str__(self):
#print "nb_bytes:0x%X"% self.nb_bytes
#print "offset:0x%X%X"% (self.offset1, self.offset2)
return str(hex(self.chksum))
def getpayload(self):
return bytes(self.payload)
def genchksum(data):
chksum = sum(data)
chksum %= 256
chksum = bit_complement(chksum) + 1
chksum &= 0xff
return chksum
if __name__ == "__main__":
if len(sys.argv) != 3:
print('Usage: python3 parser.py fireware.hex outputfile')
sys.exit(0)
payload = b''
with open(sys.argv[1], 'rb') as fireware:
lines = fireware.readlines()
cmd = Cmd()
nb_lines = len(lines)
for idx, line in enumerate(lines):
if idx < nb_lines - 1:
cmd.readfromline(line)
if cmd.chksum != genchksum([cmd.nb_bytes,cmd.offset1,cmd.offset2] + cmd.payload):
print('Bad checksum at line %d\n'%(idx+1))
break
payload += cmd.getpayload()
with open(sys.argv[2], 'wb') as output:
output.write(payload)
print('%d of lines read, payload writted to %s'%(idx, sys.argv[2]))
$python parser.py fw.hex fw.payload
$hexdump -C fw.payload
...
00000170 18 30 68 82 f8 03 54 44 a7 de d0 0f 59 65 61 68 |.0h...TD....Yeah|
00000180 52 69 73 63 49 73 47 6f 6f 64 21 00 46 69 72 6d |RiscIsGood!.Firm|
00000190 77 61 72 65 20 76 31 2e 33 33 2e 37 20 73 74 61 |ware v1.33.7 sta|
000001a0 72 74 69 6e 67 2e 0a 00 48 61 6c 74 69 6e 67 2e |rting...Halting.|
...
Firmware version and the word Halting reappeared. It was therefore logical to assume that the remote micro-controller has displayed these messages. But the content of fw.payload remained most unknown. By sending only two bytes to the remote micro-controller, it returned an exception with the register values.
By sending manually different values and combining the returned register values, I analyzed most opcodes of the micro-controller. All opcodes had a size of two bytes: the first byte was instruction, the second is operand. The examples below show some of the opcodes. Note that all opcodes are not known, but they were sufficient to understand the content of fw.hex.
The following Python code was used to read the fw.payload file and generate its assembler instructions.
For the purpose of brevity, the full assembler instructions of fw.payload is in Appendix. Its payload contained codes that apply CR4 algorithm to display messages as we have seen before. There were 3 syscalls, print, exit and a special one, that wrote the number of CPU cycles used on the address stored in r0 (register 0).
The code in fw.hex served only as an example of usage. It did not contain any secret information. I should therefore read the secret memory area from the remote micro-controller.
Intuitively, this approach appeared the easiest way to retrieve the secret memory area. But unfortunately it could display all memory except the secret area. In fact , the following code in kernel prevented all print on the secret memory area and displayed an error message [ERROR] print unallowed memory. This approach was not feasible. The kernel checked for each address if it was outside the secret memory (< 0xf000 or > 0xfc00), as shown in line 16, 19 and 20. If the asked address was in the secret area, print operation would be stopped.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
...
// print, read memory
0xE6: r14 = r0 & r0 //r14=0xfe86
0xE8: mov h13 0xFC
0xEA: mov l13 0x0 //r13=0xfc00
0xEC: mov h12 0xF0
0xEE: mov l12 0x0 //r12=0xf000
0xF0: r8 = r8 ^ r8 //r8=0
0xF2: r9 = r8 & r8 //r9=0
0xF4: mov h10 0x0
0xF6: mov l10 0x1 //r10=0x1
0xF8: r11 = r11 ^ r11 //r11=0
0xFA: r1 = r1 & r1 //r1=0xe
0xFC: jz 0x1A
0xFE: r9 = r14 + r8 //r8=count
0x100: r9 = r9 - r12
0x102: js 0x8 //if r9 < 0xf000, jmp 0x10c
0x104: r9 = r14 + r8
0x106: r9 = r9 - r13
0x108: jns 0x2 //if r9 > 0xfc00, jmp 0x10c
0x10A: js 0xE //if r9 < 0xfc00, jmp 0x11A
0x10C: r9 = r9 ^ r9 //r9=0
0x10E: r9 = [r14+r8] //load char
0x110: [r13:r11] = r9 //store char
0x112: r8 = r8 + r10 //r8+=1
0x114: r1 = r1 - r10 //r-=1
0x116: jnz 0xE2
0x118: ret 0xF
...
Direct access to the secret area, generated an error Memory access violation. I needed to understand how its kernel worked. Since I could dump the micro-controller’s kernel memory area, I just used the previous disassembler on the kernel code. The following code shows the interesting kernel entry point. It first checked whether the syscall number was one of 1, 2, 3 (line 1 to 6). If not, it would either quit or return undefined system call error. Then it calculated the syscall offset (#syscal - 1) * 2 (line 8, 9, 10). And it added 0xF000 to this offset (line 12, 13) that was indeed the pointer to the syscall function. Finally, it called another function to load the value from the syscall function pointer (line 14, 23 to 32) and jump to it (line 15).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
0x0: r0 = r0 & r0
0x2: jz 0x6C //if r0 ==0, jz 0x70($PC+0x6c+0x2), system reset
0x4: mov h1 0x0
0x6: mov l1 0x3 //r1=3
0x8: r2 = r1 - r0 //r2=r1-r2
0xA: js 0x12 //if r1>3, js 0x1E($PC+0x12+0x2), undifined system call
0xC: mov h2 0x0
0xE: mov l2 0x2 //r2=2, (r1=1,2,3)
0x10: r1 = r0 * r2 //r1=r0*2
0x12: r1 = r1 - r2 //r1=(r0-1)*2
0x14: mov h0 0xF0
0x16: mov l0 0x0 //r0=0xF000
0x18: r0 = r0 + r1 //r0=0xF000+r1
0x1A: call 0x94 //call 0xB0($PC+0x94+0x2), r0=[r0]
0x1C: ret 0x0 //ret r0
; undifined system call
0x1E: mov h1 0x0
0x20: mov l1 0x2B //r1=0x2b, size
0x22: mov h0 0xFE
0x24: mov l0 0x5A //r0=0xFE5A, string pointer
...
;
0xB0: mov h1 0x0 //r1=0x1
0xB2: mov l1 0x1
0xB4: mov h2 0x1
0xB6: mov l2 0x0 //r2=0x100
0xB8: r3 = [r0+r1] //load [r0+1]
0xBA: r1 = r1 - r1
0xBC: r4 = [r0+r1] //load [r0]
0xBE: r4 = r4 * r2 //r4=[r0]*0x100
0xC0: r0 = r3 ^ r4 //r0=([r0]<<8)^[r0+1]
0xC2: ret 0xF //ret r15
...
So there was a jump table (size 3) at the address 0xF000. Remember that there was also a syscall that allowed to write the nember of CPU cycles used to an aribitary address. The solution could be overwrite one syscall function pointer and call again that syscall. The following paragraph presents the result returned by the remote micro-controller. The code execution was still in kernel mode and there was an exception at address 0x5868. Because I’ve overwritten the syscall function pointer by the number of CPU cycles used (0x5686) and there was no code at this address.
Now I could read the secret memory area with the following exploit: overwirte one syscall function pointer, write my read memory code at the overwritten address and call again that syscall. The following Python code takes an offset to the address 0xF000 as an augment and allows to read 5 bytes per time.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
$ cat exploit.py
import parser
import socket
import select
import sys
def reada(addr, l):
h00 = (addr & 0xff00) >> 8
l00 = addr & 0x00ff
h01 = (l & 0xff00) >> 8
l01 = l & 0x00ff
return [ 0x20, h00, 0x10, l00, 0x21, h01, 0x11, l01, 0xc8, 0x2 ]
def writea(addr, w=0):
h00 = (addr & 0xff00) >> 8
l00 = addr & 0x00ff
return [ 0x20, h00, 0x10, l00, 0xc8, 0x3 ]
def storeatoffset(offset, w=0):
h00 = offset >> 8
l00 = offset & 0xff
h05 = w >> 8
l05 = w & 0xff
return [ 0x21, h00, 0x11, l00, 0x25, h05, 0x15, l05, 0xf5, 0x1 ]
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage python exploit.py offset_2_0xf000")
sys.exit(0)
payload = b''
with open('fw.hex', 'wt') as output:
cmd = parser.Cmd()
cmd.payload = [0x20, 0x58, 0x10, 0x4a]
shellcode = []
a = int(sys.argv[1])*5
shellcode += [0x20, (0xf000+a)>>8, 0x10, (0xf000+a)&0xff, 0xe8, 0x1]
a += 1
shellcode += [0x20, (0xf000+a)>>8, 0x10, (0xf000+a)&0xff, 0xe9, 0x1]
a += 1
shellcode += [0x20, (0xf000+a)>>8, 0x10, (0xf000+a)&0xff, 0xea, 0x1]
a += 1
shellcode += [0x20, (0xf000+a)>>8, 0x10, (0xf000+a)&0xff, 0xeb, 0x1]
a += 1
shellcode += [0x20, (0xf000+a)>>8, 0x10, (0xf000+a)&0xff, 0xec, 0x1]
for i in range(len(shellcode)):
cmd.payload += storeatoffset(i, shellcode[i])
for i in range(0xf003,0xf005):
cmd.payload+= writea(i, 0x100)
cmd.nb_bytes = len(cmd.payload)
output.write(cmd.gencmd())
output.write(cmd.getend(1))
After tried some offset values, I finally got the email address 66a65dc050ec0c84cf1dd5b3bbb75c8c@challenge.sstic.org!