« Nevermind This Post | Main | Good-bye Mr. Gobbles »

Decompiling AS3 and Random Compiler Thoughts

When the Flex compiler was open sourced, I wanted to get my hands dirty and make a change or two myself. In the first week I dug into the source and looked to make a change to decompile (or rather disassemble) AS3. Of course, after I'd dug into things I realized that Adobe has already done this. And someone created an AIR app. And a commercial decompiler. This is what I get when I don't keep up with my blog reading.

While I didn't end up making any changes, I'm going to give a partial explanation of this code.

The entry-point for the AS3 disassembler is flash.swf.tools.SwfxPrinter. If you take a look at this class, you'll see a method for each tag that needs to be handled, with each method taking in a Tag class. These correspond to the different tags in the SWF format. The one that is of interest for AS3 decompiling is DoABC. (To be pedantic, we actually use the DoABC2 version of this tag, which is a bit confusing, but the original DoABC tag was never released.) This is the tag which contains ABC, the compiled AS3 that the Flash Player runs. Going back to SwfxPrinter, you'll see the DoABC method takes this bytecode and through the magic of AbcPrinter prints out all the operations.

I assumed that the SwfxPrinter would work by parsing Nodes and not directly work from the ABC. As explained above, the ABC is what's actually in the SWF, and it's contained within a DoABC tag. The Nodes are an abstraction of the bytecode that the AS3 compiler uses. It's what is generated within the compiler when you first compile an application, but there's also a way to get Nodes from a finished compilation. You can see this in action within the SwfxPrinter.doABC() tag. Within the showActions code, you can see how the ABC is turned into the top-level Node, a ProgramNode, which then can be printed out via the PrettyPrinter. I tried enabling this code myself, and it doesn't work very well. There's a lot of missing pieces to the output, so apparently everyone should keep using the existing path of parsing the ABC bytecode for printing.

That's a little more insight into the class used for AS3 disassembling, but it doesn't even touch the real work in decoding the ABC format, TagHandlers, etc. There's always plenty more to learn in the compiler, and hopefully I'll be able to blog more in the future about the parts that I know.

We were talking about the compiler and ideas for improvements at the last Boston Flash Design Patterns Group, where we stopped talking about design patterns for a week and instead talked about the Flex compiler and the need for more comments. People wanted to know the easiest places to make some changes, and I couldn't come up with much. There's some jumping-off points in flex2.tools.Compiler or flex2.tools.Compc. Changing a Transcoder could be interesting, or adding one to the list in flex2.tools.API.getTranscoders(). If you'd like to try something for fun, you could always alter Grammar.jj to change the core of MXML. Daniel Rinehart is looking at making a change to the compiler, and it's going to be interesting to see all the changes coming in from around the community.

Update: See Random Compiler Thoughts, Part II for more info on the compiler.

Comments (4)

laan:

hehe, greate! About Nemo404,I can find the download url of the air appliction source file.So, do you have?

I don't have the source. I'd ask the creator of Nemo 440 about this.

I've written an open-source tool which can disassemble and assemble back AS3 code, even for obfuscated applications. You can find it here: http://github.com/CyberShadow/RABCDAsm

Thanks for the comment on this, CyberShadow, looks like a great project.