In case I’m not the only one trying to do this, I thought I’d write this up.
Given: a build that produces many SWFs; a source tree with lots of dependencies, not all of which are obvious; and a revision control system from which to get a history of your source. You wish to write a script that correctly predicts which SWFs will change between two revisions of the source, and you want to test that script by actually comparing the binaries to see whether the predictions were correct.
The first thing you’ll note is that doing a binary compare of the SWFs, or a text compare of the swfdumps, does not answer the question. There’s a timestamp, and if debugging is on, there’s a debugger password. But much worse, internal ID numbers and the ordering of classes and chunks of code change semi-randomly. Do two builds of precisely the same source, and unless you write a lot of fancy code, you can’t tell whether the SWFs are really the same.
So, go grab the Flex SDK source and modify it with the following two bits of sed:
[bash]
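# \b is a word boundary (GNU sed), so existing LinkedHashSet/LinkedHashMap names are left alone
sed -i 's/\bHashSet/LinkedHashSet/g' `find . -name '*.java'`
sed -i 's/\bHashMap/LinkedHashMap/g' `find . -name '*.java'`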
[/bash]
In other words, replace the Java collections which have non-deterministic iterators with equivalents that have deterministic iterators.
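If you want to double-check that the substitution caught everything, a quick sketch like the following lists any plain HashSet or HashMap references still left in the tree. The function name is mine, not part of the SDK, and it assumes a grep that supports -w (GNU and BSD grep both do).

```shell
# Hypothetical helper: print any remaining whole-word HashSet/HashMap
# references in the .java files under the directory given as $1.
find_plain_hash_collections() {
    # -w matches whole words only, so LinkedHashSet and LinkedHashMap
    # are not reported. /dev/null makes grep print the file name even
    # when find hands it a single file.
    find "$1" -name '*.java' -exec grep -nwE 'HashSet|HashMap' /dev/null {} +
}
```

Run it as `find_plain_hash_collections .` from the top of the SDK source. No output means nothing was missed.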
Your new compiler now produces nearly deterministic output: the class and code ordering is stable, but you still have to deal with the timestamp and the debug password. One way to do that is to [bash]swfdump -abc[/bash] the binaries and compare the dumps textually, ignoring those few lines that always change. Another approach is to modify SimpleMovie.java to remove the code conditional on configuration.generateDebugTags(), and to set the last argument in the ProductInfo constructor (the timestamp) to 0. Then the binaries should be bitwise identical.
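For the textual comparison, here is a minimal sketch. It assumes you have already written each dump to a text file, and that the lines that always change can be picked out by a regular expression. The default pattern below is my guess based on SWF tag names (ProductInfo, DebugID, EnableDebugger), so check it against what actually differs in your own dumps. The function name and the IGNORE_RE variable are made up for this example.

```shell
# Assumption: lines matching this pattern are the ones that differ between
# two builds of identical source. Override IGNORE_RE to suit your dumps.
: "${IGNORE_RE:=ProductInfo|DebugID|EnableDebugger}"

# Hypothetical helper: succeeds (exit 0) if the two dump files are
# identical once the ignored lines have been filtered out.
dumps_match() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 2
    grep -Ev "$IGNORE_RE" "$1" > "$tmp_a"
    grep -Ev "$IGNORE_RE" "$2" > "$tmp_b"
    cmp -s "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return $status
}
```

Typical use would be something like `swfdump -abc old/Foo.swf > old.txt`, the same for the new build, and then `dumps_match old.txt new.txt && echo unchanged`.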