Notes from Simple and Efficient Construction of Static Single Assignment Form
The whitepaper describes an algorithm that constructs an SSA form. It is simple and efficient :D
In case you have the entire CFG at once, the paper has a trivial algorithm:
First step is to visit all BB in RPO and do local value numbering. This does most of the work. What remains are harder cases where you create a variable in one BB and then use it in another. While doing the LVN, when you see a use of a variable that wasn’t defined in this BB, you start a recursive lookup.
The recursive lookup must start with placing an empty phi function as the definition of the sought-after variable at the start of the current BB. This prevents infinite recursion when dealing with loops. Next, do a lookup in all immediate predecessors of the current BB and merge their results in the already created phi function.
This is it! There is a small detail omitted that whenever you finish creating a new phi, you run a simple trivial-phi-detector-and-remover. This checks for trivial cases such as a phi with a single operand and removes them recursively.
There is a second tier to this algorithm suitable in cases where BBs can slowly appear in some order and you want to calculate SSA continuously. This is useful when you have a parser that is emitting basic blocks and you want to construct SSA on the fly.
Only thing you need to know is when a BB gets ‘sealed’ – that means that no further immediate predecessors will be added to it ever.
The modification to the previous algorithm is simple: before the recursive lookup tries recursing into predecessors, it checks if the current BB is sealed. If it is not sealed, it cannot continue since another BB could appear later and invalidate whatever calculation was going to happen. The solution is to defer the calculation. We place a dummy phi function with no operands as a sentinel for future lookups going through this path and place a note that we have to come back here and finish the lookup once this BB is sealed.
The note is implemented as a dictionary from basic blocks to these dummy phis. When you get a signal from the parser that a BB is sealed, you iterate over unfinished lookups in this dictionary and finish them.