Here is the SOS DP implementation:
// dp[mask] starts as f[mask]
for mask from 0 to (1 << n) - 1
dp[mask] := f[mask]
// Process one bit at a time
for i from 0 to n - 1
for mask from 0 to (1 << n) - 1
if (mask >> i) & 1 = 1 then
dp[mask] := dp[mask] + dp[mask xor (1 << i)]
Three loops, and you're done. The outer loop picks a bit position. The inner loop checks every mask. If bit is set in that mask, you add the contribution from the mask with bit turned off. After all rounds, dp[mask] holds the sum of f[submask] over every submask of mask.
Why does iterating masks in increasing order work? When you read dp[mask xor (1 << i)], that mask has bit off. It was not updated in this round (the if skipped it), so it still holds the value from the previous round. This is exactly what the recurrence needs.
Time complexity: .
Space complexity: .