I am confused by how the parallel prefix adder (the one described in this presentation: https://users.encs.concordia.ca/~asim/COEN_6501/Lecture_Notes/Parallel%20prefix%20adders%20presentation.pdf) speeds up carry generation (specifically the Ladner-Fischner adder architecture). I understand how carry generation in a non-parallel CLA (carry-look ahead) adder works—carry bits are computed using just the initial carry bit ($C_0$) and propagators and generators:
$C_{i} = G_{i – 1} + P_{i – 1}G_{i – 2} + P_{i – 1}P_{i – 2}G_{i – 3} + … + P_{3}P_{2}P_{1}P_{0}C_{0}$
But how exactly are carry bits computed in a parallel fashion for a parallel adder? I am having trouble with the grouping. I know that in a typical arithmetic expression like the one below:
$a + b + c + d$
We can group in the parallel fashion:
$(a + b) + (c + d)$
But how are the terms in the $C_i$ formula grouped? Especially since the first step of the adder is to generate all the individual $P_{i}$ and $G_{i}$ units but the terms in the $C_i$ have a lot of $P_{i}$ and $G_{i}$ combined together.