I am confused by how the parallel prefix adder (the one described in this presentation: https://users.encs.concordia.ca/~asim/COEN_6501/Lecture_Notes/Parallel%20prefix%20adders%20presentation.pdf) speeds up carry generation (specifically the Ladner-Fischner adder architecture). I understand how carry generation in a non-parallel CLA (carry-look ahead) adder works—carry bits are computed using just the initial carry bit ($C_0$) and propagators and generators:

$C_{i} = G_{i – 1} + P_{i – 1}G_{i – 2} + P_{i – 1}P_{i – 2}G_{i – 3} + … + P_{3}P_{2}P_{1}P_{0}C_{0}$

But how exactly are carry bits computed in a parallel fashion for a parallel adder? I am having trouble with the grouping. I know that in a typical arithmetic expression like the one below:

$a + b + c + d$

We can group in the parallel fashion:

$(a + b) + (c + d)$

But how are the terms in the $C_i$ formula grouped? Especially since the first step of the adder is to generate all the individual $P_{i}$ and $G_{i}$ units but the terms in the $C_i$ have a lot of $P_{i}$ and $G_{i}$ combined together.