Game Development Reference
Figure 3.2. Parallel computation of rigid-body simulation.
If stages are divided into independent tasks properly, we can run the tasks
on multiple SPUs. The PPU performs preprocessing to assign data to each
task and postprocessing to summarize the results, and tasks are executed on SPUs
in parallel. It is important that each SPU take responsibility for only its given
work, and not care about work processed by other SPUs. By doing this, it is not
necessary to implement complicated mechanisms to synchronize between tasks.
In general, such complicated synchronization mechanisms can easily cause bugs
or performance problems.
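The division of labor described above can be sketched in portable C++. This is only an illustration under stated assumptions: std::thread stands in for SPU tasks, and the names Body, integrateSlice, and stepParallel are hypothetical, not part of any Cell SDK. The coordinating function plays the role of the PPU: it assigns each task a disjoint slice of the data, waits for all tasks, and then summarizes the per-task results. Because no task touches another task's slice or result slot, no locks are needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical rigid-body state, reduced to what the sketch needs.
struct Body {
    float position;
    float velocity;
};

// Worker task (stand-in for an SPU task). It reads and writes only the
// bodies in [begin, end) and its own result slot, so it needs no
// synchronization with the other tasks.
void integrateSlice(std::vector<Body>& bodies, std::size_t begin,
                    std::size_t end, float dt, float& sliceEnergy)
{
    float energy = 0.0f;
    for (std::size_t i = begin; i < end; ++i) {
        bodies[i].position += bodies[i].velocity * dt;
        energy += 0.5f * bodies[i].velocity * bodies[i].velocity;
    }
    sliceEnergy = energy;
}

// Coordinator (stand-in for the PPU): preprocess, launch, join, postprocess.
float stepParallel(std::vector<Body>& bodies, float dt, std::size_t numWorkers)
{
    assert(numWorkers > 0);

    // Preprocessing: assign a contiguous, non-overlapping slice to each task.
    const std::size_t chunk = (bodies.size() + numWorkers - 1) / numWorkers;
    std::vector<float> partial(numWorkers, 0.0f);
    std::vector<std::thread> workers;
    workers.reserve(numWorkers);

    for (std::size_t w = 0; w < numWorkers; ++w) {
        const std::size_t begin = std::min(w * chunk, bodies.size());
        const std::size_t end = std::min(begin + chunk, bodies.size());
        workers.emplace_back(integrateSlice, std::ref(bodies), begin, end, dt,
                             std::ref(partial[w]));
    }

    // The only synchronization point: wait for every task to finish.
    for (std::thread& t : workers) {
        t.join();
    }

    // Postprocessing: summarize the per-task results.
    float total = 0.0f;
    for (float e : partial) {
        total += e;
    }
    return total;
}
```

Calling stepParallel(bodies, dt, 4) advances every body by one step and returns the summed kinetic term. On real SPU hardware the slices would be transferred by DMA rather than shared by reference, which this sketch does not model.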
It would be optimal if we could move all stages into parallelized tasks. But
usually, there is a task that can't be parallelized, or at least gains nothing
from being parallelized. In such a case, it is still useful to move as much PPU
work as possible to an SPU, because the PPU is otherwise easily overloaded by
the work that can't be processed by an SPU.
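A minimal sketch of this idea, again in portable C++ rather than Cell code: a stage whose steps depend on one another cannot be split up, but it can still run on a single worker while the main processor handles the work that only it can do. Here std::async stands in for launching one SPU task, and solveSerialStage, runMainProcessorOnlyWork, and runFrame are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <vector>

// Hypothetical serial stage: each step depends on the previous result,
// so the loop cannot be divided among several workers.
float solveSerialStage(const std::vector<float>& impulses)
{
    float accumulated = 0.0f;
    for (float j : impulses) {
        accumulated = accumulated * 0.5f + j;
    }
    return accumulated;
}

// Hypothetical work that must stay on the main processor.
int runMainProcessorOnlyWork(int frame)
{
    return frame + 1;
}

struct FrameResult {
    float solved;
    int nextFrame;
};

FrameResult runFrame(const std::vector<float>& impulses, int frame)
{
    // Offload the serial stage to a single worker (stand-in for one SPU).
    std::future<float> solved = std::async(
        std::launch::async, solveSerialStage, std::cref(impulses));
    assert(solved.valid());

    // Meanwhile, the main processor does the work only it can do.
    const int next = runMainProcessorOnlyWork(frame);

    // Collect the offloaded result once it is needed.
    return FrameResult{solved.get(), next};
}
```

The serial stage itself runs no faster, but it no longer occupies the main processor for its whole duration.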
Figure 3.3. Suitable structure for DMA.