Game Development Reference
In-Depth Information
4000000
3500000
3000000
2500000
Memory (Uniform Grid)
Memory (Sliced Grid)
2000000
Memory (Octree)
1500000
1000000
500000
0
1 3 5 7 9 1113151719212325272931333537394143454749515355575961
Time step
Figure 7.8. Memory consumptions when using uniform grid, sliced grid, and octree (see
Color Plate V).
7.4.3
Introducing Sort
Introduction of the sliced grid not only improves the memory efficiency but also
improves the performance thanks to the dense voxel data,which will be shown
later. This section discusses how to improve the performance from the perspec-
tive of cache efficiency. A simple implementation of a particle-based simula-
tion is accompanied by random access of the data. The random-access pattern
of the memory reduces the performance. If the particle data are also arranged in
the order of the spatial distribution of particles, the cache hit-rate of accessing
the particle data increases as well. However, because particles not having any
fixed connectivity move freely, the memory location of spatially close particles
becomes random as the simulation proceeds. This reduces the memory locality
and results in the slowdown of a simulation. An idea to improve the simulation
performance is to sort the particle data by the spatial order of particles. We have
to be careful in the selection of the sort algorithm used, especially for real-time
applications, because the speedup from the ordering of the particle data has to be
greater than the cost of the sorting. Otherwise, it just slows the simulation down.
Researchers have been studying sorting on the GPU. However, the best algo-
rithm for sequential processors is not always the best for parallel processors. For
example, quick sort, which is one of the most efficient sorts on the CPU, does
 
Search Nedrilad ::




Custom Search