Figure 4.5. An edge with extremely high dynamic range contrast. Places like this
naturally draw the attention of the human eye, thus the reconstruction filter needs to
deliver best results. The edge-directed filter prevents any chrominance artifacts even
in these challenging cases. This particular example combines our compact format with
8× MSAA, HDR render targets, and tone mapping.
4.4.1 Optimizations
A GLSL or HLSL implementation of our method has to perform five fetches,
the actual pixel under consideration and four of its neighbors, in order to feed
the edge-directed filter of Listing 4.2. Most of these fetches will come from the
texture cache, which is very efficient on most GPUs, so the overhead should
be rather small on most architectures. It is worth noting that a GPGPU implementation
can completely avoid the redundant fetches by leveraging the local
shared memory of the ALUs. Furthermore, newer architectures, like Nvidia's
Kepler, provide intra-warp data exchange instructions, such as SHFL, that can
be used to exchange data between threads in the same warp, without touching
the shared memory. Nevertheless, since GPGPU capabilities are not available on
all platforms, it is worth investigating how the number of fetches can
be reduced in a traditional shading-language implementation. We focus here on
the reduction of memory fetches, instead of ALU instructions, since for years the
computational power of graphics hardware has grown at a faster rate than the
available memory bandwidth [Owens 05], and this trend will likely continue in
the future.
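
To give a rough sense of how much redundancy is at stake, the following C++ sketch counts memory fetches for the two strategies discussed above. It is only a back-of-the-envelope model under stated assumptions, not part of the method itself: the function names naiveFetches and tiledFetches are hypothetical, the tile size is a free parameter, image-border clamping is ignored, and every fetch is counted equally even though in practice most of the naive fetches hit the texture cache.

```cpp
#include <cassert>
#include <cstddef>

// Fetches issued when every pixel independently reads itself and its four
// axis-aligned neighbors (the five-fetch pattern of a shader implementation).
inline std::size_t naiveFetches(std::size_t width, std::size_t height) {
    return 5 * width * height;
}

// Fetches issued when each tile x tile thread group first copies its own
// pixels plus a one-pixel border into on-chip shared memory, and the filter
// then reads only from that copy. The four corner pixels of the border are
// never touched by a cross-shaped footprint, so they are not loaded.
// Assumption: partial tiles at the image edge are counted as full tiles.
inline std::size_t tiledFetches(std::size_t width, std::size_t height,
                                std::size_t tile) {
    assert(tile > 0);
    const std::size_t tilesX = (width + tile - 1) / tile;   // round up
    const std::size_t tilesY = (height + tile - 1) / tile;  // round up
    const std::size_t perTile = (tile + 2) * (tile + 2) - 4;
    return tilesX * tilesY * perTile;
}
```

For a 1280 × 720 target and 16 × 16 tiles, naiveFetches(1280, 720) gives 4,608,000 fetches, while tiledFetches(1280, 720, 16) gives 1,152,000, that is, 1.25 fetches per pixel instead of 5. Larger tiles push the figure closer to one fetch per pixel, at the cost of more shared memory per thread group.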
 
 