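uint projectToGrid(vec3 baryCoords, float shadingRate){
  // Interpolate the position of the visibility sample from the triangle's vertices.
  vec3 vpos = baryCoords.x * vpos0 + baryCoords.y * vpos1 + baryCoords.z * vpos2;
  vec2 screenPos = projToScreen(vpos);
  // Round to the nearest shading grid cell, relative to the grid origin.
  ivec2 gridPos = ivec2(screenPos * shadingRate + vec2(0.5f)) - domain.xy;
  // Linearize the 2D cell position; domain.z acts as the row stride.
  return uint(domain.z * gridPos.y + gridPos.x);
}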
Listing 3.3. The decoupling map is a simple projection to a regular grid. The density
of this grid is determined by the shading rate.
The method projectToGrid assigns the fragment to a shading sample, as we
show in Listing 3.3. The local index of the shading sample is the linearized index
of the closest shading grid cell to the visibility sample. Later, when the shading
data is interpolated, some shading samples might fall outside the triangle. These
are snapped to the edges (by clamping the barycentrics to 0 or 1, respectively);
otherwise, some shading values would be extrapolated.
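The two steps described above can be illustrated with a small CPU-side C++ sketch. It is not the authors' shader code: GridDomain is a hypothetical stand-in for the listing's domain variable, and renormalizing the clamped barycentrics so that they sum to 1 is an assumption added here to keep the interpolation weights valid.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the listing's `domain`:
// (x, y) is the grid origin, width is the number of cells per row.
struct GridDomain { int x, y, width; };

// Nearest shading-grid cell for a screen position, linearized row by row.
inline std::uint32_t linearGridIndex(float sx, float sy, float shadingRate,
                                     const GridDomain& d) {
    // int(v + 0.5f) rounds to nearest for non-negative v, as in the listing.
    int gx = static_cast<int>(sx * shadingRate + 0.5f) - d.x;
    int gy = static_cast<int>(sy * shadingRate + 0.5f) - d.y;
    assert(gx >= 0 && gy >= 0 && gx < d.width);
    return static_cast<std::uint32_t>(d.width * gy + gx);
}

// Clamp each barycentric to [0, 1], then renormalize (assumption) so the
// weights sum to 1 and interpolation never extrapolates.
inline std::array<float, 3> snapToTriangle(std::array<float, 3> b) {
    float sum = 0.0f;
    for (float& c : b) { c = std::clamp(c, 0.0f, 1.0f); sum += c; }
    if (sum > 0.0f) { for (float& c : b) c /= sum; }
    return b;
}
```

With a shading rate of 0.5, the screen position (10, 6) falls into cell (5, 3) before the grid origin is subtracted. A sample whose first barycentric is negative ends up with that weight at zero, that is, on the opposite edge of the triangle.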
The computation of the texture mip levels also needs special attention. Normally,
this is done by the hardware, generating texture gradients of 2 × 2 fragment
blocks. Depending on the shading rate, the shading space gradients can be different.
For example, a shading rate of 0.5 would mean that 2 × 2 fragments might
use the same shading sample, which would be detected (incorrectly) as the most
detailed mip level by the hardware. Therefore we manually compute the mip
level, using the textureGrad function.
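To make the effect of the shading rate on the mip level concrete, here is a CPU-side C++ sketch. It assumes that neighboring shading samples lie 1/shadingRate pixels apart, so per-pixel texture coordinate derivatives are divided by the shading rate before use. The mipLevel function uses the idealized isotropic formula (log2 of the longer gradient, measured in texels); actual hardware may approximate it differently. Vec2, shadingSpaceGradient, and mipLevel are illustrative names, not part of the chapter's code. In a shader, the rescaled gradients would be the ones handed to textureGrad.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// Rescale per-pixel UV derivatives to per-shading-sample derivatives.
// Assumption: adjacent shading samples are 1/shadingRate pixels apart.
inline Vec2 shadingSpaceGradient(Vec2 duvPerPixel, float shadingRate) {
    assert(shadingRate > 0.0f);
    return { duvPerPixel.x / shadingRate, duvPerPixel.y / shadingRate };
}

// Idealized isotropic level of detail: log2 of the longer gradient in texels,
// clamped so that magnification maps to the most detailed level (0).
inline float mipLevel(Vec2 dUVdx, Vec2 dUVdy, float texWidth, float texHeight) {
    float lenX = std::hypot(dUVdx.x * texWidth, dUVdx.y * texHeight);
    float lenY = std::hypot(dUVdy.x * texWidth, dUVdy.y * texHeight);
    float rho = std::max(lenX, lenY);
    return std::max(0.0f, std::log2(rho));
}
```

For a 256 × 256 texture mapped one texel per pixel, the per-pixel gradients select level 0, while the same gradients rescaled for a shading rate of 0.5 select level 1.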
In Listing 3.2 we have also tried to minimize the divergence of fragment shader
threads. The method getCachedAddress returns the location of a shading sample
in the global memory. In case of a cache miss, a new slot is reserved in the CG-
buffer (see below), but the shading data is only written later, if the needStore
boolean was set.
3.4.3 Global Shading Cache
For a moment let us consider the cache as a “black box” and focus on the
implementation of the CG-buffer. If a shading sample is not found in the cache,
we need to append a new entry to the compact linear buffers, as shown in
Figure 3.4. The CG-buffer grows linearly as more samples are being stored. We can
implement this behavior using an atomic counter that references the last shading
data element:
address = int (atomicCounterIncrement(bufferTail));
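The same append pattern can be sketched on the CPU with std::atomic. GLSL's atomicCounterIncrement returns the counter value prior to the increment, and fetch_add behaves the same way, so every caller receives a distinct slot. CompactBuffer and reserveSlot are hypothetical names used only for illustration, and the capacity check is an addition that the one-line shader statement does not show.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU stand-in for the CG-buffer: linear storage plus a tail counter.
struct CompactBuffer {
    std::vector<float> data;
    std::atomic<std::uint32_t> tail{0};

    explicit CompactBuffer(std::size_t capacity) : data(capacity) {}

    // Reserve the next free slot. fetch_add returns the value before the
    // increment, so concurrent callers never receive the same address.
    std::uint32_t reserveSlot() {
        std::uint32_t slot = tail.fetch_add(1, std::memory_order_relaxed);
        assert(slot < data.size() && "buffer capacity exceeded");
        return slot;
    }
};
```

Reserving the slot and writing the data are separate steps, which matches the deferred store described earlier: the address is handed out immediately, and the shading data can be written later.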
The streaming nature of the GPU suggests that even a simple first-in, first-out
(FIFO) cache could be quite efficient as only the recently touched shading samples
 