v[3]=projToView(8*GET_GROUP_IDX , 8*(GET_GROUP_IDY+1),1.f);
float4 o = make_float4(0.f,0.f,0.f,0.f);
for ( int i=0; i<4; i++)
frustum[i] = createEquation( o, v[i], v[(i+1)&3] );
}
projToView() is a function that takes screen-space pixel indices and a depth value
and returns the corresponding coordinates in view space. createEquation() creates
a plane equation from three vertex positions.
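As a rough CPU-side illustration of what a function like createEquation() might compute, the sketch below builds a plane from three points by taking the cross product of two edge vectors as the normal. The names Vec3, Plane, createEquationSketch, and signedDistance are hypothetical and are not taken from the original shader; the sign convention (dot(n, p) + d = 0) and the normalization are assumptions, and the three points are assumed not to be collinear.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Plane stored as a normal n and an offset d; points p on the plane
// satisfy dot(n, p) + d == 0 (an assumed convention).
struct Plane { Vec3 n; float d; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical stand-in for createEquation(): the plane through a, b, c.
// Assumes the points are not collinear, so the normal has non-zero length.
Plane createEquationSketch(Vec3 a, Vec3 b, Vec3 c) {
    Vec3 n = cross(sub(b, a), sub(c, a));
    float len = std::sqrt(dot(n, n));
    n = {n.x / len, n.y / len, n.z / len};
    return {n, -dot(n, a)};
}

// Signed distance from point v to the plane; the sign tells which side v is on.
float signedDistance(const Plane& p, Vec3 v) { return dot(p.n, v) + p.d; }
```

With the view-space origin as the first point, as in the loop above, each of the four side planes passes through the eye, so its offset d is zero and only the normal matters.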
The frustum at this point has infinite length in the depth direction; however,
we can clip it by using the maximum and minimum depth values of the
pixels in the tile. To obtain the depth extent, each thread first reads the depth
value of its assigned pixel from the depth buffer, which is created in the depth
prepass, and converts it to a coordinate in view space. To select the maximum and
minimum values among the threads in a group, we used atomic operations on shared
memory. This feature is available only when a thread group is launched for the
computation of each tile.
float depth = depthIn.Load(
uint3(GET_GLOBAL_IDX ,GET_GLOBAL_IDY ,0) );
float4 viewPos = projToView(GET_GLOBAL_IDX , GET_GLOBAL_IDY ,
depth);
int lIdx = GET_LOCAL_IDX + GET_LOCAL_IDY*8;
{ // calculate bound
if ( lIdx == 0 ) // initialize
{
ldsZMax = 0;
ldsZMin = 0xffffffff;
}
GroupMemoryBarrierWithGroupSync();
u32 z = asuint( viewPos.z );
if ( depth != 1.f )
{
AtomMax( ldsZMax, z );
AtomMin( ldsZMin, z );
}
GroupMemoryBarrierWithGroupSync();
maxZ = asfloat( ldsZMax );
minZ = asfloat( ldsZMin );
}
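The listing above runs integer atomics on the bit pattern of a float. This works because, for non-negative finite IEEE 754 floats, comparing the bits as unsigned integers gives the same order as comparing the float values. The following CPU-side sketch emulates the reduction under that assumption. The names floatBits, bitsToFloat, atomicMax, atomicMin, DepthBounds, and tileDepthBounds are hypothetical, and a sequential loop stands in for the threads of a group. For negative values the unsigned order is reversed, so this sketch would not apply to them as written.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// CPU analogues of asuint() / asfloat(): reinterpret the bits without conversion.
static std::uint32_t floatBits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

static float bitsToFloat(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Atomic max/min built from compare-exchange loops.
static void atomicMax(std::atomic<std::uint32_t>& target, std::uint32_t v) {
    std::uint32_t cur = target.load();
    while (cur < v && !target.compare_exchange_weak(cur, v)) {}
}

static void atomicMin(std::atomic<std::uint32_t>& target, std::uint32_t v) {
    std::uint32_t cur = target.load();
    while (cur > v && !target.compare_exchange_weak(cur, v)) {}
}

struct DepthBounds { float minZ, maxZ; };

// Assumes every viewZ[i] is non-negative and finite, and count > 0.
DepthBounds tileDepthBounds(const float* viewZ, int count) {
    std::atomic<std::uint32_t> zMax{0u};           // same initial values as the listing
    std::atomic<std::uint32_t> zMin{0xffffffffu};
    for (int i = 0; i < count; ++i) {              // each iteration plays one thread
        std::uint32_t z = floatBits(viewZ[i]);
        atomicMax(zMax, z);
        atomicMin(zMin, z);
    }
    return {bitsToFloat(zMin.load()), bitsToFloat(zMax.load())};
}
```

One consequence of the initial values: if no element updates the bounds (in the shader, when every pixel in the tile has depth 1.f), the minimum stays at 0xffffffff, which is not a valid depth when read back as a float.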
ldsZMax and ldsZMin store maximum and minimum z coordinates, which are
bounds of a frustum in the z direction, in shared memory. Once a frustum is
constructed, we are ready to go through all the lights in the scene. Because there
are several threads executed per tile, we can cull several lights at the same time.
We used 8 × 8 for the size of a thread group; thus, 64 lights are processed in
parallel. The code for the test is as follows:
 