void Dispatch()
{
    ...
    d3dContext->Dispatch(numBlocksY_x, numBlocksY_y, 1);
    ...
    d3dContext->Dispatch(numBlocksCb_x, numBlocksCb_y, 1);
    ...
    d3dContext->Dispatch(numBlocksCr_x, numBlocksCr_y, 1);
}
Listing 2.2. Dispatching of thread groups.
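The arguments passed to Dispatch in Listing 2.2 are thread-group counts, one pair per color plane. The following C++ fragment is a minimal sketch of how such counts could be derived on the CPU. It assumes that one thread group covers an 8 x 8 sample block (matching the numthreads attribute in Listing 2.3) and that chroma planes may be subsampled by an integer factor. The names GroupCount, DispatchDims, and ComputeDispatchDims are hypothetical and do not come from the original source; the actual precalculation is the one described in Section 2.2.2.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original source): number of groups
// needed to cover `samples` items when each group covers `groupSize` items.
inline std::uint32_t GroupCount(std::uint32_t samples, std::uint32_t groupSize)
{
    assert(groupSize > 0);
    return (samples + groupSize - 1) / groupSize; // ceiling division
}

// Thread-group counts for one color plane.
struct DispatchDims
{
    std::uint32_t x;
    std::uint32_t y;
};

// Assumption: subsampleX/subsampleY are 1 for a full-resolution plane
// (e.g. Y) and 2 for a plane stored at half resolution (e.g. Cb, Cr).
inline DispatchDims ComputeDispatchDims(std::uint32_t width,
                                        std::uint32_t height,
                                        std::uint32_t subsampleX,
                                        std::uint32_t subsampleY,
                                        std::uint32_t groupSize = 8)
{
    // Plane size after subsampling, rounded up so edge samples are kept.
    const std::uint32_t planeWidth  = GroupCount(width, subsampleX);
    const std::uint32_t planeHeight = GroupCount(height, subsampleY);

    DispatchDims dims;
    dims.x = GroupCount(planeWidth, groupSize);
    dims.y = GroupCount(planeHeight, groupSize);
    return dims;
}
```

Under these assumptions, the results for the Y, Cb, and Cr planes would play the role of numBlocksY_x/numBlocksY_y, numBlocksCb_x/numBlocksCb_y, and numBlocksCr_x/numBlocksCr_y. Rounding up means that partially covered blocks at the right and bottom image edges still receive a thread group.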
per x- and y-dimension, as the main HLSL function in Listing 2.3 shows. The
group thread count (8 x 8 = 64) is a multiple of 32, which is recommended to
maximize hardware occupancy [Bilodeau 11, Fung 10]. The number of spawned
thread groups equals the precalculated computation dimensions described in
Section 2.2.2. Af-
[numthreads(8, 8, 1)]
void ComputeJPEG(
    uint3 DispatchThreadID : SV_DispatchThreadID,
    uint3 GroupThreadID    : SV_GroupThreadID,
    uint3 GroupID          : SV_GroupID,
    uint  GroupIndex       : SV_GroupIndex)
{
    InitSharedMemory(GroupIndex);

    // RGB -> YCbCr component and level shift.
    ComputeColorTransform(GroupIndex, DispatchThreadID);

    // Apply forward discrete cosine transform.
    ComputeFDCT(GroupIndex, GroupThreadID);

    // Quantize DCT coefficients.
    ComputeQuantization(GroupIndex);

    // Move nonzero quantized values to the beginning of the shared
    // memory array to be able to calculate preceding zeros.
    StreamCompactQuantizedData(GroupIndex);

    // Initiate bitstrings, calculate number of bits to occupy,
    // and identify if thread represents EOB.
    BSResult result = BuildBitStrings(GroupIndex);

    // Do entropy coding to shared memory using atomic operation.
    EntropyCodeAC(GroupIndex, result);

    // Move result from shared memory to device memory.
    CopyToDeviceMemory(GroupIndex, GroupID, result);
}
Listing 2.3. Main compute shader function.
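The comment on StreamCompactQuantizedData in Listing 2.3 states that nonzero quantized values are moved to the front of the array so that the number of preceding zeros can be calculated. The C++ fragment below is a sequential CPU sketch of that idea only, not the shader's implementation. It assumes a block of 64 quantized coefficients already in coding order with the DC coefficient at index 0. The names CompactedCoeff and CompactAC are hypothetical, and the sketch does not split runs longer than 15 zeros as a complete JPEG encoder would have to.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical record (not from the original source) for one nonzero
// AC coefficient after compaction.
struct CompactedCoeff
{
    std::int16_t value;          // nonzero quantized coefficient
    std::uint8_t index;          // original position in the 64-entry block
    std::uint8_t precedingZeros; // zeros between this entry and the previous kept one
};

// Sequential reference: keep the nonzero AC coefficients (indices 1..63)
// in order and record how many zeros precede each of them.
// Assumption: index 0 holds the DC coefficient, which is handled elsewhere.
inline std::vector<CompactedCoeff> CompactAC(const std::array<std::int16_t, 64>& block)
{
    std::vector<CompactedCoeff> compacted;
    int previous = 0; // position of the last kept entry (DC to start with)

    for (int i = 1; i < 64; ++i)
    {
        if (block[i] != 0)
        {
            CompactedCoeff entry;
            entry.value = block[i];
            entry.index = static_cast<std::uint8_t>(i);
            // Once original positions are kept, the zero run is simply
            // the gap between neighboring positions.
            entry.precedingZeros = static_cast<std::uint8_t>(i - previous - 1);
            compacted.push_back(entry);
            previous = i;
        }
    }
    return compacted;
}
```

Keeping each value's original index is what makes the zero-run calculation a simple subtraction between neighbors in the compacted array. A parallel version would typically obtain the destination slots with a prefix sum over the nonzero flags instead of the loop used here.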
 
 