for performing stream compaction in parallel, which is used in the counting
process [Harris et al. 07]. First the DC coefficient and all nonzero AC coefficients are
flagged and copied to a separate array in shared memory. The collected values
are accumulated by computing an exclusive scan. Each nonzero AC coefficient
is, together with its current index position, copied to a new array where the
destination index position is equal to the corresponding scan result value; see the
HLSL implementation in Listing 2.7.
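The flag, scan, and scatter steps described above can be modeled on the CPU. The following is a minimal sequential C++ sketch, not the HLSL code of Listing 2.7: the names `Compacted` and `CompactCoefficients` are hypothetical, and keeping the DC coefficient in slot 0 of the output is an assumption based on the description. On the GPU, each loop iteration would correspond to one thread and the scan would be a parallel prefix sum in shared memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record: a surviving coefficient and its original index.
struct Compacted {
    int value;
    std::uint32_t index;
};

// Sequential model of flag -> exclusive scan -> scatter.
std::vector<Compacted> CompactCoefficients(const std::vector<int>& coeffs)
{
    const std::size_t n = coeffs.size();

    // 1. Flag the DC coefficient (index 0) and every nonzero AC coefficient.
    std::vector<std::uint32_t> flags(n);
    for (std::size_t i = 0; i < n; ++i)
        flags[i] = (i == 0 || coeffs[i] != 0) ? 1u : 0u;

    // 2. Exclusive scan: scan[i] is the number of flagged elements before i.
    std::vector<std::uint32_t> scan(n);
    std::uint32_t running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        scan[i] = running;
        running += flags[i];
    }

    // 3. Scatter: each flagged element is written to the slot given by its
    //    scan result, together with its original index position.
    std::vector<Compacted> out(running);
    for (std::size_t i = 0; i < n; ++i)
        if (flags[i])
            out[scan[i]] = { coeffs[i], static_cast<std::uint32_t>(i) };

    return out;
}
```

For the input `{5, 0, 0, 3, 0, -2, 0, 0}` the flags are `1,0,0,1,0,1,0,0` and the exclusive scan is `0,1,1,1,2,2,3,3`, so the three surviving values land in slots 0, 1, and 2 with original indices 0, 3, and 5. The difference between two neighboring stored indices, minus one, is the length of the zero run between them.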
Build bit strings. After stream compaction, entropy bit strings are constructed.
Here each thread group is responsible for generating a bit stream of encoded AC
coefficients that complies with the JPEG standard. Each thread identifies which, if
any, bit strings to construct and append to the thread-group bit stream. Threads
also keep track of the total bit count of all constructed bit strings. The bit count
of a value is obtained with the Shader Model 5.0 intrinsic function firstbithigh. The
total bit count is used to calculate the output positions at which the bit strings
are concatenated. The bit-string construction HLSL code is in Listing 2.8.
typedef int BitString; // <-- numbits stored in high 16 bits

struct BSResult
{
    int NumEntropyBits;
    BitString BS[6];
};

BSResult BuildBitStrings(uint GroupIndex)
{
    BSResult result = (BSResult)0;
    static const uint mask[] = {1,2,4,8,16,32,64,128,256,
    // special marker symbols
    BitString M_16Z = AC_Huffman[0xF0];
    BitString M_EOB = AC_Huffman[0x00];
    if (GroupIndex == 0 && RemappedValues[0] == 0)
        result.BS[5] = M_EOB;
    else if (RemappedValues[GroupIndex] != 0)
    {
        uint PrecedingZeros = (PrevIndex[GroupIndex] -
                               PrevIndex[GroupIndex - 1] - 1);
        // Append 16-zeros markers.
        for (int i = 0; i < PrecedingZeros / 16; i++)
            result.BS[i] = M_16Z;
        int tmp = RemappedValues[GroupIndex];
        // Get number of bits to represent number.
        uint nbits = firstbithigh(abs(tmp)) + 1;
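The bit-count step and the role of the total bit count can be modeled on the CPU. The C++ sketch below is an illustration under stated assumptions, not a continuation of the listing: `FirstBitHigh` is a stand-in for the HLSL intrinsic, and `NumBits`, `AmplitudeBits`, `PackBitString`, `BitStringLength`, `BitStringBits`, and `BitOffsets` are hypothetical helper names. Placing the bit pattern in the low 16 bits is assumed from the comment on the `BitString` typedef, and the amplitude encoding follows the usual JPEG convention in which a negative value is stored as the low bits of the value minus one.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

using BitString = std::uint32_t; // assumed layout: length in high 16 bits

// Stand-in for HLSL firstbithigh on a nonzero unsigned value:
// zero-based position of the most significant set bit.
inline std::uint32_t FirstBitHigh(std::uint32_t v)
{
    std::uint32_t pos = 0;
    while (v >>= 1)
        ++pos;
    return pos;
}

// Number of bits needed to represent a nonzero coefficient.
inline std::uint32_t NumBits(int value)
{
    return FirstBitHigh(static_cast<std::uint32_t>(std::abs(value))) + 1;
}

// JPEG amplitude bits: positive values are stored as-is, negative values
// as (value - 1), both truncated to nbits.
inline std::uint32_t AmplitudeBits(int value, std::uint32_t nbits)
{
    const std::uint32_t mask = (1u << nbits) - 1u;
    const int v = (value < 0) ? value - 1 : value;
    return static_cast<std::uint32_t>(v) & mask;
}

// Pack a bit pattern and its length into one word, and unpack again.
inline BitString PackBitString(std::uint32_t bits, std::uint32_t numBits)
{
    return (numBits << 16) | (bits & 0xFFFFu);
}
inline std::uint32_t BitStringLength(BitString bs) { return bs >> 16; }
inline std::uint32_t BitStringBits(BitString bs)   { return bs & 0xFFFFu; }

// Exclusive scan over per-thread bit counts: offsets[i] is the bit position
// at which thread i would start writing into the group bit stream.
inline std::vector<std::uint32_t> BitOffsets(
    const std::vector<std::uint32_t>& bitCounts)
{
    std::vector<std::uint32_t> offsets(bitCounts.size());
    std::uint32_t running = 0;
    for (std::size_t i = 0; i < bitCounts.size(); ++i) {
        offsets[i] = running;
        running += bitCounts[i];
    }
    return offsets;
}
```

As an example, the coefficient -5 needs 3 bits, and its amplitude bits are `010`. Per-thread bit counts of 5, 0, 7, and 3 give write offsets of 0, 5, 5, and 12, so a thread that emits nothing does not shift the threads after it.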