Game Development Reference
void CopyToDeviceMemory(uint GroupIndex, uint3 GroupID, BSResult result)
{
    uint outIndex = GetOutputIndex(GroupIndex, GroupID);
    if (GroupIndex > 0 && GroupIndex < EntropyBlockSize - 1)
        EntropyOut[outIndex] = ConvertEndian(EntropyResult[GroupIndex - 1]);
    else if (GroupIndex == 0)
        EntropyOut[outIndex] = QuantizedComponents;
    else if (GroupIndex == 63)
        EntropyOut[outIndex - 64 + EntropyBlockSize] = ScanArray;
}
Listing 2.10. Copying group result to device memory.
Copy to device memory. Each thread that has valid output data is responsible
for copying that data to device memory, as shown in Listing 2.10. Endianness
conversion is applied to optimize the final CPU coding step where each byte is
treated separately. Block data copied to the output buffer is ordered in a pattern
that depends on the chroma subsampling mode in use.
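The benefit of the endianness conversion can be sketched on the CPU side. The C++ fragment below is only an illustration under two assumptions that the listing does not confirm: that the entropy coder packs its bitstream into 32-bit words with the first stream byte in the most significant position, and that the CPU is little-endian. SwapBytes32 and AppendLittleEndianBytes are hypothetical names; the first stands in for ConvertEndian, and the second emulates what a little-endian CPU sees when it walks the buffer byte by byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the GPU-side ConvertEndian: reverses the
// byte order of one 32-bit word.
inline std::uint32_t SwapBytes32(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Emulates how a little-endian CPU sees a word buffer when it is read
// one byte at a time: the least significant byte of each word comes first.
inline void AppendLittleEndianBytes(std::vector<std::uint8_t>& out,
                                    const std::uint32_t* words,
                                    std::size_t count)
{
    assert(words != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
        for (int shift = 0; shift <= 24; shift += 8)
            out.push_back(static_cast<std::uint8_t>(words[i] >> shift));
}
```

Under these assumptions, a word packed as 0xAABBCCDD is stored as 0xDDCCBBAA after the swap, so a bytewise walk yields AA, BB, CC, DD, which is the order the stream expects, and the CPU needs no per-word shifting.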
Final CPU coding. After GPU computation, the data is copied to a staging buffer
resource for final CPU processing. Each GPU-generated block is processed by
coding delta DC values and appending the result to the final JPEG bit stream, as
described in Section 2.1.6. No modifications are made to the already entropy-coded AC data.
The CPU process is briefly illustrated in Figure 2.7. The GPU is poorly suited to
this concatenation because of challenges with delta DC calculations and the special
case in which an appended byte equals 0xFF. Every 0xFF byte must be directly
followed by a 0x00 byte; otherwise it would be confused with a JPEG marker
symbol. After all blocks have been appended to the stream, an end of image
(EOI) marker is appended and the JPEG data is complete.
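The three CPU-side operations described above (delta DC, byte stuffing, and the EOI marker) can be sketched in C++ as follows. This is a minimal illustration and not the chapter's implementation: DeltaDc, AppendStuffed, AppendEntropyBytes, and AppendEoi are hypothetical names, and the Huffman coding of the DC difference itself is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the difference between this block's DC value
// and the previous DC value of the same component, then updates the
// predictor. Each result depends on the block before it.
inline int DeltaDc(int dc, int& previousDc)
{
    const int delta = dc - previousDc;
    previousDc = dc;
    return delta;
}

// Hypothetical helper: appends one entropy-coded byte and inserts the
// 0x00 stuffing byte after every 0xFF.
inline void AppendStuffed(std::vector<std::uint8_t>& stream, std::uint8_t byte)
{
    stream.push_back(byte);
    if (byte == 0xFF)
        stream.push_back(0x00);
}

// Appends a run of entropy-coded bytes with stuffing applied.
inline void AppendEntropyBytes(std::vector<std::uint8_t>& stream,
                               const std::uint8_t* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i)
        AppendStuffed(stream, data[i]);
}

// Appends the end-of-image marker (0xFF 0xD9). Markers are written
// as-is, without stuffing.
inline void AppendEoi(std::vector<std::uint8_t>& stream)
{
    stream.push_back(0xFF);
    stream.push_back(0xD9);
}
```

The sketch also hints at why this step is awkward to parallelize: the DC predictor carries state from one block to the next, and every stuffed byte shifts the position of all data that follows it.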
The CPU also creates the corresponding JFIF header, which is used when
decoding the generated baseline JPEG data [Hamilton 92]. The following relevant
encoding data is stored in the JFIF header structure:
chroma subsampling factors,
number of color components,
sample precision (8 bits).
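As an illustration of where these values end up in the file, the C++ sketch below serializes a baseline frame header segment (SOF0, marker 0xFFC0), which is the part of the JPEG headers that carries the sample precision, the number of components, and the per-component sampling factors. ComponentInfo, PutU16, and BuildSof0 are hypothetical names, and the remaining header segments (APP0, quantization and Huffman tables, scan header) are not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one color component.
struct ComponentInfo
{
    std::uint8_t id;          // component identifier (JFIF: 1 = Y, 2 = Cb, 3 = Cr)
    std::uint8_t hSampling;   // horizontal sampling factor, 1..4
    std::uint8_t vSampling;   // vertical sampling factor, 1..4
    std::uint8_t quantTable;  // quantization table selector, 0..3
};

// Appends a 16-bit value in big-endian order, as JPEG headers require.
inline void PutU16(std::vector<std::uint8_t>& out, std::uint16_t v)
{
    out.push_back(static_cast<std::uint8_t>(v >> 8));
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
}

// Builds a baseline DCT frame header (SOF0) with 8-bit sample precision.
inline std::vector<std::uint8_t> BuildSof0(std::uint16_t width,
                                           std::uint16_t height,
                                           const std::vector<ComponentInfo>& components)
{
    assert(!components.empty() && components.size() <= 255);
    std::vector<std::uint8_t> seg;
    seg.push_back(0xFF);
    seg.push_back(0xC0);                                   // SOF0 marker
    PutU16(seg, static_cast<std::uint16_t>(8 + 3 * components.size())); // segment length
    seg.push_back(8);                                      // sample precision in bits
    PutU16(seg, height);
    PutU16(seg, width);
    seg.push_back(static_cast<std::uint8_t>(components.size()));
    for (const ComponentInfo& c : components)
    {
        assert(c.hSampling >= 1 && c.hSampling <= 4);
        assert(c.vSampling >= 1 && c.vSampling <= 4);
        seg.push_back(c.id);
        // Sampling factors share one byte: horizontal in the high nibble.
        seg.push_back(static_cast<std::uint8_t>((c.hSampling << 4) | c.vSampling));
        seg.push_back(c.quantTable);
    }
    return seg;
}
```

For 4:2:0 subsampling, for example, the luma component would be described with sampling factors 2x2 and both chroma components with 1x1.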