Chroma subsampling mode    Thread group output order
4:4:4                      Y0, Cb0, Cr0, Y1, Cb1, Cr1, ... N
4:2:2                      Y0, Y1, Cb0, Cr0, Y2, Y3, ... N
4:2:0                      Y0, Y1, Y2, Y3, Cb0, Cr0, ... N

Copy GPU result to CPU
CPU: JFIF header, final JPEG data, EOI

Figure 2.7. GPU to CPU process showing thread group output order based on chroma subsampling mode.
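The ordering in Figure 2.7 can be reproduced with a few lines of code. The sketch below is illustrative only: it assumes that each repeating group in the figure corresponds to one minimum coded unit (MCU) of interleaved baseline JPEG, and the names `BLOCKS_PER_MCU` and `output_order` are hypothetical, not taken from the presented encoder.

```python
# Hypothetical table: number of 8x8 blocks per MCU as (Y, Cb, Cr).
# Assumption: one repeating group in Figure 2.7 equals one interleaved MCU.
BLOCKS_PER_MCU = {
    "4:4:4": (1, 1, 1),
    "4:2:2": (2, 1, 1),
    "4:2:0": (4, 1, 1),
}


def output_order(mode, mcu_count):
    """Return block labels in the order they appear in the output stream."""
    y_blocks, cb_blocks, cr_blocks = BLOCKS_PER_MCU[mode]
    order = []
    for mcu in range(mcu_count):
        # Luma blocks of this MCU come first, then one Cb and one Cr block.
        order += [f"Y{mcu * y_blocks + i}" for i in range(y_blocks)]
        order += [f"Cb{mcu * cb_blocks + i}" for i in range(cb_blocks)]
        order += [f"Cr{mcu * cr_blocks + i}" for i in range(cr_blocks)]
    return order


if __name__ == "__main__":
    for mode in BLOCKS_PER_MCU:
        print(mode, output_order(mode, 2))
```

With `mcu_count=2` the printed sequences begin with the same entries as the three rows of the figure.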
2.3 Performance
Performance was measured by compressing back-buffer data into JPEG
data. Two source images, of dimensions 2,268 × 1,512 pixels, were encoded with
different settings using the presented technique and libjpeg-turbo version 1.2.0 [1].
The back buffer has the relevant flags to be treated as a shader resource. The
source Texture2D resource is created by loading RGB data from a file. This
texture is mapped to a full screen quad, which is rendered to the back buffer
in each test run. For libjpeg-turbo tests, back-buffer data is copied to a CPU-
accessible staging resource and thereafter encoded. The DirectCompute tests are
done by sampling back-buffer data directly. Resulting JPEG data size was always
smaller when encoding using the DirectCompute encoder. Encoded images, with
chroma subsampling disabled and JPEG quality 100, were decoded and thereafter
compared to source-image data. The comparison showed that the decoded version
was almost identical to the source image; details are presented in the following
sections for each benchmark scenario.
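The text does not say which metric was used to compare decoded and source images. As one possible way to quantify "almost identical", the sketch below computes the maximum absolute sample difference and the PSNR over two flat sequences of 8-bit samples. The function name `compare_images` and the choice of PSNR are assumptions made for illustration.

```python
import math


def compare_images(source, decoded, max_value=255):
    """Return (max absolute difference, PSNR in dB) for two sample sequences.

    Both inputs are flat sequences of samples (for example R, G, B, R, G, B, ...).
    Assumption: PSNR is used here only as an example metric.
    """
    if len(source) != len(decoded):
        raise ValueError("images must contain the same number of samples")
    if len(source) == 0:
        raise ValueError("images must not be empty")

    max_diff = 0
    squared_error = 0
    for s, d in zip(source, decoded):
        diff = abs(s - d)
        max_diff = max(max_diff, diff)
        squared_error += diff * diff

    mse = squared_error / len(source)
    # Identical images have zero error, which gives an infinite PSNR.
    psnr = math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)
    return max_diff, psnr


if __name__ == "__main__":
    print(compare_images([10, 20, 30], [11, 19, 30]))
```

A small maximum difference together with a high PSNR would support the statement that the decoded image closely matches the source.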
The 32-bit test application was executed on a computer running
Microsoft Windows 7 Professional x64, with an Intel i7 860 CPU at 2.8 GHz, 8 GB RAM,
and an AMD Radeon 7970 GPU. Each test was executed 50 times, and each
run was timed. The test results show that the presented technique
outperforms libjpeg-turbo in all test cases.
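A repeated-run measurement of this kind can be sketched as follows. This is a minimal illustration, not the test application itself: `time_runs` and `summarize` are hypothetical helpers, and `encode` stands in for whichever encoder is being measured. When timing GPU work from the CPU, the callable would also need to wait for the GPU to finish, since GPU commands execute asynchronously.

```python
import statistics
import time


def time_runs(encode, runs=50):
    """Call encode() `runs` times and return each duration in milliseconds."""
    durations_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        encode()  # assumed to block until the encoded data is available
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms):
    """Reduce per-run timings to a few summary statistics."""
    return {
        "runs": len(durations_ms),
        "mean_ms": statistics.mean(durations_ms),
        "min_ms": min(durations_ms),
        "max_ms": max(durations_ms),
    }


if __name__ == "__main__":
    # Placeholder workload standing in for an encoder call.
    timings = time_runs(lambda: sum(range(100_000)))
    print(summarize(timings))
```

Reporting the minimum and maximum alongside the mean makes it easier to spot outlier runs, for example those affected by first-use initialization.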
[1] libjpeg-turbo 1.2.0 uses SIMD instructions to encode baseline JPEG image data and is one
of the fastest CPU encoders available today.
 