Game Development Reference
In-Depth Information
__global__ void jacobi_step( gpu_plm2<float4> dst ) {
const int ix = blockDim.x * blockIdx.x + threadIdx.x;
const int iy = blockDim.y * blockIdx.y + threadIdx.y;
if (ix >= dst.w || iy >= dst.h) return ;
float4 st = tex2D (texST, ix, iy);
if (st.w < 1) {
st = make_float4 ((
make_float3 ( tex2D (texST, ix+1, iy
)) +
make_float3 ( tex2D (texST, ix-1, iy
)) +
make_float3 ( tex2D (texST, ix,
iy+1)) +
make_float3 ( tex2D (texST, ix,
iy-1))) / 4,
0);
}
dst(ix, iy) = st;
}
Listing 5.2. Implementation of a Jacobi relaxation step.
Jacobi iterations are effective for removing high-frequency oscillations in the
residual, but they perform rather poorly when the residual becomes smooth.
Multigrid methods [Briggs et al. 00] address this issue by solving for the residual
on a coarser level. A similar approach, which can be regarded as a simplified
variant of a multigrid solver, where computation of the residual is avoided, is
adopted here. As a first step, which structure tensors should be kept unmodified
is determined using Equation (5.4). To this end, the fourth color channel is used,
with one indicating a boundary pixel and zero used otherwise.
The reason why convergence of the Jacobi method is slow is illustrated in Fig-
consequently, it takes a large number of iterations until values on the boundary
diffuse into the inner parts. Apparently, a simple way to speed up the diffusion
is to compute it on a coarser grid. Since the transfer to a coarser grid can be
repeated recursively, this yields a pyramid of images. Moving from a finer to a
coarser level is referred to as restriction . The pixels on a coarser pyramid level are
defined as the average of four pixels on the finer pyramid level, with nonboundary
pixels being excluded ( Figure 5.7(a) ) . The left of Figure 5.6(b) exemplifies the
pyramid construction. Once the finest pyramid level is reached, the pyramid is
processed in a coarse-to-fine manner. On each pyramid level, one to three Jacobi
iterations are performed. Nonboundary pixels on the next-finer pyramid level are
then replaced by sampling the result using bilinear interpolation. These opera-
tions are repeated until the finest pyramid level has been reached, as shown on the
right of Figure 5.6(b) . The implementations of the restriction and interpolation
operations are shown in Listing 5.3 and 5.4.
Nevertheless, performing the relaxation for every computation of the structure
tensor is expensive. Therefore, the relaxation is only performed for the first
computation of the structure tensor. All subsequent computations substitute the
structure tensor of the previous computation.