
successively tests each nonempty subset X of Y until it finds one for which (a) and (b) hold.

In theory, we do not need to test all subsimplices, only the ones containing the

most recently added vertex [van den Bergen 99, van den Bergen 03]. However,

in practice, it often occurs that due to rounding errors, a termination condition

of GJK is not met and the last-added vertex is thrown off. So, for the sake of

robustness, it is better to check all subsimplices.

Johnson's algorithm has been heavily debated in the game physics community.

Admittedly, the formulation is not very intuitive, which may explain why it tends to scare implementors away and make them look for a solution in more geometrically familiar concepts. This chapter attempts to explain Johnson's algorithm in geometric terms in order to encourage the reader to stay with this formulation.

The formulation has a number of advantages over ad hoc solutions for finding the

closest point of a simplex.

First of all, the formulation has a lot of opportunities for reuse of intermediate

computations. The Δi values for lower-dimensional subsimplices can be used

for computing the values for the higher-dimensional subsimplices. Reuse of these

values extends over iterations, since each new simplex is based on the previous

one.

Contrary to popular belief [Muratori 06], an abundant amount of reuse from

earlier iterations is possible, even if only the subsimplices that contain the last-

added support point are tested. Despite the increased importance of data cache

coherence and the gain in raw computing power in modern CPU architectures,

reuse of earlier computations may still be worthwhile. A caching scheme for the

Δi values as suggested in [van den Bergen 99] requires 512 bytes of storage

using double precision and can easily fit in an L1 data cache. Of course, the

problem is keeping it there. Timing when to prefetch the data is tricky: too early, and the data may be evicted back to memory; too late, and the read will stall.

However, better performance is not our only concern. Caching of intermediate

values may result in loss of precision. In the x86 architecture, floating-point regis-

ters are 80 bits wide, whereas a double-precision floating-point variable is usually

64 bits wide. Saved values from previous iterations will be rounded to the narrower format unless

they are stored in a type that is wide enough. Unfortunately, not all compilers

support storing floating-point numbers at 80-bit precision [Wikipedia 10a].

All things considered, reuse of computations from earlier iterations takes quite

some effort to make it pay off. However, reusing computations within a single

iteration is much easier and offers a substantial performance gain. Here, the in-

termediate values do not need to go through memory and often can be kept in

floating-point registers. The recursive formulation can be exploited successfully